Hyper-Parameter Tuning

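It is sometimes better to search hyper-parameter values in log space rather than linear space, for example when the plausible range spans several orders of magnitude.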
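A minimal sketch of why log space can help, assuming a hypothetical learning-rate range of 1e-5 to 1e-1 (the range and the helper names are illustrative, not from these notes). Uniform sampling in linear space puts almost every draw in the top decade, while uniform sampling of the logarithm spreads draws evenly across decades.

```python
import math
import random

def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

def sample_linear(low, high, rng):
    """Draw a value uniformly on [low, high]."""
    return rng.uniform(low, high)

def fraction_below(draws, threshold):
    """Fraction of draws that fall below the threshold."""
    return sum(d < threshold for d in draws) / len(draws)

rng = random.Random(0)
low, high = 1e-5, 1e-1  # hypothetical learning-rate range

log_draws = [sample_log_uniform(low, high, rng) for _ in range(10_000)]
lin_draws = [sample_linear(low, high, rng) for _ in range(10_000)]

# Share of samples in the two smallest decades (below 1e-3)
frac_log = fraction_below(log_draws, 1e-3)
frac_lin = fraction_below(lin_draws, 1e-3)
print(f"log-space: {frac_log:.3f}, linear: {frac_lin:.3f}")
```

In expectation about half of the log-space draws fall below 1e-3, against about 1% of the linear draws, so the linear search barely explores small values.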

Method	Description	Disadvantage
Manual		Time-consuming
Grid Search		Computationally expensive
Random Search		Non-deterministic
Evolutionary	Randomization, natural selection, mutation
Bayesian	Probabilistic model of the relationship between the cost function and the hyper-parameters, built from information gathered in earlier trials
Gradient-Based	Treats hyper-parameter tuning like parameter fitting
Early-Stopping	Focuses resources on settings that look promising, e.g. Successive Halving
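A rough sketch of Successive Halving, the early-stopping example named in the table. The `toy_loss` function is a hypothetical stand-in for "train with this budget and return the validation loss"; the halving factor `eta=2` and the starting budget are assumptions chosen for illustration.

```python
import math
import random

def toy_loss(lr, budget):
    """Hypothetical stand-in for training: loss is lowest near lr = 1e-2
    and shrinks as the budget (e.g. epochs) grows."""
    return (math.log10(lr) + 2.0) ** 2 + 1.0 / budget

def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Each round, score all survivors on the current budget, keep the best
    1/eta of them, then multiply the budget by eta."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        scored = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(scored) // eta)
        survivors = scored[:keep]
        budget *= eta
    return survivors[0]

rng = random.Random(0)
# 16 candidate learning rates, sampled in log space
candidates = [10 ** rng.uniform(-5, -1) for _ in range(16)]
best_lr = successive_halving(candidates, toy_loss)
print(f"selected learning rate: {best_lr:.2e}")
```

Most candidates are only ever evaluated on the small early budgets; the larger budgets are spent on the few settings that survive each round.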

Speed Up

Plot the cost function as a function of the number of clusters.
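One way this could look, as a sketch only: a small 1-D k-means (Lloyd's algorithm) run on synthetic data with three well-separated groups, with the cost taken to be the within-cluster sum of squares. The data, the quantile-based initialisation, and the function name `kmeans_cost_1d` are assumptions for illustration; a text bar chart stands in for a real plot so that the example needs no plotting library.

```python
import random

def kmeans_cost_1d(points, k, iters=25):
    """Basic Lloyd's k-means on 1-D data; returns within-cluster sum of squares."""
    ordered = sorted(points)
    n = len(ordered)
    # Deterministic initialisation: centers at evenly spaced quantiles
    centers = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: (p - centers[j]) ** 2)
            clusters[nearest].append(p)
        # Move each center to its cluster mean; keep it if the cluster is empty
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return sum(min((p - c) ** 2 for c in centers) for p in points)

rng = random.Random(0)
# Synthetic data: three groups around 0, 5 and 10 (illustrative assumption)
points = [rng.gauss(mu, 0.3) for mu in (0.0, 5.0, 10.0) for _ in range(50)]

costs = {k: kmeans_cost_1d(points, k) for k in range(1, 7)}

# Text "plot": bar length is proportional to cost relative to k = 1
for k, cost in costs.items():
    bar = "#" * max(1, round(50 * cost / costs[1]))
    print(f"k={k}  cost={cost:9.2f}  {bar}")
```

On data like this the cost drops sharply up to the true number of groups and flattens afterwards; that bend is what such a plot is usually read for.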

Visualization	More than 3 hyper-parameters	Simple
Contour	❌	✅
Parallel Coordinates	✅	❌

Last Updated: 2024-12-26 ; Contributors: AhmedThahir, web-flow