Hyper-Parameter Tuning
Method | Description | Advantage | Disadvantage
---|---|---|---
Manual | | | Time-consuming
Grid Search | | | Computationally expensive
Random Search | | | Non-deterministic
Evolutionary | Randomization, natural selection, mutation | |
Bayesian | Probabilistic model of the relationship between the cost function and the hyper-parameters, built from information gathered in trials | |
Gradient-Based | Treat hyper-parameter tuning like parameter fitting | |
Early-Stopping | Focus resources on settings that look promising, e.g. Successive Halving | |
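Three of the strategies in the table above (grid search, random search, successive halving) can be sketched in plain Python. This is a minimal illustration, not a reference implementation: `validation_loss` and `partial_loss` are hypothetical stand-ins for training a model, and the parameter names `lr` and `reg` are assumptions made for the example.

```python
import itertools
import random


def validation_loss(lr, reg):
    """Hypothetical stand-in for 'train a model, return validation loss'."""
    return (lr - 0.1) ** 2 + (reg - 0.01) ** 2


def partial_loss(params, budget):
    """Hypothetical loss after a limited training budget (assumed to
    approach the final loss as the budget grows)."""
    return validation_loss(**params) + 1.0 / budget


def grid_search(grid):
    """Evaluate every combination; the number of trials is the product
    of the value counts, which is why it gets expensive quickly."""
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        loss = validation_loss(**params)
        if best is None or loss < best[0]:
            best = (loss, params)
    return best


def random_search(bounds, n_trials, seed=0):
    """Sample settings uniformly within bounds; fixing the seed makes an
    otherwise non-deterministic search repeatable."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {n: rng.uniform(lo, hi) for n, (lo, hi) in bounds.items()}
        loss = validation_loss(**params)
        if best is None or loss < best[0]:
            best = (loss, params)
    return best


def successive_halving(configs, min_budget=1, eta=2):
    """Score all configs on a small budget, keep the best 1/eta, multiply
    the budget by eta, and repeat until one config remains."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: partial_loss(c, budget))
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]


grid = {"lr": [0.01, 0.1, 1.0], "reg": [0.0, 0.01, 0.1]}
bounds = {"lr": (0.0, 1.0), "reg": (0.0, 0.1)}

grid_best = grid_search(grid)
random_best = random_search(bounds, n_trials=50)
configs = [dict(zip(grid, v)) for v in itertools.product(*grid.values())]
halving_best = successive_halving(configs)

print("grid   :", grid_best)
print("random :", random_best)
print("halving:", halving_best)
```

The grid search runs 9 trials here. Adding a third hyper-parameter with 3 values would make it 27, whereas the random search budget stays at whatever `n_trials` is set to.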
Speed Up
- Parallelizing
- Caching
- Random sampling: Won't work with caching
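A rough sketch of the first two ideas, using only the standard library. `preprocess` and `evaluate` are hypothetical placeholders for an expensive shared step and a single trial. Threads are used only to keep the example self-contained; CPU-bound training in pure Python would normally use processes instead. Caching is only safe because `preprocess` is deterministic for a given argument, which is presumably why the notes say random sampling does not combine with it.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


@lru_cache(maxsize=None)
def preprocess(n_features):
    """Hypothetical expensive step shared by many trials.
    Deterministic for a given argument, so the result can be cached."""
    return tuple(i / n_features for i in range(n_features))


def evaluate(params):
    """Hypothetical trial: reuse the cached features, return a score."""
    n_features, lr = params
    data = preprocess(n_features)
    return sum(data) * lr


# Three trials, but only two distinct preprocessing settings.
trials = [(4, 0.1), (4, 0.2), (8, 0.1)]

# Run the trials concurrently; map() returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(evaluate, trials))

print(results)
print(preprocess.cache_info())
```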
Clustering
Elbow Method
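Plot the cost function as a function of the number of clusters, and choose the number at the "elbow" where adding more clusters stops reducing the cost substantially.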
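A minimal sketch of the idea, assuming k-means as the clustering algorithm and the within-cluster sum of squared distances as the cost. It prints the curve instead of plotting it so that it needs no plotting library. The toy data, the farthest-point initialisation, and the "largest second difference" rule in `pick_elbow` are all assumptions of this example; the last is just one common heuristic for locating the bend.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_centers(points, k):
    """Farthest-point initialisation (deterministic): start at the first
    point, then repeatedly add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers


def kmeans_cost(points, k, n_iter=20):
    """Run plain k-means and return the within-cluster sum of squares."""
    centers = init_centers(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to its cluster mean; keep it if the cluster is empty.
        centers = [
            tuple(sum(coord) / len(c) for coord in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def pick_elbow(ks, costs):
    """Heuristic: the k where the cost curve bends most sharply, i.e. the
    largest drop-before minus drop-after (second difference)."""
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(ks) - 1):
        bend = (costs[i - 1] - costs[i]) - (costs[i] - costs[i + 1])
        if bend > best_bend:
            best_k, best_bend = ks[i], bend
    return best_k


# Toy data: three well-separated groups of four points each.
points = [
    (0, 0), (1, 0), (0, 1), (1, 1),
    (10, 0), (11, 0), (10, 1), (11, 1),
    (0, 10), (1, 10), (0, 11), (1, 11),
]

ks = list(range(1, 7))
costs = [kmeans_cost(points, k) for k in ks]
elbow = pick_elbow(ks, costs)

for k, cost in zip(ks, costs):
    print(f"k={k}  cost={cost:8.2f}")
print("elbow at k =", elbow)
```

For this toy data the bend lands at k = 3, matching the three groups. On real data the curve is often smoother and the choice more subjective.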