Tim's Blog

## Distributions for RandomizedSearchCV

Hyperparameter optimization is a central part of data science, and it is often difficult to do well. It can be computationally expensive, and a poorly chosen search can leave a model underfit or overfit. One of the best methods available is a randomized search with cross-validation, such as sklearn.model_selection.RandomizedSearchCV. It is usually far cheaper than a grid search because it evaluates a fixed number of sampled candidates rather than every combination, and cross-validating each candidate helps guard against choosing hyperparameters that overfit or underfit. The main difficulty in using randomized search is choosing the distribution that parameter values are drawn from. In this blog post, I will talk about one of the biggest issues with randomized search: sampling probability distributions on non-linear scales. Below, I propose a function to map random samples of a probability distribution from a linear scale to a logarithmic scale.
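To make the scale problem concrete, here is a minimal sketch of the general idea, not necessarily the function proposed in this post. It assumes a hypothetical regularization-style parameter that should range from 1e-4 to 1e2, and the helper name `log_uniform_samples` is my own invention for illustration. Sampling uniformly on the linear scale almost never produces a small value, while sampling the exponent uniformly spreads the draws evenly across orders of magnitude.

```python
import numpy as np


def log_uniform_samples(low, high, size, seed=0):
    """Draw samples whose base-10 logarithm is uniform on [log10(low), log10(high)].

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    # Sample the exponent on a linear scale, then map back with 10 ** exponent.
    exponents = rng.uniform(np.log10(low), np.log10(high), size=size)
    return 10.0 ** exponents


# Assumed search range for an illustrative hyperparameter.
low, high, n = 1e-4, 1e2, 10_000

linear = np.random.default_rng(0).uniform(low, high, size=n)
logged = log_uniform_samples(low, high, size=n)

# Share of draws that fall below 1.0 under each sampling scheme.
frac_linear = np.mean(linear < 1.0)
frac_log = np.mean(logged < 1.0)

print(f"linear scale: {frac_linear:.3f} of draws below 1.0")
print(f"log scale:    {frac_log:.3f} of draws below 1.0")
```

With the linear draw, only about 1% of the candidates land below 1.0, so four of the six orders of magnitude in the range are barely explored. RandomizedSearchCV accepts any distribution object with an `rvs` method in `param_distributions`, and `scipy.stats.loguniform` provides a ready-made distribution of this kind.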