In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization

Abstract

With the increasing computational costs associated with deep learning, automated hyperparameter optimization methods, which rely heavily on black-box Bayesian optimization (BO), face limitations. Freeze-thaw BO offers a promising grey-box alternative, strategically allocating scarce resources incrementally to different configurations. However, the frequent surrogate model updates inherent to this approach pose challenges for existing methods, which must retrain or fine-tune their neural network surrogates online, introducing overhead, instability, and hyper-hyperparameters. In this work, we propose FT-PFN, a novel surrogate for freeze-thaw style BO. FT-PFN is a prior-data fitted network (PFN) that leverages the in-context learning ability of transformers to perform Bayesian learning curve extrapolation efficiently and reliably in a single forward pass. Our empirical analysis across three benchmark suites shows that the predictions made by FT-PFN are more accurate and 10-100 times faster than those of the deep Gaussian process and deep ensemble surrogates used in previous work. Furthermore, we show that, when combined with our novel acquisition mechanism (MFPI-random), the resulting in-context freeze-thaw BO method (ifBO) yields new state-of-the-art performance in the same three families of deep learning HPO benchmarks considered in prior work.
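
To make the freeze-thaw idea concrete, the following is a minimal sketch of an incremental allocation loop in Python. It is not FT-PFN, MFPI-random, or ifBO: the surrogate here is a naive placeholder that extends the last observed improvement, the selection rule is a plain argmax, and the names toy_learning_curve, extrapolate, and freeze_thaw, as well as the synthetic learning curves, are all assumptions made for illustration only.

```python
import math


def toy_learning_curve(config, step):
    """Hypothetical validation score after `step` training units (higher is better)."""
    ceiling, rate = config
    return ceiling * (1.0 - math.exp(-rate * step))


def extrapolate(curve, horizon):
    """Placeholder surrogate (an assumption, not FT-PFN).

    Extends the last observed improvement `horizon` steps ahead,
    halving the gain at each further step.
    """
    if not curve:
        return 0.5  # uninformed guess for a configuration not yet trained
    if len(curve) == 1:
        return curve[-1]
    delta = curve[-1] - curve[-2]
    return curve[-1] + delta * sum(0.5 ** k for k in range(1, horizon + 1))


def freeze_thaw(configs, budget, horizon=3):
    """Spend `budget` training steps one at a time across `configs`."""
    curves = {i: [] for i in range(len(configs))}
    for _ in range(budget):
        # Score every configuration from its partial learning curve.
        scores = {i: extrapolate(c, horizon) for i, c in curves.items()}
        # Thaw the most promising configuration and train it one more step;
        # all other configurations stay frozen where they are.
        chosen = max(scores, key=scores.get)
        step = len(curves[chosen]) + 1
        curves[chosen].append(toy_learning_curve(configs[chosen], step))
    return curves


if __name__ == "__main__":
    # Each toy configuration is a (ceiling, rate) pair for the synthetic curve.
    result = freeze_thaw([(0.6, 0.3), (0.9, 0.5), (0.7, 0.1)], budget=12)
    for index, curve in result.items():
        print(index, len(curve))
```

The point of the sketch is the control flow: the surrogate is queried at every step, which is why a surrogate that needs no online retraining matters in this setting. A real freeze-thaw method would replace extrapolate with a probabilistic learning curve model and the argmax with an acquisition function.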
