Learning Rate Finder¶
The learning rate finder plots the learning rate vs. loss relationship for a Learner. The idea is to reduce the amount of guesswork in picking a good starting learning rate.
Overview:
- First run lr_find: learn.lr_find()
- Plot the learning rate vs loss: learn.recorder.plot()
- Pick a learning rate before it diverges, then start training
Technical Details: (first described by Leslie Smith)
Train Learner over a few iterations. Start with a very low start_lr and change it at each mini-batch until it reaches a very high end_lr. Recorder will record the loss at each iteration. Plot those losses against the learning rate to find the optimal value before it diverges.
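As a rough sketch (not the library's actual implementation), the learning rate can be grown exponentially from start_lr to end_lr over num_it mini-batches, recording one loss per step; the names in the commented loop are hypothetical:
import numpy as np

def lr_schedule(start_lr=1e-7, end_lr=10, num_it=100):
    # multiply the lr by a constant factor each mini-batch so it grows exponentially
    return start_lr * (end_lr / start_lr) ** (np.arange(num_it) / (num_it - 1))

lrs = lr_schedule()
losses = []   # one loss would be recorded per mini-batch at the matching lr
# for lr, batch in zip(lrs, batches):            # hypothetical training loop
#     losses.append(train_one_batch(batch, lr))  # hypothetical training step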
Choosing a good learning rate¶
For a more intuitive explanation, please check out Sylvain Gugger's post
from fastai.vision import *   # fastai v1 imports (untar_data, ImageDataBunch, Learner, simple_cnn, accuracy)

path = untar_data(URLs.MNIST_SAMPLE)
data = ImageDataBunch.from_folder(path)
def simple_learner(): return Learner(data, simple_cnn((3,16,16,2)), metrics=[accuracy])
learn = simple_learner()
First we run this command to launch the search:
learn.lr_find(stop_div=False, num_it=200)
Then we plot the loss versus the learning rates. We're interested in finding a good order of magnitude of learning rate, so we plot with a log scale.
learn.recorder.plot()
Then we choose a value that is approximately in the middle of the sharpest downward slope; the LR Finder gives this as an indication, so let's try 1e-2.
simple_learner().fit(2, 1e-2)
Don't just pick the minimum value from the plot!
learn = simple_learner()
learn.fit(2, 1e-0)   # lr taken from the minimum of the plot
Picking a value before the downward slope results in slow training:
learn = simple_learner()
learn.fit(2, 1e-3)   # lr from before the steep downward slope: training is slow
Suggested LR¶
If you pass suggestion=True to learn.recorder.plot, the point where the gradient is steepest is marked with a
red dot on the graph. We can use that point as a first guess for an LR.
learn.lr_find(stop_div=False, num_it=200)
learn.recorder.plot(suggestion=True)
You can access the corresponding learning rate like this:
min_grad_lr = learn.recorder.min_grad_lr
min_grad_lr
learn = simple_learner()
learn.fit(2, min_grad_lr)   # train with the suggested learning rate
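The suggestion corresponds to the learning rate at which the recorded loss curve falls fastest. A minimal sketch of that idea, assuming arrays of recorded lrs and losses (the placeholder values below are made up, and the library's exact computation may differ):
import numpy as np

lrs = np.array([1e-5, 1e-4, 1e-3, 1e-2, 1e-1])   # placeholder values
losses = np.array([2.3, 2.2, 1.8, 1.0, 3.5])     # placeholder values

steepest = np.gradient(losses).argmin()   # index of the most negative slope
min_grad_lr = lrs[steepest]               # 1e-3 for these placeholder values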
Callback methods¶
You don't call these yourself - they're called by fastai's Callback
system automatically to enable the class's functionality.
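As an illustration of the mechanism (a minimal sketch, not part of LRFinder), a user-defined callback hooks into the same events; the class name and printing behaviour are made up for the example:
from fastai.callback import Callback

class PrintSmoothLoss(Callback):
    # hypothetical example callback: print the smoothed loss every 100 iterations
    def on_train_begin(self, **kwargs):
        print('training started')
    def on_batch_end(self, iteration, smooth_loss, **kwargs):
        if iteration % 100 == 0: print(iteration, float(smooth_loss))

# learn.fit(1, callbacks=[PrintSmoothLoss()])   # pass callback instances to fit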