Authors: Yoshua Bengio, Jérôme Louradour, Ronan Collobert, Jason Weston

Published year: 2009

Publication: ICML 2009

Abstract

Humans and animals learn much better when the examples are not randomly presented but organized in a meaningful order which illustrates gradually more concepts, and gradually more complex ones.

We formalize such training strategies in the context of machine learning, and call them “curriculum learning”.

The experiments show that significant improvements in generalization can be achieved.

We hypothesize that curriculum learning affects both the speed of convergence of the training process to a minimum and, in the case of non-convex criteria, the quality of the local minima obtained.

Curriculum learning can be seen as a particular form of continuation method (a general strategy for global optimization of non-convex functions).
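To make the continuation idea concrete, here is a minimal sketch in Python. It is not taken from the paper: the toy objective `x^2 + A*sin(K*x)`, the constants `A` and `K`, the Gaussian smoothing, the `sigmas` schedule, and the step size are all assumptions chosen for illustration. The sketch first minimizes a heavily smoothed version of the criterion, then reuses each solution as the starting point for a less smoothed one, ending with the original criterion.

```python
import math

A, K = 2.0, 5.0  # hypothetical constants for the toy objective


def objective(x: float) -> float:
    """Non-convex toy criterion: a parabola with sinusoidal ripples."""
    return x * x + A * math.sin(K * x)


def smoothed_grad(x: float, sigma: float) -> float:
    """Gradient of the objective after Gaussian smoothing with std sigma.

    For e ~ N(0, sigma^2), E[sin(K(x + e))] = exp(-K^2 sigma^2 / 2) * sin(K x),
    so smoothing only damps the ripple term; sigma = 0 gives the true gradient.
    """
    damping = math.exp(-0.5 * (K * sigma) ** 2)
    return 2.0 * x + A * K * damping * math.cos(K * x)


def descend(x: float, sigma: float, steps: int = 500, lr: float = 0.01) -> float:
    """Plain gradient descent on the sigma-smoothed criterion."""
    for _ in range(steps):
        x -= lr * smoothed_grad(x, sigma)
    return x


def continuation(x: float, sigmas=(1.0, 0.5, 0.25, 0.0)) -> float:
    """Solve progressively less smoothed problems, warm-starting each
    one from the solution of the previous one."""
    for sigma in sigmas:
        x = descend(x, sigma)
    return x


x_plain = descend(3.0, sigma=0.0)  # direct descent on the target criterion
x_cont = continuation(3.0)         # smooth first, then sharpen
print(f"plain:        x={x_plain:.3f}, f={objective(x_plain):.3f}")
print(f"continuation: x={x_cont:.3f}, f={objective(x_cont):.3f}")
```

Under these toy settings, direct descent stalls in a ripple close to its starting point, while the continuation run ends near the global minimum. In the curriculum analogy, the smoothed problems play the role of the easier training stages.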

The idea of training a learning machine with a curriculum is to start small, learn easier aspects of the task or easier subtasks, and then gradually increase the difficulty level (Elman, 1993).
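The "start small, then increase difficulty" recipe can be sketched as a schedule of growing training pools. This is an illustrative assumption rather than the paper's procedure: the helper name `curriculum_stages`, the use of a user-supplied `difficulty` score, and the equal-sized stage cutoffs are all hypothetical choices.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def curriculum_stages(
    examples: Sequence[T],
    difficulty: Callable[[T], float],
    num_stages: int,
) -> Iterator[List[T]]:
    """Yield growing training pools: easiest examples first, all at the end."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    ordered = sorted(examples, key=difficulty)
    for stage in range(1, num_stages + 1):
        # Each stage admits a larger, harder prefix of the sorted examples.
        cutoff = (len(ordered) * stage) // num_stages
        yield ordered[:cutoff]


# Hypothetical toy data: word length stands in for difficulty.
words = ["a", "abc", "ab", "abcdef", "abcd", "abcde"]
for stage, pool in enumerate(curriculum_stages(words, len, num_stages=3), start=1):
    print(f"stage {stage}: train on {pool}")
```

A training loop would run some epochs on each pool before moving to the next, so the final stage sees the full training set. How difficulty is scored is task specific and is left to the caller here.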

Hypotheses that help to explain some of the advantages of a curriculum strategy

Curriculum strategies can help to find better local minima of a non-convex training criterion, and appear on the surface to operate like a regularizer.

Curriculum strategies can speed the convergence of training towards the global minimum of a convex training criterion.