Yoshua Bengio will give an introduction to the area of Deep Learning, to which he has been one of the leading contributors. Deep learning aims to learn representations of data at multiple levels of abstraction. Current machine learning algorithms depend heavily on feature engineering (manual design of the representation fed as input to a learner), and it would be of high practical value to design algorithms that learn good features themselves. Ideal features disentangle the unknown underlying factors that generated the data. Both theoretical arguments and empirical studies have shown that deep architectures can generalize better than architectures that are too shallow. Since a 2006 breakthrough, a variety of learning algorithms have been proposed for deep learning and feature learning, mostly based on unsupervised learning of representations, often by stacking single-level learning algorithms. Several of these algorithms are based on probabilistic models, but handling the intractability of the likelihood itself raises interesting challenges. Alternatives to maximum likelihood have been successfully explored, including criteria based on purely geometric intuitions about manifolds and the concentration of probability mass that characterize many real-world learning tasks. Representation-learning algorithms are being applied to many tasks in computer vision, natural language processing, speech recognition and computational advertising, and have won several international machine learning competitions, in particular thanks to their capacity for transfer learning, i.e., their ability to generalize to new settings and classes.