Transfer learning of image classification with deep learning architectures
LE3 .A278 2015
Master of Science
Deep learning architectures have advanced the state of the art in many machine learning applications such as computer vision, speech recognition, and natural language processing. However, deep learning architectures, like other machine learning methods, cannot work well when only a limited amount of training data is available. Transfer learning aims to use existing knowledge from previously learned tasks to help the learning of a new task; this can speed up learning and increase accuracy. Transfer learning fits well with deep learning architectures because of unsupervised feature learning, the development of a rich hierarchy of features, and the greater plasticity created by unsupervised learning. Pretraining is a form of transfer learning, even for single-task learning. In this research, we compare several methods of transfer learning using deep learning architectures, especially Deep Belief Networks (DBNs). These include representational, functional, and combined transfer. The domain of handwritten digits is used to train and evaluate these methods using reconstruction cross-entropy, classification accuracy, and speed of learning. Empirical studies show that DBNs can develop better hidden-node features, better reconstruction cross-entropy, and better classification accuracy than backpropagation networks. The most efficient DBN transfer learning method (requiring the least time to train a model) is representational transfer, and the most effective (best classification accuracy and reconstruction cross-entropy) is functional transfer. Context-sensitive multitask learning in DBNs produces better models than alternative transfer learning approaches. Combined transfer in DBNs can produce more accurate models than representational transfer and can learn faster than functional transfer; however, it produces less accurate models than functional transfer on its own.
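The distinction between representational and functional transfer described above can be sketched in a few lines of code. This is an illustrative sketch only, not the thesis's implementation: the network sizes, the single-hidden-layer structure, and all function names here are assumptions chosen for clarity. Representational transfer copies a feature layer learned on a source task into a target network before fine-tuning; functional transfer instead shares that layer across tasks trained jointly.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_net(n_in, n_hidden, n_out, rng):
    """Randomly initialise a one-hidden-layer network (illustrative sizes)."""
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)),   # feature layer
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_out)),  # task-specific head
    }

# Source network: assume W1 has already been trained (e.g. pretrained
# layer-wise as an RBM stack) on a related handwritten-digit task.
source = init_net(784, 100, 10, rng)

# Representational transfer: copy the learned feature layer into the
# target network as its starting point, then fine-tune on the new task.
target = init_net(784, 100, 10, rng)
target["W1"] = source["W1"].copy()

# Functional (multitask) transfer would instead keep ONE shared W1 and
# train two task heads against it simultaneously:
shared_W1 = source["W1"]
head_task_a = rng.normal(0.0, 0.1, (100, 10))
head_task_b = rng.normal(0.0, 0.1, (100, 10))
```

Combined transfer, as compared in the abstract, would do both: initialise the shared layer from a pretrained source (representational step) and then continue training it jointly across tasks (functional step).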
The author retains copyright in this thesis. Any substantial copying or any other actions that exceed fair dealing or other exceptions in the Copyright Act require the permission of the author.