An analysis of context-sensitive neural networks
LE3 .A278 2012
2012
Silver, Danny
Acadia University
Master of Science
Masters
Computer Science
In previous work, Poirier and Silver introduced csMTL networks, and demonstrated that these networks usually outperformed MTL networks for transfer learning. This thesis extends their work. Specifically, we are interested in the reasons for csMTL's performance increases, and the identification of characteristics that any learning algorithm must have in order to take advantage of transfer from csMTL-encoded data. Through the experimental collection of two metrics during network training, we provide insights into the differences between how transfer occurs in MTL and csMTL networks, and demonstrate that csMTL networks do not become representationally equivalent to STL networks on the same domain of tasks. We show that the effective number of weights in a csMTL network is less than that of the equivalent MTL network. Further to this, we show that the hidden node biases in csMTL networks are unnecessary. We articulate how these findings partially explain the improvement in predictive accuracy of csMTL networks over MTL networks. Further, we demonstrate that removal of these biases can often improve the predictive accuracy of learned networks. Finally, we provide experimental evidence that, although transfer is dependent on both learning method and task domain, learning methods with high VC dimension and which use latent variables (or hidden nodes) tend to perform well in transfer learning settings using csMTL-encoded data.
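To make the csMTL encoding discussed above concrete, the following is a minimal sketch, assuming the setup the abstract describes: each example pairs the primary input features with a one-hot task-context vector, all tasks share one output, and the hidden node biases can be dropped. All names, layer sizes, and the sigmoid activation are illustrative assumptions, not the thesis's actual implementation.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    class CsMTLNet:
        """Hypothetical csMTL-style feedforward network: primary inputs are
        concatenated with a task-context vector, and a single output is
        shared across all tasks."""

        def __init__(self, n_inputs, n_tasks, n_hidden, use_hidden_bias=False):
            n_in = n_inputs + n_tasks  # primary features + task context
            self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
            # The abstract reports hidden node biases are unnecessary in
            # csMTL networks, so they are optional here.
            self.b1 = np.zeros(n_hidden) if use_hidden_bias else None
            self.W2 = rng.normal(0.0, 0.1, (n_hidden, 1))
            self.b2 = np.zeros(1)

        def forward(self, x, c):
            z = np.concatenate([x, c]) @ self.W1
            if self.b1 is not None:
                z += self.b1
            h = sigmoid(z)
            return sigmoid(h @ self.W2 + self.b2)

    # One example: 5 primary features, with task 2 of 3 active in the context.
    net = CsMTLNet(n_inputs=5, n_tasks=3, n_hidden=8)
    x = rng.normal(size=5)
    c = np.array([0.0, 1.0, 0.0])  # one-hot task-context vector
    print(net.forward(x, c))

Under this encoding, the task identity enters the network only through the context inputs, which is why a single shared-output network can stand in for the multiple task-specific outputs of an MTL network.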
The author retains copyright in this thesis. Any substantial copying or any other actions that exceed fair dealing or other exceptions in the Copyright Act require the permission of the author.
https://scholar.acadiau.ca/islandora/object/theses:254