
Experiments on the Lottery Ticket Hypothesis

Most modern neural networks optimize millions of parameters, and the general consensus is that the bigger the network, the better. Although there have been efforts to reduce the parameter count of these networks while preserving accuracy, most of these techniques start from a trained network and do not reduce the parameter count during training.

The Lottery Ticket Hypothesis by Frankle and Carbin states that a trainable sparse subnetwork (a winning ticket) can be found within a dense neural network. That is, a subnetwork defined by the original random initialization and the associated mask (structure), trained in isolation, can match the accuracy of the dense network. This is of great scientific interest, as it sheds light on how and why neural networks work the way they do. However, discovering a winning ticket is a computationally heavy process even for a relatively small network such as LeNet. In this work, we explored the possibilities of (a) finding a winning ticket faster for fully connected and convolutional neural networks, (b) finding a winning ticket on a subset of the dataset that performs equally well when retrained on the full dataset, and (c) finding a winning ticket for a different neural architecture, namely ShuffleNet.
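
For context, the standard recipe for finding a winning ticket is iterative magnitude pruning with weight rewinding: train the network, prune the smallest-magnitude weights, reset the surviving weights to their initial values, and repeat. The following is a minimal PyTorch sketch of that loop; the name find_winning_ticket, the five rounds, the 20% per-round pruning rate, and the train_fn callback are illustrative placeholders rather than the exact configuration used in our experiments.

    import copy
    import torch
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    def find_winning_ticket(model, train_fn, rounds=5, rate=0.2):
        # Save the original initialization (theta_0) for rewinding.
        init_state = copy.deepcopy(model.state_dict())
        layers = [(m, "weight") for m in model.modules()
                  if isinstance(m, (nn.Linear, nn.Conv2d))]
        for _ in range(rounds):
            train_fn(model)  # train the current (possibly pruned) network
            # Mask out the smallest-magnitude surviving weights globally.
            prune.global_unstructured(
                layers, pruning_method=prune.L1Unstructured, amount=rate)
            with torch.no_grad():
                # Rewind biases and other unpruned tensors to their
                # initial values (stale "weight" keys are skipped).
                model.load_state_dict(init_state, strict=False)
                # Pruned modules keep their weights in "weight_orig".
                for name, module in model.named_modules():
                    if hasattr(module, "weight_orig"):
                        module.weight_orig.copy_(init_state[name + ".weight"])
        return model  # masked network at its original init: the ticket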

Trainability on Unseen Data

We observe that a winning ticket found using one set of data is trainable on another set of data drawn from the same distribution. In other words, the trainability of a winning ticket is independent of the data it was pruned on.
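
As a sketch of how such a transfer can be set up in PyTorch (the helper transfer_ticket and the split A / split B naming are our own, not from the report): the binary masks found by pruning on split A are copied onto a fresh network that still carries the same random initialization, which is then trained only on split B.

    import torch.nn.utils.prune as prune

    def transfer_ticket(pruned_model, fresh_model):
        # Collect the 0/1 masks that pruning on split A produced.
        masks = {name: buf.clone()
                 for name, buf in pruned_model.named_buffers()
                 if name.endswith("weight_mask")}
        # Fix each mask onto the matching layer of an unpruned copy
        # that still holds the original random initialization.
        for name, module in fresh_model.named_modules():
            key = name + ".weight_mask"
            if key in masks:
                prune.custom_from_mask(module, name="weight", mask=masks[key])
        return fresh_model

    # ticket = transfer_ticket(ticket_from_split_a, fresh_copy_of_init)
    # ...then train the ticket on split B, data the mask has never seen.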

Faster Winning Ticket

We find that a winning ticket can be discovered by training on a small fraction of the dataset instead of the whole dataset, which makes the search for a winning ticket much faster.
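
A sketch of this setup, assuming PyTorch; the 10% fraction, the small_fraction helper, and the train_on training loop are illustrative assumptions. The expensive pruning rounds see only the subset, and the full training set is used just once, to retrain the discovered ticket.

    import torch
    from torch.utils.data import Subset

    def small_fraction(dataset, fraction=0.1, seed=0):
        # Draw a fixed random subset of the training data.
        g = torch.Generator().manual_seed(seed)
        n = int(len(dataset) * fraction)
        indices = torch.randperm(len(dataset), generator=g)[:n]
        return Subset(dataset, indices.tolist())

    # Pruning rounds run only on the subset; the resulting ticket is
    # then retrained on the full dataset:
    # ticket = find_winning_ticket(
    #     model, lambda m: train_on(m, small_fraction(train_set)))
    # accuracy = train_on(ticket, train_set)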

ShuffleNet

We also successfully find a winning ticket for the ShuffleNet architecture. This is of interest because ShuffleNet relies on batch normalization, depthwise separable convolutions, and channel shuffle to achieve computational efficiency, and these components might respond to pruning differently.
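
As a rough illustration of pruning this kind of architecture (torchvision ships ShuffleNetV2 rather than the original ShuffleNet, so the model variant, class count, and pruning rate below are assumptions): only convolutional and fully connected weights are pruned, while batch-norm parameters are left untouched.

    import torch.nn as nn
    import torch.nn.utils.prune as prune
    from torchvision.models import shufflenet_v2_x1_0

    model = shufflenet_v2_x1_0(num_classes=10)
    to_prune = [(m, "weight") for m in model.modules()
                if isinstance(m, (nn.Conv2d, nn.Linear))]
    prune.global_unstructured(
        to_prune, pruning_method=prune.L1Unstructured, amount=0.2)

    # Report the resulting global sparsity over the pruned layers.
    zeros = sum((m.weight == 0).sum().item() for m, _ in to_prune)
    total = sum(m.weight.nelement() for m, _ in to_prune)
    print(f"global sparsity: {zeros / total:.1%}")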

Full Report
Source code
