DualPrune: A Dual Purpose Pruning of Convolutional Neural Networks for Resource-Constrained Devices
Abstract:
Deep learning has seen many successful applications across domains, yet the use of deep learning models on edge devices is still limited: deploying a large model on a small device for real-time inference requires substantial resources. In recent years, pruning has emerged as an important and widely used technique to reduce inference cost and compress storage-intensive deep learning models for small devices. In this paper, we propose a novel dual-purpose pruning approach that accelerates model inference and reduces the model's storage requirement. Experiments on the CIFAR-10 dataset with the AlexNet and VGG16 models show that the proposed approach is effective and eases deployment of trained models to edge devices with only a marginal loss of accuracy. For the VGG16 experiment, our approach reduces the parameters from 14.98M to 3.7M, resulting in a 74.73% reduction in floating-point operations with only a 0.8% loss in accuracy.
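The abstract does not detail the paper's dual-purpose pruning criterion, but the general family of techniques it builds on, magnitude-based filter pruning, can be sketched briefly. The snippet below is an illustrative sketch only (the function name, the L1-norm criterion, and the pruning ratio are assumptions, not the authors' method): it zeroes the convolutional filters with the smallest L1 norms, which is one common way to remove redundant filters and thereby cut both parameters and floating-point operations.

```python
import numpy as np

def prune_filters(weights, ratio):
    """Zero out the fraction `ratio` of conv filters with the smallest
    L1 norms (a common magnitude criterion; hypothetical helper, not
    the paper's dual-purpose method).

    weights: array of shape (num_filters, in_channels, kH, kW)
    Returns the pruned copy and the indices of the zeroed filters.
    """
    # L1 norm of each filter, computed over all its weights
    norms = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)
    num_prune = int(weights.shape[0] * ratio)
    prune_idx = np.argsort(norms)[:num_prune]  # weakest filters first
    pruned = weights.copy()
    pruned[prune_idx] = 0.0  # zeroed filters can later be removed entirely
    return pruned, prune_idx

# Example: 8 random 3x3 filters over 3 input channels, prune the weakest 50%
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))
pruned, idx = prune_filters(w, 0.5)
```

In practice, zeroed filters (and the corresponding input channels of the next layer) are physically removed to realize the parameter and FLOP savings the abstract reports.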
Publication: Communications in Computer and Information Science
Publisher: Springer Link
Authors: Tejalal Choudhary, Vipul Mishra, Anurag Goswami
Keywords: Deep Neural Network, Pruning, Model acceleration and compression, Resource-constrained devices