Sneha Hanumanthaiah


Research Software Engineer

Publications

Sneha Hanumanthaiah, Peipei Wu, Alex Mackin, Andrew Collins, Tarek Elsaleh, Xiatian Zhu (2025) Optimizing Deep Learning for Edge Deployment via Weight Statistics Aware Network Pruning, In: 2025 IEEE Conference on Standards for Communications and Networking (CSCN 2025), pp. 1-7. Institute of Electrical and Electronics Engineers (IEEE)

Deep learning models have achieved state-of-the-art performance across numerous domains, but their increasing size and computational complexity pose significant challenges for deployment in resource-constrained environments. Model pruning is a key technique to address this issue by reducing the number of model parameters. However, existing methods often present a trade-off between compression rate, computational speed-up, and performance preservation. This paper introduces a novel hybrid pruning methodology that strategically combines Weight Statistics Aware Pruning (WSAP)-based unstructured pruning with hardware-friendly structured channel pruning. Our approach first determines WSAP-driven pruning ratios using a heuristic based on the weights' Coefficient of Variation (CoV), allowing for more aggressive pruning of less critical layers. It then applies both fine-grained and channel-based pruning to maximize model compression while preserving accuracy. We demonstrate the effectiveness and generality of our method on two diverse tasks: Video Quality Assessment (VQA) with the DOVER-Mobile model and Time-Series Forecasting with the CrossFormer model. Our results show that the proposed hybrid method achieves a superior balance of efficiency and performance, reducing model parameters by up to 80% and FLOPs by over 50% while maintaining the accuracy of the original models. These improvements make our method well-suited for trustworthy and efficient deployment of deep learning models in shared and constrained environments.
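
The abstract describes the pipeline only at a high level, so the following is a minimal illustrative sketch in plain Python rather than the paper's implementation. It makes several assumptions that are not stated in the abstract: CoV is computed on weight magnitudes, a higher CoV is treated as a sign that a layer can be pruned more aggressively, CoV is mapped linearly onto a ratio range, and channels are ranked by L1 norm. All function names and the `min_ratio`/`max_ratio` parameters are hypothetical.

```python
import statistics


def coefficient_of_variation(weights):
    """CoV of weight magnitudes, std(|w|) / mean(|w|).

    Assumption: magnitudes are used because raw weights often have a mean
    near zero, which would make the ratio unstable.
    """
    mags = [abs(w) for w in weights]
    mean = statistics.fmean(mags)
    return statistics.pstdev(mags) / mean if mean > 0 else 0.0


def pruning_ratios(layers, min_ratio=0.2, max_ratio=0.8):
    """Map each layer's CoV linearly onto [min_ratio, max_ratio].

    Assumption (not from the abstract): a higher CoV means a few weights
    dominate, so the layer is pruned more aggressively.
    `layers` maps a layer name to a list of channels (lists of floats).
    """
    covs = {
        name: coefficient_of_variation([w for ch in chans for w in ch])
        for name, chans in layers.items()
    }
    lo, hi = min(covs.values()), max(covs.values())
    span = hi - lo
    return {
        name: min_ratio
        + (max_ratio - min_ratio) * ((c - lo) / span if span > 0 else 0.5)
        for name, c in covs.items()
    }


def magnitude_prune(channels, ratio):
    """Unstructured step: zero the `ratio` fraction of smallest-magnitude weights.

    Ties at the threshold are all zeroed, so slightly more than `ratio`
    of the weights may be removed.
    """
    flat = sorted(abs(w) for ch in channels for w in ch)
    k = int(len(flat) * ratio)
    if k == 0:
        return [list(ch) for ch in channels]
    threshold = flat[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in ch] for ch in channels]


def channel_prune(channels, keep):
    """Structured step: keep the `keep` channels with the largest L1 norm."""
    ranked = sorted(
        range(len(channels)),
        key=lambda i: sum(abs(w) for w in channels[i]),
        reverse=True,
    )
    return [channels[i] for i in sorted(ranked[:keep])]


# Toy example with two hypothetical layers.
layers = {
    "conv1": [[0.5, -0.5], [0.5, 0.5]],      # uniform magnitudes, CoV = 0
    "conv2": [[2.0, 0.01], [-0.02, 0.01]],   # one dominant weight, high CoV
}
ratios = pruning_ratios(layers)
sparse = magnitude_prune(layers["conv2"], ratios["conv2"])
compact = channel_prune(sparse, keep=1)
```

In this sketch the unstructured step only zeroes weights, while the channel step actually shrinks the tensor, which is what reduces FLOPs on ordinary hardware. A real model would typically need fine-tuning after each step to recover accuracy.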