Efficient Audio-based CNNs via Filter Pruning
Dr. Arshdeep Singh, a machine learning researcher in sound working with Professor Mark D. Plumbley as part of the “AI for sound” (AI4S) project within the Centre for Vision, Speech and Signal Processing (CVSSP), has been focusing on designing efficient and sustainable artificial intelligence and machine learning (AI-ML) models.
Recent trends in artificial intelligence (AI) employ convolutional neural networks (CNNs) [1, 2], which provide remarkable performance compared to other existing methods. However, the large size and high computational cost of CNNs are a bottleneck to deploying them on resource-constrained devices such as smartphones. Moreover, training CNNs for many hours increases CO2 emissions. For instance, a computing device (NVIDIA GPU RTX-2080 Ti) used to train CNNs for 48 hours generates CO2 equivalent to that emitted by an average car driven for 13 miles. To estimate CO2 emissions, we use an openly available tool [Link-1].
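As a rough sanity check on the 48-hour example, emissions can be approximated as the energy drawn multiplied by the carbon intensity of the electricity supply. The sketch below is not the implementation of the tool in [Link-1]. The 250 W power draw, the 0.432 kg CO2 per kWh grid intensity and the 0.404 kg CO2 per mile car figure are all assumed values chosen for illustration.

```python
def training_co2_kg(power_watts, hours, kg_co2_per_kwh):
    """Approximate emissions as energy used (kWh) times grid carbon intensity."""
    energy_kwh = power_watts / 1000.0 * hours
    return energy_kwh * kg_co2_per_kwh


def equivalent_car_miles(co2_kg, kg_co2_per_mile):
    """Convert an emission figure into miles driven by a car."""
    return co2_kg / kg_co2_per_mile


# All three constants are assumptions for illustration, not values from the study.
GPU_POWER_WATTS = 250.0        # assumed average power draw of the GPU
GRID_KG_CO2_PER_KWH = 0.432    # assumed grid carbon intensity
CAR_KG_CO2_PER_MILE = 0.404    # assumed emissions of an average car

co2 = training_co2_kg(GPU_POWER_WATTS, hours=48, kg_co2_per_kwh=GRID_KG_CO2_PER_KWH)
miles = equivalent_car_miles(co2, CAR_KG_CO2_PER_MILE)
print(f"{co2:.2f} kg CO2, about {miles:.1f} car miles")
```

Under these assumptions the estimate comes out close to the 13 miles quoted above; a different region or hardware load would change the result.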
Therefore, we aimed to compress CNNs:
- To reduce the computational complexity for faster inference.
- To reduce the memory footprint so that the underlying resources are used effectively.
- To reduce the number of computations during training, by analyzing how many training examples are sufficient when fine-tuning the compressed CNNs to achieve performance similar to that of the uncompressed CNNs trained on all examples.
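To illustrate the first two aims, the sketch below counts the weights and multiply-accumulate operations (MACs) of a small stack of 2D convolutional layers before and after removing filters. The layer sizes are hypothetical and are not taken from the networks studied here. Note that removing filters from one layer also removes the matching input channels of the next layer.

```python
def conv2d_cost(c_in, c_out, kernel, h_out, w_out):
    """Return (parameters, MACs) for one bias-free 2D convolutional layer."""
    params = c_in * c_out * kernel * kernel
    macs = params * h_out * w_out  # each weight is used once per output position
    return params, macs


def total_cost(layers):
    """Sum parameters and MACs over a list of layer descriptions."""
    costs = [conv2d_cost(*layer) for layer in layers]
    return sum(p for p, _ in costs), sum(m for _, m in costs)


# Hypothetical two-layer network: (c_in, c_out, kernel, h_out, w_out).
original = [(1, 16, 3, 40, 50), (16, 32, 3, 20, 25)]
# Prune 4 of the 16 filters in layer 1; layer 2 then sees 12 input channels.
pruned = [(1, 12, 3, 40, 50), (12, 32, 3, 20, 25)]

orig_params, orig_macs = total_cost(original)
new_params, new_macs = total_cost(pruned)
print(f"parameters: {orig_params} -> {new_params}")
print(f"MACs:       {orig_macs} -> {new_macs}")
```

In this toy case, removing a quarter of the first layer's filters cuts both the parameter count and the MACs by a quarter, because every weight in the second layer is tied to one of the first layer's output channels.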
One approach to compressing CNNs is “pruning”, where unimportant filters are explicitly removed from the original network to build a compact, pruned network. After pruning, the pruned network is fine-tuned to recover the lost performance. This study proposes a cosine distance-based greedy algorithm [3] that prunes similar filters in filter space, applied to openly available CNNs designed for audio scene classification [Link-2]. Further, we improve the efficiency of the proposed algorithm by reducing the computational time required for pruning [4].
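The idea of pruning similar filters can be sketched as follows. This is a simplified illustration and not the published algorithm (see [Link-3] for the authors' code): it ranks all filter pairs by cosine distance and, for each of the closest pairs in turn, drops one filter. Dropping the filter with the smaller norm is an assumption made here for the example, as is the function name `greedy_similar_filter_pruning`.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Cosine distance between two flattened filters."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # treat an all-zero filter as dissimilar to everything
    return 1.0 - dot / (norm_u * norm_v)


def greedy_similar_filter_pruning(filters, num_to_prune):
    """Return the indices of the filters to keep.

    filters: list of flattened filters, each a list of floats.
    """
    norms = [math.sqrt(sum(w * w for w in f)) for f in filters]
    # Rank every pair of filters from most similar to least similar.
    pairs = sorted(
        combinations(range(len(filters)), 2),
        key=lambda p: cosine_distance(filters[p[0]], filters[p[1]]),
    )
    pruned = set()
    for i, j in pairs:
        if len(pruned) >= num_to_prune:
            break
        if i in pruned or j in pruned:
            continue  # one filter of this pair is already gone
        # Assumption: drop the filter with the smaller norm.
        pruned.add(i if norms[i] < norms[j] else j)
    return [k for k in range(len(filters)) if k not in pruned]


filters = [
    [1.0, 0.0, 0.0],
    [2.0, 0.1, 0.0],  # nearly parallel to filter 0
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
print(greedy_similar_filter_pruning(filters, num_to_prune=1))  # [1, 2, 3]
```

Filters 0 and 1 point in almost the same direction, so they form the closest pair and one of them is treated as redundant. Comparing all pairs scales quadratically with the number of filters.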
We find that the proposed pruning method reduces the number of computations per inference by 27% and the memory requirement by 25%, with less than a 1% drop in accuracy. During fine-tuning of the pruned CNNs, using 25% fewer training examples gives performance similar to that obtained using all examples. We have made the proposed algorithm openly available [Link-3] for reproducibility and provide a video presentation [Link-4] explaining the methodology and results from our published work [3].
In addition, we make the proposed pruning method three times faster without degrading performance [4, Link-5].
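Fine-tuning on a reduced training set can be sketched as below. The text does not state how the reduced set was selected, so the per-class (stratified) sampling, the function name `stratified_subset` and the example labels are assumptions for illustration only.

```python
import random
from collections import defaultdict


def stratified_subset(labels, fraction, seed=0):
    """Pick the same fraction of example indices from every class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        count = max(1, round(fraction * len(indices)))  # keep at least one per class
        chosen.extend(rng.sample(indices, count))
    return sorted(chosen)


# Hypothetical labels: 8 "airport" clips and 4 "park" clips.
labels = ["airport"] * 8 + ["park"] * 4
subset = stratified_subset(labels, fraction=0.75)
print(len(subset), "of", len(labels), "examples kept")
```

Sampling per class keeps the class balance of the reduced set close to that of the full set, which makes the comparison with full-data fine-tuning easier to interpret.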
Open Research practices/URL Links
The proposed work uses the following Open Research practices:
Link-1: Machine Learning CO2 Impact Calculator (mlco2.github.io)
Link-2: GitHub - marmoi/dcase2021_task1a_baseline
Link-3: Proposed pruning Algorithm: GitHub - Arshdeep-Singh-Boparai/passive-filter-pruning-Interspeech22
Link-4: Video presentation: https://youtu.be/Z50nKCYDYEM
Link-5: Proposed efficient pruning Algorithm: GitHub - Arshdeep-Singh-Boparai/Efficient_similarity_Pruning_Algo
[1] Q. Kong et al., “PANNs: Large-scale pretrained audio neural networks for audio pattern recognition,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 2880–2894, 2020.
[2] Irene et al., “Low-complexity acoustic scene classification for multi-device audio: Analysis of DCASE 2021 challenge systems,” in DCASE Workshop, pp. 85–89, 2021.
[3] A. Singh and M. D. Plumbley, “A passive similarity-based CNN filter pruning for efficient acoustic scene classification,” in INTERSPEECH, pp. 2433–2437, 2022.
[4] A. Singh and M. D. Plumbley, “Efficient similarity-based passive filter pruning for compressing CNNs,” accepted for ICASSP 2023.
Arshdeep Singh (1) and Mark D. Plumbley (2)
1: Department of Computer Science and Electrical Engineering, University of Surrey, UK,
2: EPSRC Fellow in “AI for sound” project, Professor of Signal Processing, University of Surrey, UK
Contact information (Arshdeep Singh)
Lead author job title: Research Fellow A
Lead author faculty: Faculty of Engineering and Physical Sciences
Lead author email: firstname.lastname@example.org
Lead author ORCID: https://orcid.org/0000-0003-3465-0952