A generalist reinforcement learning agent for compressing multiple convolutional neural networks
Abstract
Deep Learning has achieved state-of-the-art accuracy in multiple fields. A common practice in computer vision is to reuse a pre-trained model for a completely different dataset of the same type of task, a process known as transfer learning, which reduces training time by reusing the filters of the convolutional layers. However, while transfer learning can reduce training time, the resulting model may carry more parameters than the new dataset requires. As models now achieve near-human performance or better, there is a growing need to reduce their size to facilitate deployment on devices with limited computational resources. Various compression techniques have been proposed to address this issue, but their effectiveness varies depending on hyperparameters. To navigate these options, researchers have worked on automating model compression; some have proposed using reinforcement learning to teach a deep learning model how to compress another deep learning model. This study compares multiple approaches for automating the compression of convolutional neural networks and proposes a method for training a reinforcement learning agent that works across multiple datasets without the need for transfer learning. The agents were tested using leave-one-out cross-validation: they learned to compress a set of LeNet-5 models and were tested on another LeNet-5 model with different parameters. The metrics used to evaluate these solutions were accuracy loss and the number of parameters of the compressed model. The agents suggested compression schemes that were on or near the Pareto front for these metrics. Furthermore, the models were compressed by more than 80% with minimal accuracy loss in most cases.
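The Pareto-front evaluation mentioned above can be illustrated with a short sketch. This is not code from the study; the function name and the sample data are hypothetical, and each candidate is represented only by the two abstract metrics (accuracy loss and parameter count, both to be minimized):

```python
# Hypothetical sketch: filtering compressed-model candidates down to the
# Pareto front over (accuracy loss, parameter count), both minimized.
# Data values are illustrative, not results from the paper.

def pareto_front(candidates):
    """Return the candidates not dominated by any other candidate.

    A candidate is dominated if another candidate is no worse on both
    metrics and strictly better on at least one.
    """
    front = []
    for c in candidates:
        dominated = any(
            o != c
            and o[0] <= c[0] and o[1] <= c[1]
            and (o[0] < c[0] or o[1] < c[1])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front

# Each tuple: (accuracy loss, parameters) of one compressed LeNet-5 variant.
results = [(0.01, 12000), (0.02, 9000), (0.005, 20000), (0.03, 15000)]
print(pareto_front(results))  # (0.03, 15000) is dominated and excluded
```

A compression scheme proposed by an agent can then be judged by its distance to this front, which matches the paper's "on or near the Pareto front" criterion.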
The significance of these results is that by scaling this methodology to larger models and datasets, an AI assistant for model compression, analogous to ChatGPT, could be developed, potentially transforming model compression practices and enabling advanced deployments in resource-constrained environments.
Description
https://orcid.org/0000-0001-6270-3164