Learn how to learn and distil during learning - Using meta-learning and second-order optimisation to prune the model
Date:
The LLM era has made models extremely large and cumbersome for mobile devices, so pruning them is in high demand. In this talk, I will give a brief introduction to my internship work at MediaTek on model pruning using second-order optimisation.
Papers:
Fisher-Legendre (FishLeg) optimization of deep neural networks: https://openreview.net/pdf?id=c9lAOPvQHS
Second order derivatives for network pruning: Optimal Brain Surgeon: https://proceedings.neurips.cc/paper/1992/file/303ed4c69846ab36c2904d3ba8573050-Paper.pdf
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models: https://arxiv.org/pdf/2203.07259.pdf
Efficient Model Compression Techniques with FishLeg: https://arxiv.org/pdf/2412.02328
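As a reminder of the classical criterion behind these methods: Optimal Brain Surgeon (second paper above) removes the weight w_q whose deletion, after an optimal compensating update of the remaining weights, increases the loss the least. With H the Hessian of the loss and e_q the unit vector selecting w_q, the optimal update and the resulting saliency are

\[
\delta w = -\frac{w_q}{[H^{-1}]_{qq}}\, H^{-1} e_q,
\qquad
S_q = \frac{w_q^2}{2\,[H^{-1}]_{qq}}.
\]

Weights with the smallest saliency S_q are pruned first; the practical obstacle at LLM scale is estimating and inverting H, which is what the scalable Fisher-based approximations in the Optimal BERT Surgeon and FishLeg papers above address.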

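A minimal NumPy sketch of the OBS criterion under the simplest possible assumption, a purely diagonal empirical-Fisher stand-in for H (the papers above use much richer block-wise or learned structure); the function name, damping value and toy data below are illustrative only.

import numpy as np

def obs_prune_diagonal(weights, fisher_diag, sparsity, damping=1e-4):
    """OBS-style pruning with a diagonal empirical-Fisher approximation of the Hessian.

    weights:     flat array of parameters
    fisher_diag: flat array of per-parameter squared gradients (diagonal Fisher proxy)
    sparsity:    fraction of weights to remove
    """
    h_diag = fisher_diag + damping          # damped diagonal Hessian approximation
    # OBS saliency S_q = w_q^2 / (2 [H^{-1}]_qq); for diagonal H this is w_q^2 * H_qq / 2
    saliency = 0.5 * weights**2 * h_diag
    k = int(sparsity * weights.size)
    prune_idx = np.argsort(saliency)[:k]    # remove the k least-salient weights
    pruned = weights.copy()
    # For diagonal H the OBS compensating update only touches coordinate q
    # (delta w_q = -w_q), so pruning reduces to zeroing the selected weights.
    pruned[prune_idx] = 0.0
    return pruned

# Toy usage: random weights, squared gradients as the Fisher proxy
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
g = rng.normal(size=1000)
w_sparse = obs_prune_diagonal(w, g**2, sparsity=0.5)
print((w_sparse == 0).mean())  # fraction of pruned weights, approx. 0.5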