DevConf.IN 2026

Memory-Efficient AI: How PEFT and PyTorch Enable Accessible LLM Fine-Tuning
2026-02-13, VYAS - 1 - Room#VY103

The proliferation of large language models (LLMs) with billions of parameters has created a significant barrier to entry for fine-tuning: fully fine-tuning a 7B-parameter model requires over 80 GB of GPU memory and produces a multi-gigabyte checkpoint for each task. Parameter-Efficient Fine-Tuning (PEFT) addresses this challenge by training only 0.1-2% of model parameters while achieving performance comparable to full fine-tuning, reducing memory requirements by 3-4x and checkpoint sizes from gigabytes to megabytes.
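As a back-of-the-envelope illustration of the parameter savings (the layer size and rank below are illustrative, not tied to any specific model), a rank-8 LoRA update on a single 4096x4096 projection trains well under 1% of that layer's weights:

```python
# Sketch: count trainable parameters added by a LoRA update versus
# the frozen base weight it adapts. Dimensions are hypothetical.

def lora_params(d_in, d_out, rank):
    """LoRA adds two low-rank factors: A (d_in x rank) and B (rank x d_out)."""
    return rank * (d_in + d_out)

d = 4096                 # hidden size of a hypothetical 7B-class transformer
full = d * d             # one frozen projection matrix: ~16.7M parameters
lora = lora_params(d, d, rank=8)   # 65,536 trainable parameters

print(f"trainable fraction: {lora / full:.4%}")  # well under 1% of the layer
```

Applied across a model's attention projections, this ratio is what drives the 0.1-2% figure quoted above.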
This talk explores how PyTorch's architectural features, including its module system and autograd engine, enable practical PEFT implementations.
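The autograd engine is what makes partial training cheap: freezing a parameter with requires_grad stops gradient computation and storage for it entirely. A minimal sketch of the pattern (toy layer sizes, not from any real model):

```python
import torch
import torch.nn as nn

# Two-layer toy model: freeze the first layer, train only the second.
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 2))
for p in model[0].parameters():
    p.requires_grad_(False)   # autograd skips gradients for frozen params

loss = model(torch.randn(4, 8)).sum()
loss.backward()

assert model[0].weight.grad is None        # frozen: no gradient stored
assert model[1].weight.grad is not None    # trainable: gradient computed
```

Optimizer state (e.g. Adam's moment buffers) is only allocated for trainable parameters, which is where most of the 3-4x memory reduction comes from.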
It demonstrates popular methods such as LoRA and Prefix Tuning, showing how PyTorch's nn.ModuleDict enables dynamic adapter management and how custom CUDA extensions can optimize performance.
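A minimal sketch of the nn.ModuleDict adapter pattern, assuming a frozen base Linear with named, switchable LoRA adapters (this illustrates the idea, not the peft library's actual implementation):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base Linear plus named LoRA adapters kept in an
    nn.ModuleDict, so adapters can be added and switched at runtime."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)   # freeze pretrained weights
        self.rank, self.scaling = rank, alpha / rank
        self.adapters = nn.ModuleDict()          # task name -> LoRA pair
        self.active = None

    def add_adapter(self, name: str):
        in_f, out_f = self.base.in_features, self.base.out_features
        pair = nn.ModuleDict({
            "A": nn.Linear(in_f, self.rank, bias=False),
            "B": nn.Linear(self.rank, out_f, bias=False),
        })
        nn.init.zeros_(pair["B"].weight)  # zero-init B: adapter starts as a no-op
        self.adapters[name] = pair
        self.active = name

    def forward(self, x):
        y = self.base(x)
        if self.active is not None:
            pair = self.adapters[self.active]
            y = y + self.scaling * pair["B"](pair["A"](x))
        return y
```

Because adapters live in an nn.ModuleDict, they are registered as submodules (so their parameters appear in state_dict and optimizer groups), and switching tasks is just a matter of changing the active key.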
Attendees will learn how to implement PEFT methods and leverage PyTorch's advanced features for efficient model adaptation, making large-scale AI accessible on limited computational resources.


What level of experience should the audience have to best understand your session?: Intermediate - attendees should be familiar with the subject

Parshant is an Associate ML Engineer at Red Hat and a Gold Medalist in his Master's in CSE with an AI/ML specialization. He has authored four SCOPUS-indexed research papers in AI/ML. At Red Hat, he contributes to upstream open-source projects such as PyTorch and Helion. He also has hands-on experience with AI compilers and works with open-source compiler frameworks like LLVM and MLIR, bridging ML workloads with systems-level optimization.