LLM Fine-Tuning Techniques
Making foundation models your own. Fine-tuning adapts pre-trained LLMs to specific tasks, domains, or behaviors -- from lightweight LoRA adapters that can train in hours on a single GPU to alignment techniques like DPO that shape model behavior toward human preferences.

LoRA and Variants
| Method | Description | Paper | Code |
| --- | --- | --- | --- |
| LoRA | Low-Rank Adaptation of Large Language Models. Adds trainable low-rank matrices to attention layers while keeping the pretrained weights frozen. A rank of 4-16 is often sufficient. | Paper | Code |
| LoRA+ | Uses different learning rates for matrices A and B (B gets a roughly 16x higher learning rate). The paper reports 1-2% accuracy improvements and up to about 2x faster fine-tuning. | Paper | - |
| AdaLoRA | Adaptive budget allocation for LoRA. Parameterizes the updates in SVD form and dynamically allocates the parameter budget across weight matrices according to importance scores. | Paper | Code |
| QLoRA | Quantizes the pretrained model to 4-bit, then trains low-rank adapters on top of the frozen quantized weights. Uses 4-bit NormalFloat (NF4), double quantization, and paged optimizers. | Paper | Code |
| DoRA | Weight-Decomposed Low-Rank Adaptation. Decomposes pretrained weights into magnitude and direction components and fine-tunes both, applying LoRA to the directional component. By NVIDIA. | Paper | Code |
| PiSSA | Principal Singular Values and Singular Vectors Adaptation. Changes the LoRA initialization: A and B are initialized from the principal singular values and vectors of the pretrained weight, and the residual is kept frozen. The authors report faster convergence and better results than standard LoRA. | Paper | Code |
| MOELoRA | Combines Mixture of Experts (MoE) with LoRA for multi-task parameter-efficient fine-tuning, with a focus on medical applications. | Paper | Code |
| LoRA-FA | Freezes matrix A after random initialization (so it acts as a fixed random projection) and trains only matrix B. Roughly halves the trainable parameter count with comparable performance. | Paper | - |
| LoRA-drop | Decides which layers need LoRA fine-tuning and which do not, based on an evaluation of each layer's LoRA output. | Paper | - |
| Delta-LoRA | Also updates the base weight matrix W, using the change in the product AB between consecutive training steps, scaled by a hyperparameter lambda. | Paper | - |

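
As a rough illustration of the LoRA entry in the table above, the following minimal sketch uses plain Python lists instead of a tensor library. The class name `LoRALinear`, the helper `matvec`, and the values `r=4` and `alpha=8` are assumptions made for this example, not part of any library's API. It shows the frozen weight W, the low-rank pair A and B, and why the adapter is a no-op before training (B starts at zero). The `freeze_A` flag mimics the LoRA-FA idea of training only B.

```python
import random

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

class LoRALinear:
    """Frozen weight W plus a low-rank update (alpha / r) * B @ A (illustrative only)."""

    def __init__(self, W, r=4, alpha=8, seed=0):
        rng = random.Random(seed)
        self.d_out, self.d_in = len(W), len(W[0])
        self.W = W  # pretrained weight, never updated
        # A gets small random values and B starts at zero, so the
        # adapter contributes nothing before any training step.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(self.d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(self.d_out)]
        self.scale = alpha / r

    def forward(self, x):
        base = matvec(self.W, x)
        # Compute B (A x) without ever forming the full d_out x d_in update.
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self, freeze_A=False):
        """Count adapter parameters; freeze_A=True trains only B (LoRA-FA style)."""
        r = len(self.A)
        a_params = 0 if freeze_A else r * self.d_in
        return a_params + self.d_out * r

# Hypothetical 64 x 64 layer with an identity weight matrix.
W = [[float(i == j) for j in range(64)] for i in range(64)]
layer = LoRALinear(W, r=4)
x = [1.0] * 64

print(layer.forward(x) == matvec(W, x))           # True: no-op at initialization
print(layer.trainable_params(), "vs", 64 * 64)    # 512 vs 4096
print(layer.trainable_params(freeze_A=True))      # 256
```

With rank 4 on a 64 x 64 layer, the adapter holds 512 trainable values against 4096 in the full matrix. The gap widens quickly at realistic layer sizes, which is where the memory savings come from.
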
Other Fine-Tuning Methods
| Method | Description | Paper | Code |
| --- | --- | --- | --- |
| PEFT | Hugging Face library implementing multiple parameter-efficient fine-tuning methods. | - | Code |
| Instruction Tuning | Fine-tuning LLMs on (instruction, output) pairs to improve instruction-following and controllability. | Paper | Code |
| Prefix Tuning | Prepends trainable continuous prefixes (task-specific virtual tokens, a form of soft prompt) to each layer while keeping the LM frozen. | Paper | Code |
| Prompt Tuning | Simplified version of Prefix Tuning. Adds soft prompt tokens only at the input layer. | Paper | Code |
| P-Tuning | Turns prompts into learnable embeddings produced by a prompt encoder (LSTM + MLP). The paper reports GPT-style models matching or surpassing similar-sized BERT models on SuperGLUE. | Paper | Code |
| P-Tuning v2 | Adds prompt tokens at every layer (not just the input). Drops the reparameterization encoder and uses task-specific prompt lengths. | Paper | Code |
| Adapter Tuning | Inserts small adapter modules into each Transformer layer. Trains only the adapters and LayerNorm parameters (about 3.6% added parameters per task). | Paper | Code |
| BitFit | Sparse fine-tuning method that updates only the bias parameters. | Paper | Code |
| DPO | Direct Preference Optimization. Optimizes the language model directly on preference pairs with a classification-style loss, treating the model itself as an implicit reward model. This removes the separate reward model training and RL loop used in RLHF. | Paper | Code |

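
The DPO entry can be made concrete with its per-pair loss. For a prompt with a preferred ("chosen") and a dispreferred ("rejected") response, the loss is `-log sigmoid(beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)))`, where the `ref_*` terms come from a frozen reference model. The sketch below is a minimal, framework-free version. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for illustration, and real training code would compute these log-probabilities from model outputs over batches.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of a full response under
    the policy or the frozen reference model. beta=0.1 is an assumed
    illustrative value, not a recommendation.
    """
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # Numerically stable -log(sigmoid(margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Policy identical to the reference: loss equals log(2).
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
# Policy favors the chosen response more than the reference does: lower loss.
print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
# Policy favors the rejected response: higher loss.
print(dpo_loss(-12.0, -8.0, -10.0, -10.0))
```

No reward model appears anywhere in the computation. The only extra cost relative to supervised fine-tuning is a forward pass through the frozen reference model.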