
Fine-Tuning

(6 skills)

AI agent skills tagged “Fine-Tuning” for Claude Code, Cursor, Windsurf, and more.

axolotl

Orchestra-Research/AI-Research-SKILLs

4.3k

Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support

unsloth

Orchestra-Research/AI-Research-SKILLs

4.3k

Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization

implementing-llms-litgpt

Orchestra-Research/AI-Research-SKILLs

4.3k

Implements and trains LLMs using Lightning AI's LitGPT with 20+ pretrained architectures (Llama, Gemma, Phi, Qwen, Mistral). Use when you need clean model implementations, an educational understanding of architectures, or production fine-tuning with LoRA/QLoRA. Single-file implementations, no abstraction layers.

llama-factory

Orchestra-Research/AI-Research-SKILLs

4.3k

Expert guidance for fine-tuning LLMs with LLaMA-Factory - no-code WebUI, 100+ models, 2/3/4/5/6/8-bit QLoRA, multimodal support

peft-fine-tuning

Orchestra-Research/AI-Research-SKILLs

4.3k

Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, when you need to train <1% of parameters with minimal accuracy loss, or for multi-adapter serving. HuggingFace's official library integrated with transformers ecosystem.
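The "train <1% of parameters" claim is easy to sanity-check with back-of-envelope arithmetic. A sketch under assumed Llama-7B-like shapes (hidden size, layer count, and adapted modules are illustrative assumptions, not taken from the skill itself):

```python
# LoRA trainable-parameter count for a hypothetical 7B model.
# Assumed shapes: hidden size 4096, 32 transformer layers,
# rank r=8 applied to the q_proj and v_proj attention matrices.
hidden = 4096
layers = 32
rank = 8
adapted_per_layer = 2  # q_proj and v_proj

# Each adapted d_out x d_in weight gains two low-rank factors,
# A (r x d_in) and B (d_out x r): r * (d_in + d_out) new parameters.
params_per_matrix = rank * (hidden + hidden)
trainable = layers * adapted_per_layer * params_per_matrix

total = 7_000_000_000  # nominal 7B base parameters
print(trainable)                   # 4_194_304 adapter parameters
print(f"{trainable / total:.4%}")  # roughly 0.06% of the full model
```

Only the ~4M adapter parameters receive gradients and optimizer state, which is where the memory savings come from; the 7B base weights stay frozen (and can be 4-bit quantized under QLoRA).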

fine-tuning-with-trl

Orchestra-Research/AI-Research-SKILLs

4.3k

Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when you need RLHF, want to align a model with preferences, or train from human feedback. Works with HuggingFace Transformers.
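The DPO preference-alignment objective mentioned above reduces to a simple per-pair loss. A minimal pure-Python sketch of the formula (illustrative, not TRL's actual implementation; the argument names are hypothetical):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Each argument is the summed log-probability of a full response
    under the trained policy or the frozen reference model; beta
    controls how far the policy may drift from the reference.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)): small when the policy prefers the chosen
    # response more strongly than the reference model does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Loss falls below log(2) once the policy widens the chosen-vs-rejected
# margin relative to the reference, and rises above it otherwise:
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))  # margin > 0, loss < log 2
print(dpo_loss(-12.0, -10.0, -11.0, -11.0))  # margin < 0, loss > log 2
```

This is why DPO needs no separate reward model or PPO rollout loop: the reward signal is implicit in the log-probability ratios against the reference model.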
