HuggingFace

(4 skills)

AI agent skills tagged “HuggingFace” for Claude Code, Cursor, Windsurf, and more.

quantizing-models-bitsandbytes

Orchestra-Research/AI-Research-SKILLs

Quantizes LLMs to 8-bit or 4-bit for 50-75% memory reduction (relative to 16-bit weights) with minimal accuracy loss. Use when GPU memory is limited, when you need to fit larger models, or when you want faster inference. Supports INT8, NF4, and FP4 formats, QLoRA training, and 8-bit optimizers. Works with HuggingFace Transformers.
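The 50-75% figure follows from simple arithmetic on bits per weight. The sketch below is a rough estimate, not the library's own accounting: it assumes a 16-bit baseline and a hypothetical 7B-parameter model, and it counts weights only. Real savings are typically a little smaller because quantization constants and any layers kept in higher precision also take memory.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def reduction_vs_16bit(bits_per_param: int) -> float:
    """Fraction of weight memory saved relative to a 16-bit baseline (assumed)."""
    return 1 - bits_per_param / 16


# Hypothetical 7B-parameter model, used purely for illustration.
params = 7e9
for bits in (16, 8, 4):
    print(
        f"{bits:>2}-bit: {weight_memory_gb(params, bits):.1f} GB, "
        f"{reduction_vs_16bit(bits):.0%} smaller than 16-bit"
    )
```

Under these assumptions, 8-bit halves the weight footprint and 4-bit cuts it to a quarter, which matches the range quoted above.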

huggingface-accelerate

Orchestra-Research/AI-Research-SKILLs

4.3k

Simple distributed training API: about 4 lines add distributed support to any PyTorch script. Unified API over DeepSpeed, FSDP, Megatron, and DDP. Automatic device placement and mixed precision (FP16/BF16/FP8). Interactive config and a single launch command. The HuggingFace ecosystem standard.
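As a rough illustration of the "4 lines" claim, the sketch below shows where the Accelerate-specific calls usually sit in an otherwise ordinary PyTorch loop. The function name `train_one_epoch` and its parameters are hypothetical; the `accelerator` argument is assumed to be an `accelerate.Accelerator` instance, and `prepare` and `backward` are the two methods the loop relies on.

```python
def train_one_epoch(accelerator, model, optimizer, dataloader, loss_fn):
    """One training epoch with device placement delegated to the accelerator.

    `accelerator` is assumed to be an `accelerate.Accelerator`; the other
    arguments are a regular PyTorch model, optimizer, dataloader, and loss.
    """
    # Line 1 of the change: let the accelerator wrap the training objects.
    model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

    model.train()
    for inputs, targets in dataloader:
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        # Line 2 of the change: replaces the usual `loss.backward()`.
        accelerator.backward(loss)
        optimizer.step()


# Typical call site (requires `accelerate` and `torch` to be installed):
#   from accelerate import Accelerator
#   train_one_epoch(Accelerator(), model, optimizer, dataloader, loss_fn)
```

The remaining two lines are the import and the `Accelerator()` construction shown in the trailing comment; no manual `.to(device)` calls are needed once the objects have been prepared.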

llama-factory

Orchestra-Research/AI-Research-SKILLs

4.3k

Expert guidance for fine-tuning LLMs with LLaMA-Factory: no-code WebUI, 100+ models, 2/3/4/5/6/8-bit QLoRA, and multimodal support.

huggingface-tokenizers

Orchestra-Research/AI-Research-SKILLs

4.3k

Fast tokenizers optimized for research and production. The Rust-based implementation tokenizes 1 GB of text in under 20 seconds. Supports BPE, WordPiece, and Unigram algorithms. Train custom vocabularies, track alignments, and handle padding and truncation. Integrates with transformers. Use when you need high-performance tokenization or custom tokenizer training.
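To give a sense of what training a BPE vocabulary involves, here is a toy pure-Python sketch of the merge-learning loop. It is an illustration of the algorithm only, not the library's Rust implementation; the function name `train_bpe` and the tiny corpus are made up for this example.

```python
from collections import Counter


def train_bpe(words, num_merges):
    """Learn up to `num_merges` BPE merge rules from a list of words."""
    # Start with every word split into single characters.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for left, right in zip(symbols, symbols[1:]):
                pairs[(left, right)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)

        # Rewrite each word, fusing every occurrence of the best pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges


# Made-up corpus for illustration.
corpus = ["low", "low", "lower", "lowest", "lot"]
print(train_bpe(corpus, num_merges=2))
```

Each round fuses the most frequent adjacent pair into a new symbol, so frequent substrings gradually become single tokens. A production tokenizer adds pre-tokenization, special tokens, and offset tracking on top of this core loop.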