Hugging Face Transformers
npx machina-cli add skill muhammederem/chief/huggingface --openclaw
Overview
Hugging Face Transformers is a library providing pre-trained models for Natural Language Processing (NLP), Computer Vision, and Audio tasks. It supports PyTorch, TensorFlow, and JAX.
Installation
pip install transformers datasets evaluate accelerate
# For specific model types
pip install transformers[sentencepiece] # For tokenizers like SentencePiece
Core Components
Model Loading
from transformers import AutoModel, AutoTokenizer
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
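With the model and tokenizer loaded, a forward pass returns contextual hidden states. A minimal sketch (the hidden size of 768 is specific to bert-base-uncased):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Hello, world!", return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    outputs = model(**inputs)

# One hidden vector per input token: (batch, seq_len, hidden_size)
print(outputs.last_hidden_state.shape)
```

`AutoModel` returns the bare encoder; for a task-specific head, use the matching class (e.g. `AutoModelForSequenceClassification`), as shown in later sections.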
Pipeline API
from transformers import pipeline
# Text classification
classifier = pipeline("sentiment-analysis")
result = classifier("I love this product!")
# Question answering
qa = pipeline("question-answering")
result = qa(question="What is AI?", context="Artificial intelligence is...")
# Text generation
generator = pipeline("text-generation", model="gpt2")
result = generator("Once upon a time")
# Named entity recognition
ner = pipeline("ner", aggregation_strategy="simple")
result = ner("Apple is looking at buying U.K. startup")
Tokenization
Basic Usage
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Single text
tokens = tokenizer("Hello, world!")
print(tokens) # {'input_ids': [...], 'attention_mask': [...]}
# Multiple texts
tokens = tokenizer(["Hello", "World"], padding=True, truncation=True)
# Decode
text = tokenizer.decode(tokens["input_ids"][0])
Advanced Tokenization
# With return tensors
tokens = tokenizer(
"Text here",
padding="max_length",
truncation=True,
max_length=512,
return_tensors="pt" # Return PyTorch tensors
)
# Slow vs fast tokenizers
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", use_fast=True)
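To see what the tokenizer actually produces, you can inspect the subword pieces and round-trip IDs back to tokens (output shown is for bert-base-uncased's WordPiece vocabulary):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Subword pieces the model actually consumes
pieces = tokenizer.tokenize("Tokenization splits rare words into subwords.")
print(pieces)

# Round-trip ids -> tokens shows the special [CLS]/[SEP] markers
ids = tokenizer("Hello!")["input_ids"]
print(tokenizer.convert_ids_to_tokens(ids))
```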
Fine-Tuning
Prepare Dataset
from datasets import load_dataset
dataset = load_dataset("glue", "mrpc")
# Tokenize
def tokenize_function(examples):
    return tokenizer(
        examples["sentence1"],
        examples["sentence2"],
        padding="max_length",
        truncation=True,
        max_length=128,
    )
tokenized_datasets = dataset.map(tokenize_function, batched=True)
Training with Trainer API
from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer
model = AutoModelForSequenceClassification.from_pretrained(
"bert-base-uncased",
num_labels=2
)
training_args = TrainingArguments(
output_dir="./results",
evaluation_strategy="epoch",
learning_rate=2e-5,
per_device_train_batch_size=16,
per_device_eval_batch_size=16,
num_train_epochs=3,
weight_decay=0.01,
logging_dir="./logs",
save_strategy="epoch",
load_best_model_at_end=True,
)
trainer = Trainer(
model=model,
args=training_args,
train_dataset=tokenized_datasets["train"],
eval_dataset=tokenized_datasets["validation"],
)
trainer.train()
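Trainer can also report metrics during evaluation if you pass a compute_metrics function (supply it as compute_metrics=compute_metrics when constructing the Trainer). A minimal numpy-only sketch, with a synthetic sanity check:

```python
import numpy as np

def compute_metrics(eval_pred):
    # Trainer passes (logits, labels); take the argmax class per example
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": float((preds == labels).mean())}

# Sanity check on fake logits for 3 examples, 2 classes
fake_logits = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
fake_labels = np.array([1, 0, 0])
print(compute_metrics((fake_logits, fake_labels)))  # accuracy = 2/3
```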
Training with Custom Loop
from torch.optim import AdamW  # transformers' own AdamW is deprecated
from torch.utils.data import DataLoader
from transformers import DataCollatorWithPadding, get_linear_schedule_with_warmup

optimizer = AdamW(model.parameters(), lr=2e-5)
# Keep only the columns the model expects, and collate with dynamic padding
train_dataset = tokenized_datasets["train"].remove_columns(["sentence1", "sentence2", "idx"])
train_dataset = train_dataset.rename_column("label", "labels")
train_dataset.set_format("torch")
dataloader = DataLoader(
    train_dataset,
    batch_size=16,
    shuffle=True,
    collate_fn=DataCollatorWithPadding(tokenizer),
)
num_epochs = 3
num_training_steps = num_epochs * len(dataloader)
scheduler = get_linear_schedule_with_warmup(
optimizer,
num_warmup_steps=0,
num_training_steps=num_training_steps
)
model.train()
for epoch in range(num_epochs):
    for batch in dataloader:
        outputs = model(**batch)
        loss = outputs.loss
        loss.backward()
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
Parameter-Efficient Fine-Tuning (PEFT)
LoRA (Low-Rank Adaptation)
from peft import LoraConfig, get_peft_model
peft_config = LoraConfig(
task_type="SEQ_CLS",
inference_mode=False,
r=8,
lora_alpha=32,
lora_dropout=0.1,
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
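The idea behind LoRA can be sketched in plain PyTorch: freeze the original weight W and train only a low-rank update B @ A, so trainable parameters drop from d*d to 2*r*d. This toy module is illustrative only, not the peft implementation:

```python
import torch
import torch.nn as nn

class ToyLoRALinear(nn.Module):
    def __init__(self, d: int, r: int, alpha: float = 32.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d, d), requires_grad=False)  # frozen W
        self.lora_A = nn.Parameter(torch.randn(r, d) * 0.01)  # trainable, rank r
        self.lora_B = nn.Parameter(torch.zeros(d, r))         # trainable, zero-init
        self.scaling = alpha / r

    def forward(self, x):
        # Frozen path W x plus the scaled low-rank update B(Ax)
        return x @ self.weight.T + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = ToyLoRALinear(d=768, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable} / {total}")  # 12,288 of 602,112 parameters
```

Because lora_B starts at zero, the module initially computes exactly the frozen layer's output, which is the same initialization trick peft uses.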
QLoRA (Quantized LoRA)
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Requires the bitsandbytes package: pip install bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # use the transformers-format ("-hf") checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
Model Architectures
BERT-Based Models
- BERT: Bidirectional Encoder Representations from Transformers
- RoBERTa: Optimized BERT training
- DistilBERT: Smaller, faster BERT
- ALBERT: A Lite BERT
GPT-Based Models
- GPT-2, GPT-3: Autoregressive language models
- Llama 2: Open-source LLM from Meta
- Mistral: Efficient open-source LLM
T5-Based Models
- T5: Text-to-Text Transfer Transformer
- FLAN-T5: Instruction-tuned T5
Vision Models
- ViT: Vision Transformer
- Swin: Swin Transformer
- CLIP: Contrastive Language-Image Pre-training
Common Tasks
Text Classification
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained(
"bert-base-uncased",
num_labels=3 # For 3-class classification
)
Question Answering
from transformers import AutoModelForQuestionAnswering
model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
Summarization
from transformers import pipeline
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
summary = summarizer(article_text, max_length=130, min_length=30)
Translation
translator = pipeline("translation_en_to_de", model="Helsinki-NLP/opus-mt-en-de")
result = translator("Hello, how are you?")
Text Generation
generator = pipeline("text-generation", model="gpt2")
generated = generator(
"The future of AI is",
max_length=100,
num_return_sequences=3,
temperature=0.7,
)
Model Hub Integration
Upload Model
from huggingface_hub import login, upload_folder
login(token="your_token_here")
model.push_to_hub("your-username/your-model-name")
tokenizer.push_to_hub("your-username/your-model-name")
Load from Hub
model = AutoModel.from_pretrained("username/model-name")
Model Cards
Always include a model card with:
- Model description
- Training data
- Intended uses
- Limitations
- Ethical considerations
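One way to draft a card programmatically is with huggingface_hub's ModelCard helpers, which render structured metadata as the YAML front matter the Hub expects. A minimal sketch (the title and section text are placeholders):

```python
from huggingface_hub import ModelCard, ModelCardData

# Structured metadata that renders as YAML front matter on the Hub
card_data = ModelCardData(
    language="en",
    license="apache-2.0",
    tags=["text-classification"],
)
content = f"""---
{card_data.to_yaml()}
---

# My Fine-Tuned Model

Description, training data, intended uses, limitations,
and ethical considerations go here.
"""
card = ModelCard(content)
print(card.data.license)
```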
Best Practices
1. Use the Right Model for the Task
- Classification: BERT, RoBERTa
- Generation: GPT, Llama, Mistral
- QA: BERT-large, RoBERTa-large
- Summarization: BART, T5
2. Handle Long Sequences
# Simplest option: truncate to the model's maximum length
from transformers import pipeline
classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")
results = classifier(long_text, truncation=True, max_length=512)
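Truncation discards everything past max_length. The sliding-window alternative splits the token sequence into overlapping chunks and aggregates per-chunk predictions. A minimal sketch of the windowing itself, in pure Python over a list of token IDs (the stride value and the aggregation strategy are choices, not library defaults):

```python
def sliding_windows(token_ids, max_length=512, stride=128):
    """Split token_ids into chunks of max_length, overlapping by stride tokens."""
    if len(token_ids) <= max_length:
        return [token_ids]
    step = max_length - stride
    return [token_ids[i:i + max_length] for i in range(0, len(token_ids) - stride, step)]

chunks = sliding_windows(list(range(1000)), max_length=512, stride=128)
print([len(c) for c in chunks])  # [512, 512, 232] -- every token appears in some chunk
```

Classify each chunk separately, then combine the results (for example by averaging scores or majority vote).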
3. Dynamic Padding
from transformers import DataCollatorWithPadding
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
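The collator pads each batch only to that batch's longest sequence rather than a global max_length, which speeds up training on mostly short inputs. A quick sketch of the effect:

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

# Two encodings of different lengths, padded together on the fly
features = [tokenizer("short"), tokenizer("a noticeably longer input sentence")]
batch = data_collator(features)
print(batch["input_ids"].shape)  # both rows padded to the longer length
```

Pass it to Trainer via data_collator=data_collator, together with a dataset tokenized without padding="max_length".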
4. Evaluation Metrics
import evaluate
accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")
import numpy as np
output = trainer.predict(tokenized_datasets["validation"])
preds = np.argmax(output.predictions, axis=-1)
metrics = {
    **accuracy.compute(predictions=preds, references=output.label_ids),
    **f1.compute(predictions=preds, references=output.label_ids),
}
5. Save and Load
# Save
model.save_pretrained("./my-model")
tokenizer.save_pretrained("./my-model")
# Load
model = AutoModel.from_pretrained("./my-model")
tokenizer = AutoTokenizer.from_pretrained("./my-model")
Performance Optimization
Flash Attention
import torch
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires the flash-attn package
)
BetterTransformer
# Requires the optimum package: pip install optimum
from optimum.bettertransformer import BetterTransformer
model = BetterTransformer.transform(model)
torch.compile (PyTorch 2.0+)
import torch
model = torch.compile(model)
Integration
- LangChain: Use Hugging Face models in LLM applications
- Vector Databases: Generate embeddings for semantic search
- MLflow: Track training experiments
- SageMaker: Deploy at scale
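For the vector-database use case, a common recipe is mean pooling over the last hidden states to get a fixed-size sentence embedding. A sketch assuming bert-base-uncased (dedicated sentence-embedding models usually give better retrieval quality):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (batch, seq_len, 768)
    # Mean-pool over real tokens only, using the attention mask
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

vecs = embed(["a cute kitten", "a small cat", "quarterly tax filings"])
sims = torch.nn.functional.cosine_similarity(vecs[0:1], vecs[1:], dim=-1)
print(sims)  # similarity to "a cute kitten"; the cat sentence should score higher
```

The resulting vectors can be stored in any vector database and queried by cosine similarity.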
Source
https://github.com/muhammederem/chief/blob/main/.claude/skills/ml-ai/huggingface/SKILL.md
Overview
Hugging Face Transformers is a library providing pre-trained models for Natural Language Processing (NLP), Computer Vision, and Audio tasks, with support for PyTorch, TensorFlow, and JAX. It offers model loading, tokenization, pipelines, and fine-tuning utilities to accelerate AI development.
How This Skill Works
Models are loaded with AutoModel and AutoTokenizer, then applied via the Pipeline API for tasks like sentiment analysis, question answering, text generation, or named entity recognition. You can fine-tune using the Trainer API or a custom training loop, and explore parameter-efficient tuning with PEFT such as LoRA.
When to Use It
- You need quick NLP tasks like sentiment analysis, QA, or NER using pre-trained models via simple pipelines
- You want to fine-tune a model on a custom dataset using the Trainer API
- You need tokenization and preprocessing with AutoTokenizer for consistent input
- You want end-to-end workflows for NLP, CV, or audio tasks using the Pipeline API
- You want to experiment with parameter-efficient fine-tuning using PEFT like LoRA
Quick Start
- Step 1: Install transformers along with datasets, evaluate, and accelerate
- Step 2: Load a model and tokenizer with AutoModel/AutoTokenizer or use a ready-made pipeline
- Step 3: Run a simple inference or fine-tune with Trainer on a tokenized dataset
Best Practices
- Choose the right pipeline for the task: sentiment-analysis, question-answering, text-generation, or ner
- Load models with AutoModel and AutoTokenizer to ensure compatibility with the chosen weights
- Prefer fast tokenizers when available to speed up preprocessing
- Use datasets.load_dataset to prepare data and apply a tokenization function during mapping
- Fine-tune with Trainer or a custom loop and save the best model at checkpoints
Example Use Cases
- Text classification with the sentiment-analysis pipeline on product reviews
- Question answering over documents using the question-answering pipeline
- Text generation with a GPT-2 style model for story prompts
- Named entity recognition on news articles with the ner pipeline
- Fine-tuning a BERT model on the GLUE MRPC dataset using the Trainer API