
Huggingface

npx machina-cli add skill muhammederem/chief/huggingface --openclaw

Hugging Face Transformers

Overview

Hugging Face Transformers is a library providing pre-trained models for Natural Language Processing (NLP), Computer Vision, and Audio tasks. It supports PyTorch, TensorFlow, and JAX.

Installation

pip install transformers datasets evaluate accelerate
# For specific model types
pip install "transformers[sentencepiece]"  # extras such as SentencePiece tokenizers; quotes keep zsh from globbing

Core Components

Model Loading

from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

Pipeline API

from transformers import pipeline

# Text classification
classifier = pipeline("sentiment-analysis")
result = classifier("I love this product!")

# Question answering
qa = pipeline("question-answering")
result = qa(question="What is AI?", context="Artificial intelligence is...")

# Text generation
generator = pipeline("text-generation", model="gpt2")
result = generator("Once upon a time")

# Named entity recognition
ner = pipeline("ner", aggregation_strategy="simple")
result = ner("Apple is looking at buying U.K. startup")

Tokenization

Basic Usage

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Single text
tokens = tokenizer("Hello, world!")
print(tokens)  # {'input_ids': [...], 'attention_mask': [...]}

# Multiple texts
tokens = tokenizer(["Hello", "World"], padding=True, truncation=True)

# Decode
text = tokenizer.decode(tokens["input_ids"][0])

Advanced Tokenization

# With return tensors
tokens = tokenizer(
    "Text here",
    padding="max_length",
    truncation=True,
    max_length=512,
    return_tensors="pt"  # Return PyTorch tensors
)

# Slow vs fast tokenizers
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", use_fast=True)

Fine-Tuning

Prepare Dataset

from datasets import load_dataset

dataset = load_dataset("glue", "mrpc")

# Tokenize
def tokenize_function(examples):
    return tokenizer(
        examples["sentence1"],
        examples["sentence2"],
        padding="max_length",
        truncation=True,
        max_length=128,
    )

tokenized_datasets = dataset.map(tokenize_function, batched=True)

Training with Trainer API

from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2
)

training_args = TrainingArguments(
    output_dir="./results",
    eval_strategy="epoch",  # named evaluation_strategy in older transformers releases
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    logging_dir="./logs",
    save_strategy="epoch",
    load_best_model_at_end=True,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
)

trainer.train()

Training with Custom Loop

import torch
from torch.optim import AdamW  # transformers.AdamW is deprecated; use the PyTorch implementation
from torch.utils.data import DataLoader
from transformers import DataCollatorWithPadding, get_linear_schedule_with_warmup

# Keep only the columns the model expects, then pad dynamically per batch
train_dataset = tokenized_datasets["train"].remove_columns(["sentence1", "sentence2", "idx"])
train_dataset = train_dataset.rename_column("label", "labels")
train_dataset.set_format("torch")

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
dataloader = DataLoader(train_dataset, batch_size=16, shuffle=True, collate_fn=data_collator)

optimizer = AdamW(model.parameters(), lr=2e-5)

num_epochs = 3
num_training_steps = num_epochs * len(dataloader)

scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=0,
    num_training_steps=num_training_steps
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.train()

for epoch in range(num_epochs):
    for batch in dataloader:
        batch = {k: v.to(device) for k, v in batch.items()}
        outputs = model(**batch)
        loss = outputs.loss
        loss.backward()

        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()

Parameter-Efficient Fine-Tuning (PEFT)

LoRA (Low-Rank Adaptation)

from peft import LoraConfig, get_peft_model

peft_config = LoraConfig(
    task_type="SEQ_CLS",
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)

model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
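As a sanity check on what `print_trainable_parameters()` reports, the trainable fraction can be computed by hand. A minimal sketch with a toy frozen-base model (plain PyTorch, no PEFT required; the layer sizes are arbitrary):

```python
import torch.nn as nn

def trainable_stats(model: nn.Module) -> tuple[int, int, float]:
    """Return (trainable, total, percent) parameter counts for a model."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total, 100.0 * trainable / total

# Toy example: freeze a "base" layer and train only a small "adapter"
base = nn.Linear(512, 512)        # 512*512 + 512 = 262,656 params, frozen
for p in base.parameters():
    p.requires_grad = False
adapter = nn.Linear(512, 8)       # 512*8 + 8 = 4,104 params, trainable
model = nn.Sequential(base, adapter)

trainable, total, pct = trainable_stats(model)
print(f"trainable: {trainable} || all: {total} || trainable%: {pct:.2f}")
```

This mirrors the LoRA idea: the vast majority of weights stay frozen, so only a small adapter is updated and checkpointed.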

QLoRA (Quantized LoRA)

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # gated repo; requires accepting the license on the Hub
    quantization_config=bnb_config,
    device_map="auto",
)

Model Architectures

BERT-Based Models

  • BERT: Bidirectional Encoder Representations from Transformers
  • RoBERTa: Optimized BERT training
  • DistilBERT: Smaller, faster BERT
  • ALBERT: A Lite BERT

GPT-Based Models

  • GPT-2, GPT-3: Autoregressive language models
  • Llama 2: Open-source LLM from Meta
  • Mistral: Efficient open-source LLM

T5-Based Models

  • T5: Text-to-Text Transfer Transformer
  • FLAN-T5: Instruction-tuned T5

Vision Models

  • ViT: Vision Transformer
  • Swin: Swin Transformer
  • CLIP: Contrastive Language-Image Pre-training

Common Tasks

Text Classification

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=3  # For 3-class classification
)

Question Answering

from transformers import AutoModelForQuestionAnswering

model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")

Summarization

from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
summary = summarizer(article_text, max_length=130, min_length=30)

Translation

translator = pipeline("translation_en_to_de", model="Helsinki-NLP/opus-mt-en-de")
result = translator("Hello, how are you?")

Text Generation

generator = pipeline("text-generation", model="gpt2")
generated = generator(
    "The future of AI is",
    max_length=100,
    num_return_sequences=3,
    do_sample=True,  # sampling must be enabled for temperature to take effect
    temperature=0.7,
)

Model Hub Integration

Upload Model

from huggingface_hub import login

login(token="your_token_here")

model.push_to_hub("your-username/your-model-name")
tokenizer.push_to_hub("your-username/your-model-name")

Load from Hub

model = AutoModel.from_pretrained("username/model-name")

Model Cards

Always include a model card with:

  • Model description
  • Training data
  • Intended uses
  • Limitations
  • Ethical considerations
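As a sketch, a minimal card is just Markdown with YAML front matter saved as `README.md` in the model repo; the section names mirror the checklist above, and the license/tag values are illustrative placeholders:

```python
from pathlib import Path

# A minimal model-card template; the Hub renders README.md as the model card
card = """\
---
license: apache-2.0
tags:
- text-classification
---

# My Fine-Tuned Model

## Model description
BERT fine-tuned for sentiment classification.

## Training data
Describe the dataset and any preprocessing here.

## Intended uses
Sentiment analysis of short English product reviews.

## Limitations
Not evaluated on other domains or languages.

## Ethical considerations
Note potential biases inherited from the pre-training corpus.
"""

Path("README.md").write_text(card, encoding="utf-8")
```

Pushing the folder to the Hub (or calling `push_to_hub`) then publishes the card alongside the weights.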

Best Practices

1. Use the Right Model for the Task

  • Classification: BERT, RoBERTa
  • Generation: GPT, Llama, Mistral
  • QA: BERT-large, RoBERTa-large
  • Summarization: BART, T5
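The pairings above can be kept as a small lookup table; a sketch where the checkpoint names are common illustrative defaults, not requirements:

```python
# Task → (Auto class name, example checkpoint); checkpoints are illustrative defaults
TASK_DEFAULTS = {
    "classification": ("AutoModelForSequenceClassification", "roberta-base"),
    "generation": ("AutoModelForCausalLM", "gpt2"),
    "qa": ("AutoModelForQuestionAnswering", "roberta-large"),
    "summarization": ("AutoModelForSeq2SeqLM", "facebook/bart-large-cnn"),
}

def suggest(task: str) -> str:
    """Return the loading call for a task's default Auto class and checkpoint."""
    auto_cls, checkpoint = TASK_DEFAULTS[task]
    return f"{auto_cls}.from_pretrained('{checkpoint}')"

print(suggest("summarization"))
```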

2. Handle Long Sequences

# Simplest option: truncate to the model's maximum input length
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
results = classifier(long_text, truncation=True, max_length=512)
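When truncation discards too much, a sliding-window approach splits the tokenized input into overlapping chunks and scores each chunk separately. A minimal sketch of the chunking step alone (tokenizer and model calls omitted; the window/stride values are typical choices, not fixed requirements):

```python
def sliding_windows(input_ids: list[int], window: int = 512, stride: int = 256) -> list[list[int]]:
    """Split a long token-id sequence into overlapping windows."""
    if len(input_ids) <= window:
        return [input_ids]
    chunks = []
    for start in range(0, len(input_ids), stride):
        chunks.append(input_ids[start:start + window])
        if start + window >= len(input_ids):
            break
    return chunks

# Each chunk is then classified separately and the per-chunk scores aggregated (e.g. averaged)
chunks = sliding_windows(list(range(1000)), window=512, stride=256)
print([len(c) for c in chunks])  # → [512, 512, 488]
```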

3. Dynamic Padding

from transformers import DataCollatorWithPadding

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
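Under the hood, the collator pads each batch only to the length of its longest member rather than to a global `max_length`, which saves compute on short batches. A simplified pure-Python sketch of that behavior (a pad id of 0 is an assumption; real tokenizers expose `tokenizer.pad_token_id`):

```python
def pad_batch(batch: list[list[int]], pad_id: int = 0) -> dict[str, list[list[int]]]:
    """Pad sequences to the longest in the batch, with a matching attention mask."""
    longest = max(len(seq) for seq in batch)
    input_ids = [seq + [pad_id] * (longest - len(seq)) for seq in batch]
    attention_mask = [[1] * len(seq) + [0] * (longest - len(seq)) for seq in batch]
    return {"input_ids": input_ids, "attention_mask": attention_mask}

padded = pad_batch([[101, 7592, 102], [101, 102]])
print(padded["input_ids"])       # → [[101, 7592, 102], [101, 102, 0]]
print(padded["attention_mask"])  # → [[1, 1, 1], [1, 1, 0]]
```

Pass the real collator to `Trainer(..., data_collator=data_collator)` or as `collate_fn` in a `DataLoader`.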

4. Evaluation Metrics

import evaluate
import numpy as np

accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

output = trainer.predict(tokenized_datasets["validation"])
preds = np.argmax(output.predictions, axis=-1)  # logits → predicted class ids

metrics = {
    "accuracy": accuracy.compute(predictions=preds, references=output.label_ids),
    "f1": f1.compute(predictions=preds, references=output.label_ids),
}

5. Save and Load

# Save
model.save_pretrained("./my-model")
tokenizer.save_pretrained("./my-model")

# Load
model = AutoModel.from_pretrained("./my-model")
tokenizer = AutoTokenizer.from_pretrained("./my-model")

Performance Optimization

Flash Attention

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    attn_implementation="flash_attention_2",  # replaces the older use_flash_attention_2=True
    torch_dtype=torch.bfloat16,  # Flash Attention 2 requires fp16/bf16 weights
)

BetterTransformer

# Requires the optimum package: pip install optimum
from optimum.bettertransformer import BetterTransformer

model = BetterTransformer.transform(model)

torch.compile (PyTorch 2.0+)

import torch

model = torch.compile(model)

Integration

  • LangChain: Use Hugging Face models in LLM applications
  • Vector Databases: Generate embeddings for semantic search
  • MLflow: Track training experiments
  • SageMaker: Deploy at scale
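For the vector-database use case, sentence embeddings are commonly built by mean-pooling a model's last hidden states over non-padding tokens. A sketch of the pooling step; the checkpoint named in the comments is a common choice, not prescribed:

```python
import torch

def mean_pool(last_hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over non-padding positions only."""
    mask = attention_mask.unsqueeze(-1).float()            # (batch, seq, 1)
    return (last_hidden * mask).sum(dim=1) / mask.sum(dim=1)

# With a real checkpoint (downloads weights):
# from transformers import AutoModel, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
# model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
# enc = tok(["semantic search query"], padding=True, truncation=True, return_tensors="pt")
# with torch.no_grad():
#     embedding = mean_pool(model(**enc).last_hidden_state, enc["attention_mask"])
```

Masking before averaging matters: without it, padding positions would drag every embedding toward the pad token's representation.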

Source

git clone https://github.com/muhammederem/chief
# Skill file: .claude/skills/ml-ai/huggingface/SKILL.md

Overview

Hugging Face Transformers is a library providing pre-trained models for Natural Language Processing (NLP), Computer Vision, and Audio tasks, with support for PyTorch, TensorFlow, and JAX. It offers model loading, tokenization, pipelines, and fine-tuning utilities to accelerate AI development.

How This Skill Works

Models are loaded with AutoModel and AutoTokenizer, then applied via the Pipeline API for tasks like sentiment analysis, question answering, text generation, or named entity recognition. You can fine-tune using the Trainer API or a custom training loop, and explore parameter-efficient tuning with PEFT such as LoRA.

When to Use It

  • You need quick NLP tasks like sentiment analysis, QA, or NER using pre-trained models via simple pipelines
  • You want to fine-tune a model on a custom dataset using the Trainer API
  • You need tokenization and preprocessing with AutoTokenizer for consistent input
  • You want end-to-end workflows for NLP, CV, or audio tasks using the Pipeline API
  • You want to experiment with parameter-efficient fine-tuning using PEFT like LoRA

Quick Start

  1. Install transformers along with datasets, evaluate, and accelerate
  2. Load a model and tokenizer with AutoModel/AutoTokenizer or use a ready-made pipeline
  3. Run a simple inference or fine-tune with Trainer on a tokenized dataset

Best Practices

  • Choose the right pipeline for the task: sentiment-analysis, question-answering, text-generation, or ner
  • Load models with AutoModel and AutoTokenizer to ensure compatibility with the chosen weights
  • Prefer fast tokenizers when available to speed up preprocessing
  • Use datasets.load_dataset to prepare data and apply a tokenization function during mapping
  • Fine-tune with Trainer or a custom loop and save the best model at checkpoints

Example Use Cases

  • Text classification with the sentiment-analysis pipeline on product reviews
  • Question answering over documents using the question-answering pipeline
  • Text generation with a GPT-2 style model for story prompts
  • Named entity recognition on news articles with the ner pipeline
  • Fine-tuning a BERT model on the GLUE MRPC dataset using the Trainer API

