huggingface-classifier
HuggingFace Classifier Skill
Capabilities
- Fine-tune transformer models for classification
- Configure training pipelines with Trainer API
- Implement inference with optimizations
- Design label schemas and mappings
- Set up model evaluation and metrics
- Deploy models with HF Inference API
Target Processes
- intent-classification-system
- entity-extraction-slot-filling
Implementation Details
Model Types
- BERT-based: bert-base-uncased, distilbert
- RoBERTa-based: roberta-base, xlm-roberta
- DeBERTa: deberta-v3-base
- Domain-specific: FinBERT, BioBERT
Training Configuration
- Dataset preparation
- Tokenization settings
- Training arguments
- Evaluation metrics
- Early stopping
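The items above can be sketched end to end with the Trainer API. A minimal sketch, assuming a `datasets.DatasetDict` with "text" and "label" columns and train/validation splits; the checkpoint and hyperparameters are illustrative defaults, not prescriptions:

```python
# Sketch of a fine-tuning run with the Trainer API. Assumes a labeled
# DatasetDict; checkpoint, batch size, and learning rate are examples.
import numpy as np

def compute_metrics(eval_pred):
    """Accuracy from the (logits, labels) pair the Trainer passes in."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": float((preds == labels).mean())}

def fine_tune(dataset, checkpoint="distilbert-base-uncased", num_labels=4):
    # Heavy imports kept inside the function so the metrics helper above
    # stays importable without transformers installed.
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, EarlyStoppingCallback,
                              Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    tokenized = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
        batched=True)

    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels)

    args = TrainingArguments(
        output_dir="intent-classifier",
        learning_rate=2e-5,
        per_device_train_batch_size=16,
        num_train_epochs=5,
        eval_strategy="epoch",        # evaluation_strategy in older releases
        save_strategy="epoch",
        load_best_model_at_end=True,  # required for early stopping
        metric_for_best_model="accuracy",
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["validation"],
        compute_metrics=compute_metrics,
        callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
    )
    trainer.train()
    return trainer
```

Early stopping here halts training once validation accuracy stops improving for two consecutive evaluations.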
Configuration Options
- Model selection
- Number of labels
- Training hyperparameters
- Batch sizes
- Learning rate schedules
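Label selection and count come together in the label schema. A minimal sketch; the intent names are hypothetical examples, not part of the skill:

```python
# Illustrative label schema for a support-desk intent classifier; the
# intent names below are made-up placeholders.
INTENTS = ["check_balance", "reset_password", "track_order", "speak_to_agent"]

label2id = {name: i for i, name in enumerate(INTENTS)}
id2label = {i: name for i, name in enumerate(INTENTS)}
num_labels = len(INTENTS)

# Passing num_labels, id2label, and label2id to
# AutoModelForSequenceClassification.from_pretrained(...) stores the
# mappings in the model config, so predictions decode to readable intents.
```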
Best Practices
- Use appropriate base model
- Proper train/val/test splits
- Monitor for overfitting
- Evaluate on representative data
Dependencies
- transformers
- datasets
- accelerate
Source
Repository: git clone https://github.com/a5c-ai/babysitter (skill definition at plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/huggingface-classifier/SKILL.md)
Overview
Fine-tune transformer models for intent classification and deploy them for scalable inference. This skill covers model selection, training pipelines with the Trainer API, evaluation metrics, and deployment via the HF Inference API, plus label schema and mapping design.
How This Skill Works
It supports multiple model families: BERT-based, RoBERTa-based, DeBERTa, and domain-specific variants such as FinBERT and BioBERT. You prepare a dataset, tokenize it, configure training arguments and evaluation metrics, and fine-tune with the Trainer API using early stopping. Inference runs with standard optimizations and can be deployed to production through the HF Inference API behind scalable endpoints.
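The inference step described above might look like the following sketch; the model path is a placeholder for a fine-tuned checkpoint directory, and torch/transformers are imported lazily so the decoding helper stands alone:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis (logits -> probabilities)."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def classify(texts, model_path="intent-classifier"):
    """Return one (intent_name, confidence) pair per input text.

    model_path is a placeholder for your own fine-tuned checkpoint.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(model_path)
    model.eval()
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits.numpy()
    probs = softmax(logits)
    preds = probs.argmax(axis=-1)
    # id2label comes from the label schema saved with the model config.
    return [(model.config.id2label[int(i)], float(probs[n, i]))
            for n, i in enumerate(preds)]
```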
When to Use It
- Building an enterprise chatbot to route customer queries to the correct team based on detected intent
- Implementing a multi-label or multi-intent classifier for a support desk or help center
- Domain-specific intent classification using FinBERT for finance or BioBERT for healthcare
- Deploying a scalable inference pipeline via the Hugging Face Inference API for real-time classification
- Iteratively improving model performance with train/val/test splits and monitoring metrics
Quick Start
- Step 1: Select a base model (BERT, RoBERTa, DeBERTa, or a domain-specific variant) and prepare a labeled dataset with intents
- Step 2: Tokenize the data, configure training arguments and metrics, and fine-tune using the Trainer API with early stopping
- Step 3: Run evaluation, then deploy the trained model via the HF Inference API and integrate into your application
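Step 3's hosted deployment can be called from application code roughly like this stdlib-only sketch; the model id and the HF_TOKEN environment variable are placeholders for your own Hub repo and access token:

```python
import json
import os
import urllib.request

# Serverless Inference API endpoint pattern for a model hosted on the Hub.
API_URL = "https://api-inference.huggingface.co/models/{model_id}"

def query_intent(text, model_id="your-org/intent-classifier"):
    """POST one utterance to the hosted Inference API.

    model_id is a placeholder; HF_TOKEN must hold a valid access token.
    """
    req = urllib.request.Request(
        API_URL.format(model_id=model_id),
        data=json.dumps({"inputs": text}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['HF_TOKEN']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        # Text-classification responses are nested one level per input.
        return json.loads(resp.read())[0]
```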
Best Practices
- Choose an appropriate base model that matches your domain and text length
- Create proper train, validation and test splits that reflect real usage
- Monitor for overfitting and validate on representative data
- Experiment with label schemas and mappings to ensure clear intents
- Validate deployment with HF Inference API and monitor latency
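The split guidance above can be sketched in plain Python; the fractions and seed are illustrative:

```python
import random

def make_splits(examples, seed=42, val_frac=0.1, test_frac=0.1):
    """Shuffle labeled examples and cut them into train/validation/test."""
    rng = random.Random(seed)  # fixed seed keeps splits reproducible
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    return {
        "test": shuffled[:n_test],
        "validation": shuffled[n_test:n_test + n_val],
        "train": shuffled[n_test + n_val:],
    }
```

With the datasets library, `Dataset.train_test_split` offers equivalent seedable splitting; either way, check that label distributions in each split reflect real usage.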
Example Use Cases
- Retail bot routing orders and returns intents to the right service team
- Tech support ticket classification by issue type to speed up triage
- Banking assistant handling balance and statement queries with FinBERT
- Healthcare assistant classifying triage related intents using BioBERT
- Multilingual support with cross-lingual intents using XLM-RoBERTa
Add this skill to your agents:
npx machina-cli add skill a5c-ai/babysitter/huggingface-classifier --openclaw