spacy-ner
npx machina-cli add skill a5c-ai/babysitter/spacy-ner --openclaw
spaCy NER Skill
Capabilities
- Train custom spaCy NER models
- Configure entity extraction pipelines
- Design annotation schemas
- Implement entity linking
- Set up model evaluation
- Deploy efficient NER inference
Target Processes
- entity-extraction-slot-filling
- chatbot-design-implementation
Implementation Details
spaCy Components
- NER: Named Entity Recognition
- EntityLinker: Link to knowledge bases
- EntityRuler: Rule-based matching
- SpanCategorizer: Overlapping entities
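As a rough illustration of the rule-based component, the sketch below adds an EntityRuler to a blank English pipeline. The PRODUCT and BRAND labels, the patterns, and the sample sentence are assumptions chosen for illustration; they are not part of this skill.

```python
import spacy

# Blank English pipeline: tokenizer only, so no pretrained model download is needed.
nlp = spacy.blank("en")

# "entity_ruler" is the registered factory name for spaCy's EntityRuler.
ruler = nlp.add_pipe("entity_ruler")

# Hypothetical patterns: one exact phrase and one token-level pattern.
ruler.add_patterns([
    {"label": "PRODUCT", "pattern": "Pixel 8"},
    {"label": "BRAND", "pattern": [{"LOWER": "google"}]},
])

doc = nlp("Is the Google Pixel 8 in stock?")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

Rules like these can complement a statistical NER component, for example to cover closed vocabularies such as product catalogs.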
Training Configuration
- config.cfg setup
- Training data format (spaCy v3)
- Augmentation strategies
- Evaluation metrics
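spaCy v3 reads training data from binary .spacy files, which are serialized DocBin objects. The sketch below converts character-offset annotations into that format. The sample texts, the SYMPTOM and DATE labels, and the train.spacy file name mentioned in the comments are assumptions for illustration.

```python
import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en")

# Hypothetical annotations: (text, [(start_char, end_char, label), ...]).
raw = [
    ("I have a headache", [(9, 17, "SYMPTOM")]),
    ("Book a table for Friday", [(17, 23, "DATE")]),
]

doc_bin = DocBin()
for text, spans in raw:
    doc = nlp.make_doc(text)
    ents = []
    for start, end, label in spans:
        span = doc.char_span(start, end, label=label)
        if span is None:
            # Offsets that do not align with token boundaries are skipped here.
            continue
        ents.append(span)
    doc.ents = ents
    doc_bin.add(doc)

# Serialize in memory; doc_bin.to_disk("train.spacy") would write the file
# that the training config points to.
data = doc_bin.to_bytes()
restored = list(DocBin().from_bytes(data).get_docs(nlp.vocab))
print([(ent.text, ent.label_) for d in restored for ent in d.ents])
```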
Configuration Options
- Base model selection (en_core_web_*)
- Custom entity types
- Training parameters
- GPU acceleration
- Model packaging
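A minimal sketch of two of these options, custom entity types and GPU acceleration, might look like the following. It starts from a blank English pipeline so that it runs without downloading an en_core_web_* package; the PRODUCT and SYMPTOM labels are assumptions for illustration.

```python
import spacy

# prefer_gpu() returns True if a GPU was activated and False if spaCy
# falls back to CPU. It should be called before the pipeline is created.
using_gpu = spacy.prefer_gpu()

# Assumption: a blank pipeline stands in for a pretrained base model here.
nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")

# Register hypothetical custom entity types on the NER component.
for label in ("PRODUCT", "SYMPTOM"):
    ner.add_label(label)

print(using_gpu, ner.labels)
```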
Best Practices
- Quality annotation data
- Balance entity types
- Use Prodigy for annotation
- Regular model evaluation
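One simple way to check whether entity types are balanced is to count labels across the annotated data. The helper name, the sample annotations, and the 30% threshold below are all assumptions for illustration, not recommendations from this skill.

```python
from collections import Counter


def label_distribution(annotations):
    """Return each label's share of all annotated spans."""
    counts = Counter(
        label for _, spans in annotations for _, _, label in spans
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Hypothetical annotations: (text, [(start_char, end_char, label), ...]).
annotations = [
    ("Pixel 8 or Galaxy S24", [(0, 7, "PRODUCT"), (11, 21, "PRODUCT")]),
    ("Ship the Pixel 8 on Friday", [(9, 16, "PRODUCT"), (20, 26, "DATE")]),
]

distribution = label_distribution(annotations)
# Arbitrary threshold: flag labels that make up less than 30% of all spans.
rare_labels = [label for label, share in distribution.items() if share < 0.3]
print(distribution, rare_labels)
```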
Dependencies
- spacy
- spacy-transformers (optional)
Source
https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/spacy-ner/SKILL.md
Overview
Build and deploy custom spaCy NER models for chat-based systems. This skill covers designing annotation schemas, configuring entity extraction pipelines, linking entities to knowledge bases, evaluating model performance, and efficient inference deployment.
How This Skill Works
Use spaCy components (NER, EntityLinker, EntityRuler, SpanCategorizer) to recognize and link entities within dialogues. Define a spaCy v3 pipeline in config.cfg, prepare and augment training data, and evaluate the model before packaging it for deployment.
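The difference between NER and SpanCategorizer comes down to where spans are stored: doc.ents holds non-overlapping entities, while doc.spans holds span groups that may overlap. The sketch below illustrates this with hand-built spans; the sample text and the ORG and FAC labels are assumptions, and "sc" is the span group key that SpanCategorizer uses by default.

```python
import spacy
from spacy.tokens import Span

nlp = spacy.blank("en")
doc = nlp("Bank of America Tower")

# Two hypothetical spans that share tokens.
org = Span(doc, 0, 3, label="ORG")  # "Bank of America"
fac = Span(doc, 0, 4, label="FAC")  # "Bank of America Tower"

# Span groups accept overlapping spans.
doc.spans["sc"] = [org, fac]

# doc.ents rejects overlapping spans.
try:
    doc.ents = [org, fac]
    overlap_allowed = True
except ValueError:
    overlap_allowed = False

print(len(doc.spans["sc"]), overlap_allowed)
```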
When to Use It
- Designing a slot-filling chatbot that requires accurate entity extraction
- Building a conversational agent with knowledge-base linkage for entities
- Training domain-specific entity types (products, symptoms, dates, etc.)
- Setting up and measuring model performance with defined evaluation metrics
- Deploying a trained NER model in production with GPU-accelerated inference
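For the slot-filling case, extracted entities are typically mapped onto the slots a dialogue needs. The sketch below shows one simple policy (the first mention of each label wins); the function name, slot names, and sample entities are hypothetical.

```python
def fill_slots(entities, required_slots):
    """Map (text, label) entity pairs onto slots; the first mention wins."""
    slots = {name: None for name in required_slots}
    for text, label in entities:
        if label in slots and slots[label] is None:
            slots[label] = text
    # Slots still empty are the ones the chatbot should ask about next.
    missing = [name for name, value in slots.items() if value is None]
    return slots, missing


# Hypothetical entities, e.g. [(ent.text, ent.label_) for ent in doc.ents].
entities = [("Pixel 8", "PRODUCT"), ("Friday", "DATE"), ("Monday", "DATE")]
slots, missing = fill_slots(entities, ("PRODUCT", "DATE", "QUANTITY"))
print(slots, missing)
```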
Quick Start
- Step 1: Define domain-specific entities and design an annotation schema; annotate sample dialogs (consider using Prodigy).
- Step 2: Configure a spaCy v3 pipeline (NER, EntityLinker, EntityRuler) in config.cfg and convert the training data to spaCy's binary .spacy format.
- Step 3: Train the model, evaluate with defined metrics, and package the model for deployment.
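The steps above are usually run through the spaCy CLI (python -m spacy init config, train, and package). As a compact in-memory alternative, the following sketch trains a tiny NER component directly in Python. The training sentences, the PRODUCT label, and the epoch count are assumptions, and two examples are far too few for a usable model.

```python
import random

import spacy
from spacy.training import Example

# Hypothetical training sentences with the entity text to annotate.
samples = [
    ("I want to buy a Pixel 8", "Pixel 8", "PRODUCT"),
    ("Do you stock the Galaxy S24?", "Galaxy S24", "PRODUCT"),
]

nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")

examples = []
for text, entity, label in samples:
    start = text.index(entity)  # character offset of the entity
    annotations = {"entities": [(start, start + len(entity), label)]}
    examples.append(Example.from_dict(nlp.make_doc(text), annotations))
    ner.add_label(label)

# initialize() sets up the model weights and returns an optimizer.
optimizer = nlp.initialize(lambda: examples)

for epoch in range(20):
    random.shuffle(examples)
    losses = {}
    nlp.update(examples, sgd=optimizer, losses=losses)

print(losses)
```

A real project would hold out a development set, evaluate on it, and save the result with nlp.to_disk before packaging.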
Best Practices
- Quality annotation data
- Balance entity types
- Use Prodigy or similar tools for efficient annotation
- Regular model evaluation with defined metrics
- Leverage GPU acceleration when available and appropriate
Example Use Cases
- Healthcare chatbot extracting patient name, symptoms, and medication from user messages
- E-commerce assistant identifying products, brands, and prices for slot filling
- Support bot linking entities to a product knowledge base via EntityLinker
- Customer service agent evaluating NER with F1/precision/recall on a held-out set
- Live deployment of a spaCy NER model with GPU-accelerated inference in production
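To make the F1/precision/recall use case concrete, here is a minimal sketch of exact-match, span-level scoring on a held-out example. The function name and the gold and predicted spans are hypothetical. spaCy's own evaluation reports the corresponding scores as ents_p, ents_r, and ents_f.

```python
def span_prf(gold, predicted):
    """Exact-match precision, recall, and F1 over (start, end, label) spans."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical spans: the model finds the name but mislabels the symptom
# and misses the medication.
gold = [(0, 5, "PERSON"), (10, 18, "SYMPTOM"), (25, 32, "DRUG")]
predicted = [(0, 5, "PERSON"), (10, 18, "DRUG")]

precision, recall, f1 = span_prf(gold, predicted)
print(precision, recall, f1)
```

A span counts as correct only when its boundaries and label both match, so a mislabeled span lowers precision and recall at the same time.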