milvus-integration
Install: npx machina-cli add skill a5c-ai/babysitter/milvus-integration --openclaw
Milvus Integration Skill
Capabilities
- Set up Milvus (Lite, Standalone, Cluster)
- Design collection schemas with dynamic fields
- Configure index types (IVF, HNSW, etc.)
- Implement partition strategies
- Set up GPU acceleration
- Handle large-scale data operations
Target Processes
- vector-database-setup
- rag-pipeline-implementation
Implementation Details
Deployment Modes
- Milvus Lite: Embedded for development
- Standalone: Single-node deployment
- Cluster: Distributed deployment with K8s
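The three modes differ mainly in the connection URI; with recent pymilvus the client call is the same in every mode. A minimal sketch (the URIs are illustrative defaults and the Kubernetes hostname is hypothetical):

```python
def milvus_uri(mode: str) -> str:
    """Return a connection URI for each deployment mode (values are examples)."""
    uris = {
        "lite": "./milvus_demo.db",              # Milvus Lite: a local file, no server process
        "standalone": "http://localhost:19530",  # single-node server on the default port
        "cluster": "http://milvus.milvus.svc.cluster.local:19530",  # hypothetical K8s service
    }
    return uris[mode]

# Connecting looks the same regardless of mode (pymilvus >= 2.4):
#   from pymilvus import MilvusClient
#   client = MilvusClient(uri=milvus_uri("lite"))
```

Milvus Lite is convenient for development because the "database" is just a file path; switching to Standalone or Cluster later is only a URI change.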
Core Operations
- Collection and schema management
- Index creation and configuration
- Insert/delete/query operations
- Partition management
- Bulk import
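The operation cycle above can be sketched against Milvus Lite with the `MilvusClient` API. This is a sketch under the assumption of pymilvus >= 2.4; the collection name, field names, and dimension are illustrative:

```python
def run_core_operations(uri: str = "./demo.db") -> list:
    """Create a collection, insert, query, and delete one row (sketch)."""
    from pymilvus import MilvusClient  # imported here so the sketch stays self-contained

    client = MilvusClient(uri=uri)
    client.create_collection(collection_name="docs", dimension=4)

    # Insert: each row is a dict; keys outside the schema land in the dynamic field.
    client.insert(
        collection_name="docs",
        data=[{"id": 1, "vector": [0.1, 0.2, 0.3, 0.4], "title": "intro"}],
    )

    # Query by scalar filter, then delete by primary key.
    rows = client.query(collection_name="docs", filter="id == 1", output_fields=["title"])
    client.delete(collection_name="docs", ids=[1])
    return rows
```

Running this requires a Milvus Lite-capable environment; on platforms without Lite support, point `uri` at a Standalone or Cluster endpoint instead.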
Configuration Options
- Index type selection (IVF_FLAT, IVF_SQ8, HNSW)
- Metric type (L2, IP, COSINE)
- Index parameters (nlist, nprobe, M, efConstruction)
- Partition key configuration
- Resource group assignment
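These options combine into an index definition. A sketch of two common choices (the specific parameter values are illustrative starting points, not benchmarks):

```python
# Graph index: good recall/latency at moderate-to-large scale, higher memory use.
hnsw_index = {
    "index_type": "HNSW",
    "metric_type": "COSINE",
    "params": {"M": 16, "efConstruction": 200},  # graph degree / build-time search width
}

# Quantized inverted index: lower memory footprint for very large collections.
ivf_index = {
    "index_type": "IVF_SQ8",
    "metric_type": "L2",
    "params": {"nlist": 1024},  # number of coarse clusters; probed subsets at query time
}

# With pymilvus >= 2.4 these map onto the client API roughly as:
#   idx = client.prepare_index_params()
#   idx.add_index(field_name="vector", **hnsw_index)
#   client.create_index(collection_name="docs", index_params=idx)
```

The metric type must match how the embeddings were trained (e.g. COSINE or IP for normalized text embeddings, L2 for raw Euclidean features).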
Best Practices
- Choose index type based on scale
- Use partitions for data isolation
- Configure proper nprobe for recall
- Monitor query latency and throughput
Dependencies
- pymilvus
- langchain-milvus
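Both dependencies install from PyPI; recent pymilvus releases bundle Milvus Lite on supported platforms, while Standalone and Cluster modes need a separate server:

```shell
pip install pymilvus langchain-milvus
```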
Source
Source: https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/milvus-integration/SKILL.md
Overview
This skill configures Milvus as a distributed vector database for large-scale retrieval-augmented generation (RAG) workloads. It covers deployment modes (Lite, Standalone, Cluster), dynamic collection schemas, index configurations, partition strategies, and GPU acceleration to scale data operations.
How This Skill Works
The skill guides deployment-mode-aware setup, collection and schema management, and index/partition configuration. It supports insert/delete/query operations and bulk import, with configurable index type, metric, and resource grouping to balance recall, latency, and throughput at scale.
When to Use It
- Setting up Milvus across Lite, Standalone, or Cluster deployments
- Designing collections with dynamic fields for evolving schemas
- Tuning index types (IVF, HNSW) and metrics to optimize recall and latency
- Implementing partition strategies for data isolation and parallelism
- Deploying Milvus with GPU acceleration for high-throughput RAG workloads
Quick Start
- Step 1: Choose deployment mode (Lite, Standalone, Cluster) and install pymilvus and dependencies
- Step 2: Create a dynamic collection schema, select an index type and parameters, and configure partitions
- Step 3: Connect a client, perform bulk import, then run insert, delete, and query operations while monitoring latency
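Step 3's "query while monitoring latency" can be sketched as a small timing wrapper around the client's search call (pymilvus >= 2.4 assumed; collection and parameter names are illustrative):

```python
import time

def timed_search(client, collection: str, query_vec: list, nprobe: int = 16):
    """Run one vector search and return (hits, latency_ms)."""
    t0 = time.perf_counter()
    hits = client.search(
        collection_name=collection,
        data=[query_vec],          # one query vector; batch by passing more
        limit=5,                   # top-k results
        search_params={"metric_type": "L2", "params": {"nprobe": nprobe}},
    )
    return hits, (time.perf_counter() - t0) * 1000.0
```

Logging these per-query latencies during bulk import and steady-state traffic gives the baseline needed for the tuning advice below.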
Best Practices
- Choose index type based on scale and query patterns
- Use partitions to isolate data and improve parallelism
- Tune nprobe, nlist, M, and efConstruction to balance recall and latency
- Monitor query latency and throughput with dashboards
- Validate performance with a representative workload before production
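For the nprobe tuning point above, a small helper can generate a sweep to validate against a representative workload. The 1-10%-of-nlist range is a common rule of thumb, not a Milvus guarantee:

```python
def nprobe_candidates(nlist: int) -> list:
    """Suggest nprobe values to sweep: start near 1% of nlist, double up to ~10%."""
    lo = max(1, nlist // 100)
    hi = max(lo, nlist // 10)
    vals, v = [], lo
    while v <= hi:
        vals.append(v)
        v *= 2
    return vals
```

Measure recall (against brute-force ground truth) and latency at each candidate, then pick the smallest nprobe meeting the recall target.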
Example Use Cases
- A Kubernetes-based Milvus Cluster powering a large-scale product-support RAG system
- Standalone Milvus setup for an internal knowledge base with tuned IVF_SQ8 indices
- GPU-accelerated Milvus deployment to reduce retrieval latency for embeddings
- Partitioned collections per customer or department to meet data isolation requirements
- Bulk import of millions of vectors from logs for archival-search use cases