
chroma-integration

npx machina-cli add skill a5c-ai/babysitter/chroma-integration --openclaw
Files (1)
SKILL.md
1.2 KB

Chroma Integration Skill

Capabilities

  • Set up Chroma (ephemeral, persistent, client-server)
  • Create and manage collections
  • Implement document ingestion with embeddings
  • Configure metadata filtering
  • Set up multi-tenant collections
  • Implement where and where_document filters

Target Processes

  • vector-database-setup
  • rag-pipeline-implementation

Implementation Details

Deployment Modes

  1. Ephemeral: In-memory for testing
  2. Persistent: Local file-based storage
  3. Client-Server: Chroma server deployment

Core Operations

  • Collection creation with embedding functions
  • Add/update/delete documents
  • Query with filters
  • Metadata management

Configuration Options

  • Embedding function selection
  • Persistence directory
  • Distance metric (l2, ip, cosine)
  • Collection metadata
  • Server configuration

Best Practices

  • Use persistent mode for development
  • Deploy server mode for production
  • Design metadata schema upfront
  • Implement proper ID strategies
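For the ID point, one common strategy (a hypothetical helper, not part of the skill) is to derive deterministic IDs from the source and chunk index, so re-ingesting the same content replaces stale entries instead of accumulating duplicates:

```python
import hashlib

def doc_id(source: str, chunk_index: int) -> str:
    # Deterministic: the same (source, chunk) pair always yields the same ID,
    # so upserting the same chunk overwrites rather than duplicates
    return hashlib.sha256(f"{source}:{chunk_index}".encode()).hexdigest()[:16]
```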

Dependencies

  • chromadb
  • langchain-chroma

Source

git clone https://github.com/a5c-ai/babysitter

SKILL.md: plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/chroma-integration/SKILL.md

Overview

Chroma integration provides local vector database setup and operations for development and production environments. It supports ephemeral, persistent, and client-server modes, enabling collection management, document ingestion with embeddings, and metadata filtering. It also supports multi-tenant setups and targeted retrieval through where and where_document query filters.

How This Skill Works

Configure deployment mode (ephemeral, persistent, or client-server), select embedding function, and specify a persistence directory or server config. Create and manage collections, ingest documents with embeddings, and apply metadata filters during queries. Use distance metrics (l2, ip, cosine) and metadata schemas to drive precise retrieval.

When to Use It

  • During development to prototype embedding-backed search with in-memory data (ephemeral).
  • When you need persistent storage for long-running projects (local file-based).
  • To run a Chroma server for production with client-server access.
  • When designing multi-tenant collections and applying metadata filters.
  • When tuning distance metrics and embedding models for retrieval quality.

Quick Start

  1. Choose a deployment mode (ephemeral, persistent, or client-server) and set your persistence directory or server config.
  2. Create a collection with an embedding function and optional metadata schema; ingest documents with embeddings.
  3. Query with filters and refine metadata; iterate on the distance metric and embeddings as needed.

Best Practices

  • Prefer persistent mode for development to preserve work.
  • Deploy a Chroma server in production for scalability and isolation.
  • Design a clear metadata schema before ingesting documents.
  • Implement robust ID strategies for deterministic retrieval.
  • Choose an appropriate distance metric (l2, ip, cosine) for your data.

Example Use Cases

  • Ingest a set of product docs into a collection with embeddings and metadata.
  • Query the collection with filters to retrieve relevant results.
  • Set up multi-tenant collections for different teams or users.
  • Promote a prototype from ephemeral to persistent mode so data survives restarts.
  • Deploy a Chroma server to enable client-server access in prod.
