
chroma-integration

npx machina-cli add skill a5c-ai/babysitter/chroma-integration --openclaw
Files (1)
SKILL.md
1.2 KB

Chroma Integration Skill

Capabilities

  • Set up Chroma (ephemeral, persistent, client-server)
  • Create and manage collections
  • Implement document ingestion with embeddings
  • Configure metadata filtering
  • Set up multi-tenant collections
  • Implement where and where_document filters

Target Processes

  • vector-database-setup
  • rag-pipeline-implementation

Implementation Details

Deployment Modes

  1. Ephemeral: In-memory for testing
  2. Persistent: Local file-based storage
  3. Client-Server: Chroma server deployment

Core Operations

  • Collection creation with embedding functions
  • Add/update/delete documents
  • Query with filters
  • Metadata management

Configuration Options

  • Embedding function selection
  • Persistence directory
  • Distance metric (l2, ip, cosine)
  • Collection metadata
  • Server configuration

Best Practices

  • Use persistent mode for development
  • Deploy server mode for production
  • Design metadata schema upfront
  • Implement proper ID strategies
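For the ID point, one common strategy (a hypothetical helper, not part of the skill) is to derive deterministic IDs from the source and chunk index, so re-ingesting the same content replaces stale entries instead of accumulating duplicates:

```python
import hashlib

def doc_id(source: str, chunk_index: int) -> str:
    # Deterministic: the same (source, chunk) pair always yields the same ID,
    # so upserting the same chunk overwrites rather than duplicates
    return hashlib.sha256(f"{source}:{chunk_index}".encode()).hexdigest()[:16]
```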

Dependencies

  • chromadb
  • langchain-chroma

Source

git clone https://github.com/a5c-ai/babysitter

SKILL.md: plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/chroma-integration/SKILL.md

Overview

Chroma integration provides local vector database setup and operations for development and production environments. It supports ephemeral, persistent, and client-server modes, enabling collection management, document ingestion with embeddings, and metadata filtering. It also supports multi-tenant setups and targeted retrieval through where and where_document query filters.

How This Skill Works

Configure deployment mode (ephemeral, persistent, or client-server), select embedding function, and specify a persistence directory or server config. Create and manage collections, ingest documents with embeddings, and apply metadata filters during queries. Use distance metrics (l2, ip, cosine) and metadata schemas to drive precise retrieval.

When to Use It

  • During development to prototype embedding-backed search with in-memory data (ephemeral).
  • When you need persistent storage for long-running projects (local file-based).
  • To run a Chroma server for production with client-server access.
  • When designing multi-tenant collections and applying metadata filters.
  • When tuning distance metrics and embedding models for retrieval quality.

Quick Start

  1. Choose a deployment mode (ephemeral, persistent, or client-server) and set your persistence directory or server config.
  2. Create a collection with an embedding function and optional metadata schema; ingest documents with embeddings.
  3. Query with filters and refine metadata; iterate on the distance metric and embeddings as needed.

Best Practices

  • Prefer persistent mode for development to preserve work.
  • Deploy a Chroma server in production for scalability and isolation.
  • Design a clear metadata schema before ingesting documents.
  • Implement robust ID strategies for deterministic retrieval.
  • Choose an appropriate distance metric (l2, ip, cosine) for your data.

Example Use Cases

  • Ingest a set of product docs into a collection with embeddings and metadata.
  • Query the collection with filters to retrieve relevant results.
  • Set up multi-tenant collections for different teams or users.
  • Promote a prototype from ephemeral to persistent mode so data survives restarts.
  • Deploy a Chroma server to enable client-server access in prod.
