What is the Weaviate integration skill?

It provides end-to-end setup of a Weaviate vector database, schema management, data import, vector and hybrid search, and GraphQL querying capabilities.

What can I configure in this integration?

You can configure vectorizer modules (text2vec-*, multi2vec-*), the replication factor, sharding, multi-tenancy settings, and per-module configuration to suit your data and workload.

What dependencies are required?

The skill lists two dependencies: weaviate-client for client interactions, and langchain-weaviate for integration with LangChain-powered workflows.

weaviate-integration

npx machina-cli add skill a5c-ai/babysitter/weaviate-integration --openclaw

Weaviate Integration Skill

Capabilities

Set up Weaviate cluster (cloud or self-hosted)
Define schemas with properties and vectorizers
Implement GraphQL queries
Configure hybrid search (vector + keyword)
Set up multi-tenancy
Implement batch import operations

Target Processes

vector-database-setup
rag-pipeline-implementation

Implementation Details

Core Operations

Schema Management: Class definitions and properties
Data Import: Single and batch object creation
Vector Search: nearVector, nearText queries
Hybrid Search: Combined vector and BM25
GraphQL: Flexible querying with Get and Aggregate
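
As a rough illustration of the vector and hybrid search operations listed above, the sketch below renders Weaviate GraphQL Get queries as plain strings. The helper names (build_get_query, near_text, near_vector, hybrid) and the Article class with its title and body properties are hypothetical and not part of the skill. Sending the query to a running instance is left out.

```python
import json


def near_text(concepts):
    """Render a nearText argument from a list of concept strings."""
    # json.dumps renders a Python list of strings as a GraphQL string list.
    return f"nearText: {{concepts: {json.dumps(concepts)}}}"


def near_vector(vector):
    """Render a nearVector argument from a list of floats."""
    return f"nearVector: {{vector: {json.dumps(vector)}}}"


def hybrid(query, alpha=0.5):
    """Render a hybrid argument.

    alpha=1 weights the vector search only, alpha=0 the keyword (BM25)
    search only.
    """
    return f"hybrid: {{query: {json.dumps(query)}, alpha: {alpha}}}"


def build_get_query(collection, fields, search_arg, limit=5):
    """Return a GraphQL Get query string for one class (hypothetical helper)."""
    field_block = " ".join(fields)
    return (
        "{ Get { "
        f"{collection}({search_arg}, limit: {limit}) "
        f"{{ {field_block} _additional {{ id }} }} "
        "} }"
    )


if __name__ == "__main__":
    # "Article", "title", and "body" are illustrative names only.
    print(build_get_query("Article", ["title", "body"],
                          near_text(["vector databases"]), limit=3))
    print(build_get_query("Article", ["title"], hybrid("red shoes", alpha=0.75)))
```

A string built this way would typically be sent as the "query" field of a JSON body to the instance's /v1/graphql endpoint, or passed to a client library's raw GraphQL method.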

Configuration Options

Vectorizer modules (text2vec-*, multi2vec-*)
Replication factor
Sharding configuration
Multi-tenancy settings
Module configuration
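
To make these options concrete, here is a minimal sketch that assembles a class definition in the JSON shape accepted by Weaviate's schema endpoint. The function name build_class_definition, its parameters, and the Article example are assumptions for illustration. text2vec-openai is only one possible vectorizer, and the values shown are not recommendations.

```python
def build_class_definition(name, properties, vectorizer="text2vec-openai",
                           replication_factor=1, shard_count=None,
                           multi_tenant=False, module_config=None):
    """Assemble a class definition dict (hypothetical helper).

    `properties` maps property names to Weaviate data types such as
    "text" or "int".
    """
    definition = {
        "class": name,
        "vectorizer": vectorizer,
        "properties": [
            {"name": prop, "dataType": [data_type]}
            for prop, data_type in properties.items()
        ],
        "replicationConfig": {"factor": replication_factor},
        "multiTenancyConfig": {"enabled": multi_tenant},
    }
    # Sharding is only set when a shard count is given explicitly.
    if shard_count is not None:
        definition["shardingConfig"] = {"desiredCount": shard_count}
    # Module settings are keyed by the module (vectorizer) name.
    if module_config:
        definition["moduleConfig"] = {vectorizer: module_config}
    return definition


if __name__ == "__main__":
    import json

    article = build_class_definition(
        "Article",
        {"title": "text", "wordCount": "int"},
        replication_factor=3,
        multi_tenant=True,
        module_config={"vectorizeClassName": False},
    )
    print(json.dumps(article, indent=2))
```

The resulting dict could be posted to /v1/schema or handed to a client library's schema-creation call. Which combinations of options are valid depends on the Weaviate version and deployment.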

Best Practices

Design schema for query patterns
Use appropriate vectorizer
Enable hybrid search for better recall
Configure proper backups
Monitor resource usage

Dependencies

weaviate-client
langchain-weaviate

Source

https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/weaviate-integration/SKILL.md

Overview

This skill covers setting up a Weaviate vector database (cloud or self-hosted), defining schemas with vectorizers, and implementing GraphQL queries and hybrid search. It also covers multi-tenancy and batch data imports for scalable, semantically rich applications.

How This Skill Works

It defines schemas with class properties and vectorizers, imports data (single or batch), and enables vector search via nearVector and nearText. Hybrid search combines vector results with keyword ranking (BM25), while GraphQL Get and Aggregate queries provide flexible data access; configuration options tailor vectorizers, replication, sharding, and multi-tenancy settings.
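
The idea of combining vector results with BM25 keyword ranking can be illustrated with a small, self-contained sketch. It normalizes each score set to [0, 1] and blends them with a weight alpha. This is a simplified illustration of weighted score fusion, not Weaviate's internal implementation, and the function names and sample scores are hypothetical.

```python
def normalize(scores):
    """Min-max scale a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores are equal, so treat every document as a top match.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - low) / (high - low) for doc, s in scores.items()}


def hybrid_fuse(vector_scores, keyword_scores, alpha=0.5):
    """Blend two score sets; alpha=1 is vector only, alpha=0 keyword only.

    Documents missing from one result set contribute 0 for that side.
    Returns (doc_id, fused_score) pairs sorted from best to worst.
    """
    vec = normalize(vector_scores)
    kw = normalize(keyword_scores)
    fused = {
        doc: alpha * vec.get(doc, 0.0) + (1 - alpha) * kw.get(doc, 0.0)
        for doc in set(vec) | set(kw)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up similarity and BM25 scores for four documents.
    vector_hits = {"a": 1.0, "b": 0.5, "c": 0.0}
    keyword_hits = {"b": 10.0, "c": 0.0, "d": 5.0}
    for doc, score in hybrid_fuse(vector_hits, keyword_hits, alpha=0.5):
        print(doc, score)
```

In this sample, document "b" ranks first because it scores reasonably on both sides, which is the behavior hybrid search is meant to capture.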

When to Use It

Building a RAG pipeline that relies on a semantic vector store for retrieval
Setting up a Weaviate cluster (cloud or self-hosted) for enterprise search or knowledge base access
Implementing GraphQL queries and aggregates to power analytics and dashboards
Enabling hybrid search (vector + keyword) to improve recall in product, document, or support search
Managing multi-tenant data isolation in a SaaS or multi-customer environment

Quick Start

Step 1: Initialize and configure a Weaviate cluster (cloud or self-hosted) and install the required client packages, such as weaviate-client
Step 2: Define schemas with class definitions, properties, and vectorizers suitable for your data
Step 3: Import data (single or batch), enable nearVector/nearText searches, and set up hybrid search and GraphQL queries
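
For the batch import part of Step 3, one possible approach is to split records into fixed-size batches and wrap each batch in the JSON shape used by Weaviate's batch objects endpoint. The sketch below only builds the payloads. The helper names, the Article class, and the batch size are illustrative assumptions, and the actual HTTP or client call is omitted.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def build_batch_payloads(collection, records, batch_size=100):
    """Yield one {"objects": [...]} payload per batch (hypothetical helper).

    Each record is a dict of property values for the given class.
    """
    for batch in chunked(records, batch_size):
        yield {
            "objects": [
                {"class": collection, "properties": record}
                for record in batch
            ]
        }


if __name__ == "__main__":
    # Five made-up records split into batches of two.
    records = [{"title": f"doc {i}"} for i in range(5)]
    for payload in build_batch_payloads("Article", records, batch_size=2):
        print(len(payload["objects"]), "objects in this batch")
```

Each payload could then be posted to /v1/batch/objects, or the same batching could be delegated to the batch helpers in weaviate-client. A suitable batch size depends on object size and available resources.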

Best Practices

Design the schema around common query patterns and the vectorizers you expect to use
Choose appropriate vectorizer modules (text2vec-*, multi2vec-*) for your data
Enable hybrid search to improve recall, and tune the balance between vector and keyword scoring to protect precision
Configure regular backups and restore tests for Weaviate data
Monitor resource usage (CPU/memory) and optimize replication and sharding settings

Example Use Cases

RAG-powered knowledge assistant using nearText queries over a document collection
Product catalog search combining embeddings with BM25 keywords for fast, relevant results
Document ingestion workflow with single and batch imports for large corpora
Multi-tenant chatbot with an isolated tenant for each customer
Analytics dashboards powered by GraphQL aggregates over semantically enriched data
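
For the analytics use case, a dashboard might start from simple object counts. The sketch below renders a GraphQL Aggregate query string, optionally grouped by one property. The function name build_aggregate_count and the Article and category names are hypothetical.

```python
def build_aggregate_count(collection, group_by=None):
    """Return a GraphQL Aggregate query that counts objects in a class.

    When `group_by` names a property, counts are reported per distinct value.
    """
    if group_by is None:
        return f"{{ Aggregate {{ {collection} {{ meta {{ count }} }} }} }}"
    return (
        f'{{ Aggregate {{ {collection}(groupBy: ["{group_by}"]) '
        "{ groupedBy { value } meta { count } } } }"
    )


if __name__ == "__main__":
    # "Article" and "category" are illustrative names only.
    print(build_aggregate_count("Article"))
    print(build_aggregate_count("Article", group_by="category"))
```

As with Get queries, the resulting string would be sent to the instance's GraphQL endpoint or through a client library.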
