rag-hybrid-search

npx machina-cli add skill a5c-ai/babysitter/rag-hybrid-search --openclaw

Implement hybrid search combining semantic vector retrieval with keyword-based BM25 search for improved RAG pipeline accuracy and recall.

Overview

Hybrid search addresses the limitations of pure semantic or pure keyword search:

  • Semantic search excels at conceptual similarity but may miss exact matches
  • Keyword search finds exact terms but lacks semantic understanding
  • Hybrid search combines both, so a document can surface through either conceptual similarity or an exact term match

Capabilities

Search Strategies

  • Dense vector semantic search (embeddings)
  • Sparse vector keyword search (BM25, TF-IDF)
  • Hybrid fusion with configurable weighting
  • Reciprocal Rank Fusion (RRF) combination

Retrieval Configuration

  • Configure embedding models for dense search
  • Tune BM25 parameters (k1, b values)
  • Set retrieval limits and thresholds
  • Apply metadata filtering
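
To make the roles of k1 and b concrete, here is a minimal pure-Python sketch of Okapi BM25 scoring. It assumes whitespace-tokenized documents and uses the non-negative IDF variant; the function name `bm25_scores` and the toy corpus are illustrative and not part of this skill. Real pipelines would normally rely on a library such as rank_bm25 or pinecone-text.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every tokenized document against the query with Okapi BM25."""
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs
    # Document frequency: number of documents that contain each term
    df = Counter(term for doc in corpus_tokens for term in set(doc))

    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # k1 caps the gain from repeated terms; b scales the length penalty
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Toy corpus (illustrative only)
corpus = [
    "configure the system with a yaml file".split(),
    "the system logs errors to disk".split(),
    "bananas are yellow".split(),
]
scores = bm25_scores("configure system".split(), corpus)
```

Raising k1 lets repeated terms keep adding to the score for longer before saturating, while b moves between no length normalization (b = 0) and full normalization (b = 1).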

Ranking & Reranking

  • Score normalization across search types
  • Weighted score fusion
  • Cross-encoder reranking
  • MMR (Maximal Marginal Relevance) diversity
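
Score normalization and weighted fusion can be sketched as follows. This is one possible approach, assuming min-max normalization and scores supplied as plain `{doc_id: score}` dictionaries; the helper names and the sample scores are hypothetical, and other normalizations (such as z-score) may suit some corpora better.

```python
def min_max_normalize(scores):
    """Rescale raw scores to [0, 1] so dense and sparse scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fusion(dense_scores, sparse_scores, dense_weight=0.6):
    """Blend normalized scores; a document missing from one list scores 0 there."""
    dense_n = min_max_normalize(dense_scores)
    sparse_n = min_max_normalize(sparse_scores)
    fused = {}
    for doc_id in set(dense_n) | set(sparse_n):
        fused[doc_id] = (
            dense_weight * dense_n.get(doc_id, 0.0)
            + (1 - dense_weight) * sparse_n.get(doc_id, 0.0)
        )
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores: cosine similarities vs. BM25 scores
dense = {"a": 0.82, "b": 0.79, "c": 0.40}
sparse = {"b": 12.0, "c": 7.5, "d": 3.0}
ranked = weighted_fusion(dense, sparse)
```

In this toy input, document "b" ranks first because it scores well in both lists, even though "a" has the highest dense score.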

Index Management

  • Create and update hybrid indexes
  • Batch indexing with progress tracking
  • Index optimization and maintenance
  • Multi-index federation
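
Batch indexing with progress tracking might look like the sketch below. It is store-agnostic: `upsert` is a hypothetical callable standing in for whatever write method your vector store client provides, and the batch size of 100 is an arbitrary assumption.

```python
def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def index_in_batches(docs, upsert, batch_size=100):
    """Send docs to the store batch by batch, printing progress as it goes."""
    total = len(docs)
    done = 0
    for batch in batched(docs, batch_size):
        upsert(batch)  # hypothetical: replace with your store's write call
        done += len(batch)
        print(f"indexed {done}/{total}")
    return done


# Stand-in for a real index: collect the batches in a list
stored_batches = []
indexed = index_in_batches(list(range(250)), stored_batches.append, batch_size=100)
```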

Usage

Basic Hybrid Search with LangChain

from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import Chroma
from langchain.retrievers import EnsembleRetriever
from langchain_openai import OpenAIEmbeddings

# Create documents
docs = [...]  # Your document chunks

# Dense retriever (semantic)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(docs, embeddings)
dense_retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

# Sparse retriever (BM25)
bm25_retriever = BM25Retriever.from_documents(docs)
bm25_retriever.k = 5

# Hybrid ensemble
hybrid_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, dense_retriever],
    weights=[0.4, 0.6]  # Adjust based on use case
)

# Query
results = hybrid_retriever.invoke("How do I configure the system?")

Reciprocal Rank Fusion

def reciprocal_rank_fusion(results_lists: list, k: int = 60) -> list:
    """
    Combine multiple ranked lists using RRF.
    k is a constant (typically 60) for smoothing.
    """
    fused_scores = {}

    for results in results_lists:
        for rank, doc in enumerate(results):
            doc_id = doc.metadata.get("id", str(doc.page_content[:50]))
            if doc_id not in fused_scores:
                fused_scores[doc_id] = {"doc": doc, "score": 0}
            fused_scores[doc_id]["score"] += 1 / (k + rank + 1)

    # Sort by fused score
    sorted_docs = sorted(
        fused_scores.values(),
        key=lambda x: x["score"],
        reverse=True
    )

    return [item["doc"] for item in sorted_docs]

# Use with multiple retrievers
query = "How do I configure the system?"
semantic_results = dense_retriever.invoke(query)
keyword_results = bm25_retriever.invoke(query)
hybrid_results = reciprocal_rank_fusion([semantic_results, keyword_results])
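
To see how the fused ordering comes out, here is a self-contained variant of the same formula that works on plain document IDs instead of LangChain documents. The function name `rrf_ids` and the two sample rankings are made up for illustration.

```python
def rrf_ids(ranked_id_lists, k=60):
    """Reciprocal Rank Fusion over lists of document IDs (best first)."""
    scores = {}
    for ranked_ids in ranked_id_lists:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Each list contributes 1 / (k + rank), with rank starting at 1
            scores[doc_id] = scores.get(doc_id, 0.0) + 1 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


semantic_ids = ["a", "b", "c"]  # hypothetical dense ranking
keyword_ids = ["b", "d", "a"]   # hypothetical BM25 ranking
fused = rrf_ids([semantic_ids, keyword_ids])
# "b" scores 1/62 + 1/61 and "a" scores 1/61 + 1/63, so "b" comes first
```

Because RRF only looks at ranks, it needs no score normalization, and documents that appear in both lists tend to rise above documents that appear in only one.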

Pinecone Hybrid Search

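from langchain_openai import OpenAIEmbeddings
from pinecone import Pinecone
from pinecone_text.sparse import BM25Encoder

# Initialize Pinecone (sparse-dense queries require an index using the dotproduct metric)
pc = Pinecone(api_key="your-api-key")
index = pc.Index("hybrid-index")

# Dense embedding model (use the same model that embedded the indexed documents)
embeddings = OpenAIEmbeddings()

# Prepare sparse encoder
corpus = [...]  # Your document texts (list of strings)
bm25 = BM25Encoder()
bm25.fit(corpus)  # Fit on your document corpus

def hybrid_query(query: str, alpha: float = 0.5, top_k: int = 10):
    """
    Query with hybrid search.
    alpha: weight for dense vectors (1-alpha for sparse)
    """
    # Get dense embedding
    dense_embedding = embeddings.embed_query(query)

    # Get sparse embedding (dict with "indices" and "values")
    sparse_embedding = bm25.encode_queries([query])[0]

    # Apply alpha: scale dense values by alpha and sparse values by (1 - alpha)
    scaled_dense = [v * alpha for v in dense_embedding]
    scaled_sparse = {
        "indices": sparse_embedding["indices"],
        "values": [v * (1 - alpha) for v in sparse_embedding["values"]],
    }

    # Hybrid query
    results = index.query(
        vector=scaled_dense,
        sparse_vector=scaled_sparse,
        top_k=top_k,
        include_metadata=True
    )

    return results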

Weaviate Hybrid Search

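import weaviate

# Uses the weaviate-client v3 API (weaviate.Client and client.query);
# the v4 client exposes a different, collections-based API.
client = weaviate.Client("http://localhost:8080")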

def weaviate_hybrid_search(query: str, alpha: float = 0.5, limit: int = 10):
    """
    Weaviate native hybrid search.
    alpha: 0 = pure BM25, 1 = pure vector
    """
    result = (
        client.query
        .get("Document", ["content", "title", "metadata"])
        .with_hybrid(
            query=query,
            alpha=alpha,
            properties=["content", "title"]
        )
        .with_limit(limit)
        .do()
    )

    return result["data"]["Get"]["Document"]

Task Definition

const ragHybridSearchTask = defineTask({
  name: 'rag-hybrid-search-setup',
  description: 'Configure hybrid search for RAG pipeline',

  inputs: {
    vectorStore: { type: 'string', required: true },  // 'pinecone', 'weaviate', 'chroma', etc.
    embeddingModel: { type: 'string', default: 'text-embedding-3-small' },
    bm25Params: { type: 'object', default: { k1: 1.5, b: 0.75 } },
    fusionStrategy: { type: 'string', default: 'rrf' },  // 'rrf', 'weighted', 'custom'
    denseWeight: { type: 'number', default: 0.6 },
    topK: { type: 'number', default: 10 }
  },

  outputs: {
    retrieverConfigured: { type: 'boolean' },
    indexStats: { type: 'object' },
    artifacts: { type: 'array' }
  },

  async run(inputs, taskCtx) {
    return {
      kind: 'skill',
      title: `Configure hybrid search with ${inputs.vectorStore}`,
      skill: {
        name: 'rag-hybrid-search',
        context: {
          vectorStore: inputs.vectorStore,
          embeddingModel: inputs.embeddingModel,
          bm25Params: inputs.bm25Params,
          fusionStrategy: inputs.fusionStrategy,
          denseWeight: inputs.denseWeight,
          topK: inputs.topK,
          instructions: [
            'Validate vector store connection and configuration',
            'Set up dense embedding pipeline',
            'Configure BM25/sparse encoding',
            'Implement fusion strategy',
            'Test retrieval quality with sample queries',
            'Document configuration and tuning parameters'
          ]
        }
      },
      io: {
        inputJsonPath: `tasks/${taskCtx.effectId}/input.json`,
        outputJsonPath: `tasks/${taskCtx.effectId}/result.json`
      }
    };
  }
});

Applicable Processes

  • rag-pipeline-implementation
  • advanced-rag-patterns
  • knowledge-base-qa
  • vector-database-setup

External Dependencies

  • Vector database (Pinecone, Weaviate, Chroma, Milvus, Qdrant)
  • Embedding provider (OpenAI, Cohere, Hugging Face)
  • BM25 encoder (rank_bm25, pinecone-text)

References

Related Skills

  • SK-RAG-001 rag-chunking-strategy
  • SK-RAG-004 rag-reranking
  • SK-RAG-005 rag-query-transformation
  • SK-VDB-001 through SK-VDB-005 (vector database integrations)

Related Agents

  • AG-RAG-001 rag-pipeline-architect
  • AG-RAG-003 vector-db-specialist
  • AG-RAG-004 retrieval-optimizer

Source

https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/rag-hybrid-search/SKILL.md

Overview

Hybrid search blends dense semantic vector retrieval with BM25 keyword retrieval to improve RAG recall and precision. It combines both signals through configurable fusion strategies and optional reranking to handle diverse queries. This approach supports index management and multi-index federation for scalable retrieval.

How This Skill Works

At query time, a dense retriever using embeddings and a sparse BM25 retriever run in parallel. Their results are merged via weighted fusion or rank-based methods such as Reciprocal Rank Fusion (RRF), with optional cross-encoder reranking or MMR-based diversification. The setup also covers retrieval configuration, score normalization, and index management across hybrid indexes.
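
The parallel retrieval step can be sketched with the standard library. This is an assumption about how one might run the two retrievers concurrently, not the skill's own implementation; `dense_stub` and `sparse_stub` are hypothetical stand-ins for real retrievers and simply return fixed ID lists.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve_in_parallel(query, retrievers):
    """Call each retriever on the query concurrently; keep results in retriever order."""
    with ThreadPoolExecutor(max_workers=max(1, len(retrievers))) as pool:
        futures = [pool.submit(retriever, query) for retriever in retrievers]
        return [future.result() for future in futures]


# Hypothetical stand-ins for a dense retriever and a BM25 retriever
def dense_stub(query):
    return ["doc-2", "doc-1", "doc-3"]


def sparse_stub(query):
    return ["doc-1", "doc-4"]


result_lists = retrieve_in_parallel(
    "How do I configure the system?", [dense_stub, sparse_stub]
)
```

The returned lists keep the order of the retrievers passed in, so they can be handed directly to a fusion step such as RRF or weighted fusion.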

When to Use It

  • When you need both semantic understanding and exact term matching to improve results
  • When your corpus contains synonyms or varied terminology across domains
  • When you want a tunable balance between dense and sparse signals with adjustable weights
  • When you need better recall via RRF or more diverse results via MMR
  • When using multi-index federation and metadata filtering to scope results

Quick Start

  1. Build a dense retriever with embeddings and a vector store from your documents
  2. Build a BM25 retriever from the same documents and set k (e.g., 5)
  3. Create an EnsembleRetriever with weights and run a query via hybrid_retriever.invoke(query)

Best Practices

  • Start with roughly balanced weights (e.g., 0.4 for BM25, 0.6 for dense) and adjust per domain
  • Tune BM25 parameters (k1, b) and select appropriate embedding models
  • Normalize scores before fusion to ensure fair blending across modalities
  • Apply cross-encoder reranking on the top-N results for improved quality
  • Use RRF to combine rankings when raw scores are not comparable, and MMR to reduce redundancy among the top results
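
The MMR practice above can be illustrated with a small pure-Python sketch. It assumes cosine similarity over embedding vectors and uses `lambda_mult` to trade relevance against diversity (1.0 means relevance only); the function names and the two-dimensional toy vectors are illustrative, not part of this skill.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr(query_vec, doc_vecs, top_k=3, lambda_mult=0.5):
    """Greedily pick documents that are relevant but not redundant with earlier picks."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < top_k:
        best_idx, best_score = None, float("-inf")
        for i in candidates:
            relevance = cosine(query_vec, doc_vecs[i])
            # Redundancy: highest similarity to anything already selected
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


query_vec = [1.0, 0.0]
doc_vecs = [
    [1.0, 0.0],  # most relevant
    [0.9, 0.1],  # near-duplicate of document 0
    [0.6, 0.8],  # less relevant, but different
]
diverse = mmr(query_vec, doc_vecs, top_k=2, lambda_mult=0.3)
relevance_only = mmr(query_vec, doc_vecs, top_k=2, lambda_mult=1.0)
```

With a low `lambda_mult`, the near-duplicate is skipped in favor of the more distinct document; with `lambda_mult=1.0`, the selection falls back to pure relevance order.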

Example Use Cases

  • Basic Hybrid Search with LangChain using an EnsembleRetriever for BM25 and dense retrieval
  • Reciprocal Rank Fusion example combining semantic and keyword results
  • Pinecone Hybrid Search integrating dense embeddings with BM25Encoder sparse signals
  • Configuring retrieval settings: embedding models, BM25 k1/b, and thresholds
  • Index management: creating/updating hybrid indexes and enabling multi-index federation
