
Vector Databases for Semantic Search

Overview

Vector databases store and query high-dimensional vector embeddings, enabling semantic search, recommendation systems, and retrieval-augmented generation (RAG) applications.

Key Concepts

Embeddings

Vector representations of text/images that capture semantic meaning:

  • Text: 384-1536 dimensions (OpenAI, Sentence Transformers)
  • Images: 512-2048 dimensions (CLIP, ResNet)
  • Dense vectors vs sparse vectors
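To make the dense-vs-sparse distinction concrete, here is an illustrative sketch (the index positions and weights are made up):

```python
# Dense: every dimension carries a value; typical output of an embedding model
dense = [0.12, -0.53, 0.09, 0.44]

# Sparse: mostly zeros, so only nonzero positions are stored;
# common for keyword-style signals (e.g., BM25 or SPLADE weights)
sparse = {"indices": [7, 102, 4093], "values": [0.8, 0.3, 0.5]}

def to_dense(sparse_vec, dim):
    # A sparse vector reconstructs to a mostly-zero dense vector
    out = [0.0] * dim
    for i, v in zip(sparse_vec["indices"], sparse_vec["values"]):
        out[i] = v
    return out
```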

Similarity Metrics

  • Cosine Similarity: Angle between vectors (range: -1 to 1)
  • Euclidean Distance: Straight-line distance
  • Dot Product: Unnormalized similarity; equals cosine similarity when both vectors are unit-length
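All three metrics can be computed directly with NumPy, which is useful for sanity-checking what a database returns:

```python
import numpy as np

def cosine_similarity(a, b):
    # Angle-based similarity in [-1, 1]; invariant to vector length
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    # Straight-line distance; 0 means identical vectors
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

def dot_product(a, b):
    # Unnormalized; matches cosine similarity for unit-length vectors
    return float(np.dot(a, b))
```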

Key Operations

  1. Index: Store vectors with metadata
  2. Search: Find nearest neighbors
  3. Delete: Remove vectors
  4. Update: Modify vectors or metadata
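The four operations can be sketched with a toy in-memory store. This is a brute-force illustration only; real vector databases use approximate indexes such as HNSW to avoid scanning every vector:

```python
import numpy as np

class TinyVectorStore:
    """Toy in-memory store illustrating index, search, delete, and update."""

    def __init__(self):
        self.vectors = {}   # id -> np.ndarray
        self.metadata = {}  # id -> dict

    def index(self, vec_id, vector, meta=None):
        self.vectors[vec_id] = np.asarray(vector, dtype=float)
        self.metadata[vec_id] = meta or {}

    def search(self, query, top_k=3):
        # Brute-force cosine similarity over every stored vector
        q = np.asarray(query, dtype=float)
        scored = [
            (vid, float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v))))
            for vid, v in self.vectors.items()
        ]
        return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]

    def delete(self, vec_id):
        self.vectors.pop(vec_id, None)
        self.metadata.pop(vec_id, None)

    def update(self, vec_id, vector=None, meta=None):
        if vector is not None:
            self.vectors[vec_id] = np.asarray(vector, dtype=float)
        if meta is not None:
            self.metadata[vec_id].update(meta)
```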

Pinecone

Setup

pip install pinecone
from pinecone import Pinecone, ServerlessSpec

# Initialize (the v3+ SDK client replaces the deprecated pinecone.init)
pc = Pinecone(api_key="your-api-key")

# Create a serverless index
pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)

# Connect
index = pc.Index("my-index")

Basic Operations

# Upsert vectors
index.upsert(
    vectors=[
        ("vec1", [0.1, 0.2, ...], {"category": "tech"}),
        ("vec2", [0.3, 0.4, ...], {"category": "news"})
    ]
)

# Query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=10,
    include_metadata=True
)

# Delete
index.delete(ids=["vec1", "vec2"])

Filtering

results = index.query(
    vector=query_vector,
    filter={"category": {"$eq": "tech"}},
    top_k=10
)

Weaviate

Setup

pip install weaviate-client
import weaviate

# Connect
client = weaviate.Client("http://localhost:8080")

# Create class
client.schema.create_class({
    "class": "Document",
    "properties": [
        {"name": "text", "dataType": ["text"]},
        {"name": "category", "dataType": ["string"]}
    ],
    "vectorizer": "text2vec-openai"
})

Basic Operations

# Add data object
client.data_object.create(
    class_name="Document",
    data_object={
        "text": "Sample text",
        "category": "tech"
    }
)

# Query
results = client.query.get(
    "Document",
    ["text", "category"]
).with_near_vector({
    "vector": query_vector,
    "certainty": 0.7
}).with_limit(10).do()

# Delete
client.data_object.delete(
    class_name="Document",
    uuid=obj_id
)

Hybrid Search

results = client.query.get(
    "Document",
    ["text"]
).with_hybrid(
    query="search terms",
    vector=query_vector,
    alpha=0.5  # 0 = keyword, 1 = vector
).with_limit(10).do()

Qdrant

Setup

pip install qdrant-client
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

# Connect
client = QdrantClient("localhost", port=6333)

# Create collection
client.create_collection(
    collection_name="my_collection",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
)

Basic Operations

# Upsert points
client.upsert(
    collection_name="my_collection",
    points=[
        PointStruct(id=1, vector=[0.1, 0.2, ...], payload={"text": "sample"}),
        PointStruct(id=2, vector=[0.3, 0.4, ...], payload={"text": "example"})
    ]
)

# Search
results = client.search(
    collection_name="my_collection",
    query_vector=[0.1, 0.2, ...],
    limit=10,
    with_payload=True
)

# Delete
client.delete(
    collection_name="my_collection",
    points_selector=[1, 2]
)

Filtering

from qdrant_client.models import Filter, FieldCondition, MatchValue

results = client.search(
    collection_name="my_collection",
    query_vector=query_vector,
    query_filter=Filter(
        must=[FieldCondition(key="category", match=MatchValue(value="tech"))]
    )
)

Embedding Generation

OpenAI Embeddings

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(openai_api_key="your-key")
vector = embeddings.embed_query("Your text here")

Sentence Transformers

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
vectors = model.encode(["text1", "text2"])

Hugging Face

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')
model = AutoModel.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')

# Generate embeddings with mask-aware mean pooling (a plain mean over
# last_hidden_state would let padding tokens skew the result)
import torch

texts = ["text1", "text2"]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)

Best Practices

1. Embedding Strategy

  • Domain-specific: Use models fine-tuned on your domain
  • Multilingual: Use multilingual models for international content
  • Batch processing: Embed in batches for efficiency
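Batching can be as simple as slicing the input list before each model call; `model.encode` below stands in for any embedding call:

```python
def batched(items, batch_size=64):
    # Yield successive slices so the embedding model sees full batches
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

texts = [f"doc {i}" for i in range(150)]
batches = list(batched(texts, batch_size=64))
# vectors = [v for batch in batches for v in model.encode(batch)]
```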

2. Chunking Strategies

# Fixed size
chunks = [text[i:i+1000] for i in range(0, len(text), 1000)]

# Semantic-aware splitting with overlap to preserve context across boundaries
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
chunks = splitter.split_text(text)

3. Metadata Design

  • Store relevant filtering fields
  • Include timestamps, sources, categories
  • Keep metadata lightweight
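An illustrative payload following these guidelines (the field names are hypothetical, not required by any particular database):

```python
# Lightweight, filterable metadata stored alongside each vector
metadata = {
    "source": "docs/getting-started.md",  # provenance, useful for citations
    "category": "tech",                   # categorical filter field
    "created_at": "2024-01-15",           # ISO 8601 string filters portably
    "chunk_index": 3,                     # position within the source document
}
# Avoid stuffing full document text here; store an ID and fetch the text elsewhere
```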

4. Index Optimization

  • Pinecone: Choose appropriate pod type (s1 vs p1)
  • Weaviate: Use HNSW for fast approximate search
  • Qdrant: Tune quantization for memory efficiency

5. Query Optimization

  • Hybrid search (vector + keyword)
  • Re-ranking of the top candidates
  • Filtering before vector search
  • Caching frequent queries
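Of these, caching is the cheapest to add. A sketch using `functools.lru_cache`, with a stand-in for the real embedding call:

```python
from functools import lru_cache

calls = {"n": 0}

def embed_query(text):
    # Stand-in for a real embedding call (OpenAI, Sentence Transformers, ...)
    calls["n"] += 1
    return [float(len(text)), float(text.count(" "))]

@lru_cache(maxsize=1024)
def cached_embed(text):
    # lru_cache requires a hashable return value, hence the tuple
    return tuple(embed_query(text))

cached_embed("vector databases")
cached_embed("vector databases")  # second call is served from the cache
```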

RAG Integration

End-to-End Pipeline

from langchain_community.vectorstores import Pinecone
from langchain_openai import OpenAIEmbeddings
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

# Vector store
vectorstore = Pinecone.from_documents(
    documents=documents,
    embedding=OpenAIEmbeddings(),
    index_name="rag-index"
)

# RAG chain
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4"),
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3})
)

Advanced RAG

# Multi-query retrieval
from langchain.retrievers import MultiQueryRetriever

retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),
    llm=ChatOpenAI()
)

# Contextual compression
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

compressor = LLMChainExtractor.from_llm(ChatOpenAI())
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=vectorstore.as_retriever()
)

Performance Tuning

Pinecone

  • Use namespaces for multi-tenancy
  • Batch upsert (max 100 vectors per batch)
  • Choose pod type based on needs (s1 for storage, p1 for performance)

Weaviate

  • Tune HNSW parameters (ef_construction, M, ef)
  • Enable replication for high availability
  • Use sharding for large datasets

Qdrant

  • Enable quantization for memory savings
  • Use optimizers for index building
  • Tune search parameters (hnsw_ef, payload_index)

Comparison

Feature         Pinecone     Weaviate       Qdrant
Deployment      Managed      Self-hosted    Both
Open Source     No           Yes            Yes
Hybrid Search   Yes          Yes            Yes
Filtering       Yes          Yes            Yes
Scalability     High         High           High
Setup           Easiest      Medium         Medium

Common Patterns

Semantic Search

def semantic_search(query, top_k=5):
    query_vector = embeddings.embed_query(query)
    results = index.query(vector=query_vector, top_k=top_k)
    return results

Recommendation System

def find_similar_items(item_id, top_k=10):
    item_vector = get_item_vector(item_id)
    results = index.query(vector=item_vector, top_k=top_k)
    return results

Deduplication

def find_duplicates(text, threshold=0.95):
    vector = embeddings.embed_query(text)
    results = index.query(vector=vector, top_k=10)
    # Pinecone returns hits under .matches, each carrying a .score
    return [m for m in results.matches if m.score > threshold]

Integration

  • LangChain: All vector stores supported
  • LlamaIndex: Vector store integrations
  • Haystack: Document stores
  • Embedding Models: OpenAI, Cohere, Sentence Transformers

Source

https://github.com/muhammederem/chief/blob/main/.claude/skills/ml-ai/vector_dbs/SKILL.md

Overview

Vector databases store and query high-dimensional embeddings, enabling semantic search, recommendations, and RAG applications. They support text and image embeddings and attach metadata to each vector, powering precise retrieval and personalized experiences.

How This Skill Works

Embeddings convert content into dense high-dimensional vectors that capture meaning. A vector DB indexes these vectors with metadata and uses similarity metrics such as cosine similarity, Euclidean distance, or dot product to find nearest neighbors during a search. Core operations include upsert (index), search, delete, and update to keep results fresh.

When to Use It

  • Implement semantic search over documents, FAQs, or knowledge bases.
  • Build personalized recommendations using vector similarity.
  • Power Retrieval-Augmented Generation (RAG) with LLMs by feeding relevant context.
  • Perform multimodal search that combines text and image embeddings.
  • Enable hybrid search that blends keyword filters with vector proximity.

Quick Start

  1. Install and connect to a vector DB client (Pinecone, Weaviate, or Qdrant) and set up your environment.
  2. Generate embeddings for your data (e.g., OpenAI embeddings) and prepare metadata.
  3. Create an index/collection with the correct vector size, upsert vectors with metadata, then run a nearest-neighbor query.

Best Practices

  • Choose embeddings that suit your data and language.
  • Align the dimension and distance metric with your similarity goal.
  • Store metadata to enable filtered or hybrid queries.
  • Batch upserts and monitor indexing performance and cost.
  • Validate results with end-to-end prompts and user feedback.

Example Use Cases

  • Semantic search over enterprise docs using Pinecone with 1536-dim embeddings and cosine metric.
  • Product recommendations driven by vector similarity in a recommender system.
  • RAG pipelines using Weaviate with text2vec-openai vectorization.
  • Image similarity search using Qdrant with 1536-dim vectors.
  • Hybrid search combining keyword queries with vector proximity in a Weaviate setup.
