haystack-pipeline
Install
npx machina-cli add skill a5c-ai/babysitter/haystack-pipeline --openclaw
Haystack Pipeline Skill
Capabilities
- Configure Haystack pipeline components
- Set up document stores and retrievers
- Implement reader/generator models
- Design custom pipeline graphs
- Configure preprocessing pipelines
- Implement evaluation pipelines
Target Processes
- rag-pipeline-implementation
- intent-classification-system
Implementation Details
Core Components
- DocumentStores: Elasticsearch, Weaviate, FAISS, etc.
- Retrievers: BM25, Dense, Hybrid
- Readers/Generators: Extractive and generative QA
- Preprocessors: Document cleaning and splitting
Pipeline Types
- Retrieval pipelines
- RAG pipelines
- Evaluation pipelines
- Indexing pipelines
Configuration Options
- Component selection
- Pipeline graph design
- Document store backend
- Model selection
- Preprocessing settings
Best Practices
- Modular pipeline design
- Proper preprocessing
- Evaluation integration
- Component versioning
Dependencies
- haystack-ai
- farm-haystack (legacy)
Source
https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/haystack-pipeline/SKILL.md
Overview
This skill configures Haystack NLP pipelines for document processing and QA. It covers setting up document stores and retrievers, integrating readers or generators, and designing custom pipeline graphs with preprocessing and evaluation steps.
How This Skill Works
You select core components (DocumentStore, Retriever, Reader/Generator), design a pipeline graph, and configure preprocessing. Then you pick a backend (Elasticsearch, Weaviate, FAISS, etc.) and appropriate models for retrieval and QA, building retrieval, RAG, indexing, or evaluation pipelines as needed.
When to Use It
- When building a retrieval-augmented QA system over a knowledge base
- When indexing a large document corpus for fast search
- When evaluating QA models and retrievers with an integrated pipeline
- When designing custom multi-stage pipelines with modular components
- When switching document stores or backends to optimize performance or scale
Quick Start
- Step 1: Choose a DocumentStore backend (e.g., Elasticsearch, Weaviate, or FAISS) and select a retriever and QA model
- Step 2: Design a pipeline graph (retrieve -> reader/generator) and configure preprocessing (cleaning and splitting)
- Step 3: Run the pipeline, review results, and iterate on components and settings
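Step 2's preprocessing (cleaning and splitting) can be illustrated with a pure-Python sketch; this mirrors the idea behind Haystack's cleaning and splitting components but is not their API, just the concept:

```python
def clean(text: str) -> str:
    """Collapse runs of whitespace and strip the edges."""
    return " ".join(text.split())

def split(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split cleaned text into word chunks of `size` words with
    `overlap` words of context shared between neighbors.
    Assumes size > overlap."""
    words = clean(text).split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunk = words[start:start + size]
        if chunk:
            chunks.append(" ".join(chunk))
        if start + size >= len(words):
            break
    return chunks
```

Overlapping chunks help a retriever surface passages whose answer spans a chunk boundary, at the cost of some index redundancy.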
Best Practices
- Keep pipelines modular and swap components independently
- Apply proper preprocessing: cleaning and splitting documents
- Integrate evaluation steps to monitor QA accuracy over time
- Version-control component configurations and models
- Test pipelines locally before production deployment
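The "integrate evaluation" practice often starts with a retrieval metric such as recall@k. A stdlib sketch (hypothetical helper, not a Haystack API):

```python
def recall_at_k(retrieved_ids: list[str], relevant_ids: list[str], k: int = 5) -> float:
    """Fraction of relevant documents that appear in the top-k retrieved results."""
    top = set(retrieved_ids[:k])
    hits = sum(1 for doc_id in relevant_ids if doc_id in top)
    return hits / len(relevant_ids)
```

Tracking this metric across component swaps (e.g., BM25 vs. a dense retriever) turns "iterate on components" from guesswork into a measurable comparison.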
Example Use Cases
- Building a RAG QA system over a company's knowledge base using a BM25 retriever and a generative reader
- Setting up a FAISS-based indexing pipeline for fast similarity search over a large corpus
- Running extractive QA with BM25 or dense retrievers over product manuals
- Combining a retriever with a generator for open-ended, generative answers
- Comparing QA models and retrievers in an evaluation pipeline to track improvements
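The BM25 ranking referenced in these use cases can be made concrete with a compact stdlib implementation of the Okapi BM25 formula. This is a simplified sketch with naive whitespace tokenization, roughly the scheme BM25 retrievers implement, not Haystack's actual code:

```python
import math
from collections import Counter

def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 score of each document against the query.
    Assumes a non-empty document list; tokenization is naive whitespace splitting."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(tokenized)

    def idf(term: str) -> float:
        df = sum(1 for toks in tokenized if term in toks)
        return math.log((n - df + 0.5) / (df + 0.5) + 1)

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            f = tf[term]
            score += idf(term) * (f * (k1 + 1)) / (f + k1 * (1 - b + b * len(toks) / avgdl))
        scores.append(score)
    return scores
```

Documents that contain the query terms, especially rare ones, score higher; `k1` damps term-frequency saturation and `b` controls length normalization.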