dspy-haystack-integration

Install:

npx machina-cli add skill OmidZamani/dspy-skills/dspy-haystack-integration --openclaw

DSPy + Haystack Integration

Goal

Use DSPy's optimization capabilities to automatically improve prompts in Haystack pipelines.

When to Use

  • You have existing Haystack pipelines
  • Manual prompt tuning is tedious
  • Need data-driven prompt optimization
  • Want to combine Haystack components with DSPy optimization

Inputs

Input               Type                  Description
haystack_pipeline   Pipeline              Existing Haystack pipeline
trainset            list[dspy.Example]    Training examples
metric              callable              Evaluation function

Outputs

Output               Type        Description
optimized_prompt     str         DSPy-optimized prompt
optimized_pipeline   Pipeline    Updated Haystack pipeline

Workflow

Phase 1: Build Initial Haystack Pipeline

from haystack import Pipeline
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

# Set up document store (documents: a list of Haystack Document objects
# prepared beforehand)
doc_store = InMemoryDocumentStore()
doc_store.write_documents(documents)

# Initial generic prompt (Jinja template; the retriever supplies `documents`)
initial_prompt = """
Answer the question using the context below.

Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}

Question: {{question}}
Answer:
"""

# Build pipeline
pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=doc_store))
pipeline.add_component("prompt_builder", PromptBuilder(template=initial_prompt))
pipeline.add_component("generator", OpenAIGenerator(model="gpt-4o-mini"))

pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "generator.prompt")

Phase 2: Create DSPy RAG Module

import dspy

class HaystackRAG(dspy.Module):
    """DSPy module wrapping Haystack retriever."""
    
    def __init__(self, retriever, k=3):
        super().__init__()
        self.retriever = retriever
        self.k = k
        self.generate = dspy.ChainOfThought("context, question -> answer")
    
    def forward(self, question):
        # Use Haystack retriever
        results = self.retriever.run(query=question, top_k=self.k)
        context = [doc.content for doc in results['documents']]
        
        # Use DSPy for generation
        pred = self.generate(context=context, question=question)
        return dspy.Prediction(context=context, answer=pred.answer)

Phase 3: Define Custom Metric

from haystack.components.evaluators import SASEvaluator

# Haystack semantic answer similarity (SAS) evaluator
sas_evaluator = SASEvaluator(model="sentence-transformers/all-MiniLM-L6-v2")
sas_evaluator.warm_up()  # load the embedding model before the first run

def mixed_metric(example, pred, trace=None):
    """Combine semantic accuracy with conciseness."""
    
    # Semantic similarity (Haystack SAS)
    sas_result = sas_evaluator.run(
        ground_truth_answers=[example.answer],
        predicted_answers=[pred.answer]
    )
    semantic_score = sas_result['score']
    
    # Conciseness penalty
    word_count = len(pred.answer.split())
    conciseness = 1.0 if word_count <= 20 else max(0, 1 - (word_count - 20) / 50)
    
    return 0.7 * semantic_score + 0.3 * conciseness
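
To see how the conciseness term behaves, here is the penalty factored out as a stand-alone function (same formula as above: full credit up to 20 words, then linear decay reaching zero at 70 words):

```python
def conciseness(word_count):
    # Full credit up to 20 words, then linear decay reaching 0 at 70 words.
    return 1.0 if word_count <= 20 else max(0, 1 - (word_count - 20) / 50)

# conciseness(10) == 1.0, conciseness(45) == 0.5, conciseness(70) == 0.0
```

Tune the 20-word threshold and 50-word decay window to your domain; legal or academic answers may warrant a longer budget.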

Phase 4: Optimize with DSPy

from dspy.teleprompt import BootstrapFewShot

lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)

# Create DSPy module with Haystack retriever
rag_module = HaystackRAG(retriever=pipeline.get_component("retriever"))

# Optimize
optimizer = BootstrapFewShot(
    metric=mixed_metric,
    max_bootstrapped_demos=4,
    max_labeled_demos=4
)

compiled = optimizer.compile(rag_module, trainset=trainset)

Phase 5: Extract and Apply Optimized Prompt

After optimization, extract the optimized prompt and apply it to your Haystack pipeline.

See Prompt Extraction Guide for detailed steps on:

  • Extracting prompts from compiled DSPy modules
  • Mapping DSPy demos to Haystack templates
  • Building optimized Haystack pipelines
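
As a minimal sketch of the demo-to-template mapping, bootstrapped demos can be rendered as in-prompt worked examples ahead of the base template. The helper name and the demo dict fields below are illustrative, not part of the skill's API; with real DSPy modules the demos live on the compiled predictor, and the exact attribute path varies by DSPy version.

```python
# Hedged sketch: render bootstrapped few-shot demos into a Haystack-style
# prompt template. `demos` is assumed to be a list of dicts with
# "question"/"answer" keys extracted from the compiled DSPy module.
def demos_to_template(demos, base_template):
    """Prepend each demo as an in-prompt worked example."""
    shots = [
        f"Question: {d['question']}\nAnswer: {d['answer']}"
        for d in demos
    ]
    return "\n\n".join(shots + [base_template])

demos = [
    {"question": "What is DSPy?", "answer": "A framework for programming LMs."},
]
base = "Context: {{context}}\nQuestion: {{question}}\nAnswer:"
optimized_template = demos_to_template(demos, base)
```

The resulting string can be passed straight to PromptBuilder(template=...) when rebuilding the Haystack pipeline.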

Production Example

For a complete production-ready implementation, see HaystackDSPyOptimizer.

This class provides:

  • Wrapper for Haystack retrievers in DSPy modules
  • Automatic optimization with BootstrapFewShot
  • Prompt extraction and Haystack pipeline rebuilding
  • Complete usage example with document store setup

Best Practices

  1. Match retrievers - Use the same retriever instance in the DSPy module as in the Haystack pipeline
  2. Custom metrics - Combine Haystack evaluators with DSPy optimization
  3. Prompt extraction - Carefully map DSPy demos to the Haystack template format
  4. Test both - Validate both the DSPy module and the final Haystack pipeline

Limitations

  • Prompt template conversion can be tricky
  • Some Haystack features don't map directly to DSPy
  • Requires maintaining two codebases initially
  • Complex pipelines may need custom integration

Source

git clone https://github.com/OmidZamani/dspy-skills

The skill lives at skills/dspy-haystack-integration/SKILL.md in that repository.

Overview

This skill integrates DSPy with Haystack to automatically improve prompts and optimize Haystack pipelines. It takes an existing Haystack pipeline, a training set of DSPy examples, and a scoring metric to produce an optimized prompt and an updated pipeline, enabling data-driven, concise prompts within RAG workflows.

How This Skill Works

Start with an existing Haystack pipeline and define a training set and metric. Build a DSPy module that wraps the Haystack retriever and generates answers, then define a custom metric that combines semantic quality with conciseness. Run DSPy optimization to produce an optimized_prompt and an updated optimized_pipeline that blends Haystack components with DSPy-generated prompts.

When to Use It

  • You have existing Haystack pipelines and want to improve prompts end-to-end.
  • Manual prompt tuning is tedious, error-prone, or inconsistent.
  • You want data-driven prompt optimization guided by a trainset and a defined metric.
  • You need to blend Haystack components (retriever, generator) with a DSPy-optimized prompt.
  • You want automatic prompt improvement integrated into an existing Haystack pipeline.

Quick Start

  1. Prepare haystack_pipeline, a trainset of dspy.Example, and a scoring metric.
  2. Build a DSPy module wrapping the Haystack retriever and a generator; define a mixed metric.
  3. Run the DSPy optimization to obtain optimized_prompt and an updated optimized_pipeline.

Best Practices

  • Use a representative trainset of dspy.Example that covers diverse questions and contexts.
  • Define a clear metric that balances semantic relevance with conciseness.
  • Start with a sensible initial_prompt to bootstrap the optimization process.
  • Iterate in small increments and monitor for overfitting to the trainset.
  • Thoroughly validate the optimized_pipeline on a held-out dataset before deployment.
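
The held-out validation step can be sketched framework-free; the evaluate helper and the stand-in objects below are illustrative (in practice dspy.Evaluate serves the same purpose against the compiled module):

```python
from types import SimpleNamespace

# Hedged sketch of held-out validation: average a metric over a devset.
# `module`, `devset`, and `metric` stand in for the compiled DSPy module,
# a held-out list of examples, and the mixed metric from this skill.
def evaluate(module, devset, metric):
    scores = [metric(ex, module(question=ex.question)) for ex in devset]
    return sum(scores) / max(len(scores), 1)

# Tiny illustration with stand-in objects (not real DSPy types):
def fake_module(question):
    return SimpleNamespace(answer="Paris")

devset = [SimpleNamespace(question="Capital of France?", answer="Paris")]

def exact_match(ex, pred, trace=None):
    return float(ex.answer == pred.answer)

score = evaluate(fake_module, devset, exact_match)  # → 1.0
```

Run the same evaluation on both the DSPy module and the rebuilt Haystack pipeline: a large gap between the two scores usually means the prompt extraction lost information.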

Example Use Cases

  • Haystack QA pipeline for customer support with DSPy-optimized prompts.
  • Academic literature QA pipeline using data-driven prompt refinement.
  • E-commerce product FAQ retrieval with concise, accurate answers.
  • Legal document search improved by optimized prompts in RAG setup.
  • Technical support chatbot leveraging DSPy to tune Haystack prompts.
