
airweave-setup

npx machina-cli add skill airweave-ai/claude-plugin/airweave-setup --openclaw

Airweave Setup & Integration

Airweave is an open-source platform that makes any app searchable for AI agents. It connects to apps, productivity tools, databases, or document stores and transforms their contents into searchable knowledge bases.

Quick Start

Option 1: Airweave Cloud (Recommended)

  1. Sign up at https://app.airweave.ai
  2. Get your API key from the dashboard
  3. Install the SDK:
pip install airweave-sdk   # Python
npm install @airweave/sdk  # TypeScript

Option 2: Self-Hosted

git clone https://github.com/airweave-ai/airweave.git
cd airweave
chmod +x start.sh
./start.sh

Access the dashboard at http://localhost:8080

Core Workflow

The typical Airweave workflow follows these steps:

1. Create a Collection

A collection groups multiple data sources into a single searchable endpoint.

Python:

from airweave import AirweaveSDK

client = AirweaveSDK(
    api_key="YOUR_API_KEY",
    base_url="https://api.airweave.ai"  # or http://localhost:8001 for self-hosted
)

collection = client.collections.create(name="My Knowledge Base")
print(f"Collection ID: {collection.readable_id}")

TypeScript:

import { AirweaveSDKClient } from "@airweave/sdk";

const client = new AirweaveSDKClient({ apiKey: "YOUR_API_KEY" });
const collection = await client.collections.create({ name: "My Knowledge Base" });
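
Both examples above hardcode the API key. A minimal sketch of reading it from the environment instead is shown below. The variable names AIRWEAVE_API_KEY and AIRWEAVE_BASE_URL are borrowed from the MCP configuration later in this document; whether the SDK itself reads them is not stated, so this helper (a hypothetical `load_config`) passes them explicitly.

```python
import os
from dataclasses import dataclass

# Cloud endpoint from the example above; self-hosted setups would override it.
DEFAULT_BASE_URL = "https://api.airweave.ai"


@dataclass(frozen=True)
class AirweaveConfig:
    api_key: str
    base_url: str


def load_config(env=None) -> AirweaveConfig:
    """Read Airweave settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("AIRWEAVE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("AIRWEAVE_API_KEY is not set")
    # Strip a trailing slash so the URL has one canonical form.
    base_url = env.get("AIRWEAVE_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return AirweaveConfig(api_key=api_key, base_url=base_url)
```

The result could then be passed to the constructor shown above, for example `AirweaveSDK(api_key=config.api_key, base_url=config.base_url)`.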

2. Add Source Connections

Connect data sources to your collection. More than 40 sources are supported, including:

  • Productivity: Notion, Google Drive/Docs/Slides, Dropbox, OneDrive, SharePoint, Box, Airtable
  • Communication: Slack, Gmail, Outlook, Teams, Google Calendar
  • Project Management: Jira, Linear, Asana, Trello, Monday, ClickUp, Todoist
  • Development: GitHub, GitLab, Bitbucket, Confluence
  • CRM & Sales: Salesforce, HubSpot, Attio, Zendesk, Pipedrive, Shopify
  • Data: Stripe, PostgreSQL

See SDK-REFERENCE.md for the complete list of source short names.

Python (API Key sources like Stripe, Linear):

source = client.source_connections.create(
    name="My Stripe Connection",
    short_name="stripe",
    readable_collection_id=collection.readable_id,
    authentication={
        "credentials": {"api_key": "sk_live_your_stripe_key"}
    }
)
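
If several API-key sources are connected, the arguments can be assembled in one place. The sketch below only builds a dictionary with the same shape as the Stripe example above; `api_key_source_kwargs` and the STRIPE_API_KEY variable name are hypothetical, and it assumes other API-key sources accept the same `credentials` layout, which should be checked per source.

```python
import os


def api_key_source_kwargs(name, short_name, collection_id, api_key):
    """Build keyword arguments shaped like the Stripe example above."""
    if not api_key:
        raise ValueError(f"missing API key for source '{short_name}'")
    return {
        "name": name,
        "short_name": short_name,
        "readable_collection_id": collection_id,
        "authentication": {"credentials": {"api_key": api_key}},
    }


# STRIPE_API_KEY is a hypothetical variable name chosen for this sketch.
kwargs = api_key_source_kwargs(
    name="My Stripe Connection",
    short_name="stripe",
    collection_id="my-knowledge-base",  # placeholder readable ID
    api_key=os.environ.get("STRIPE_API_KEY", "sk_test_placeholder"),
)
# With a real client: client.source_connections.create(**kwargs)
```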

OAuth sources (Slack, Google, Microsoft, etc.):

Most sources authenticate with OAuth. Connect these in the Airweave UI at https://app.airweave.ai, which handles the OAuth flow automatically.

3. Search Your Data

Once synced, search across all connected sources with a single query:

Python:

results = client.collections.search(
    readable_id=collection.readable_id,
    query="customer feedback about pricing"
)

for result in results.results:
    print(f"Source: {result['payload']['source_name']}")
    print(f"Content: {result['payload']['md_content'][:200]}...")
    print(f"Score: {result['score']}")
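
The loop above raises KeyError if a result lacks one of the payload fields. A more defensive sketch follows, assuming each result is a dictionary with the `payload`, `source_name`, `md_content`, and `score` keys used above; `summarize_results` is a hypothetical helper, not part of the SDK.

```python
def summarize_results(results, preview_chars=200):
    """Turn raw result dicts into short printable rows, tolerating missing fields."""
    rows = []
    for result in results:
        payload = result.get("payload") or {}
        content = payload.get("md_content") or ""
        preview = content[:preview_chars]
        if len(content) > preview_chars:
            preview += "..."  # mark that the content was truncated
        rows.append({
            "source": payload.get("source_name", "unknown"),
            "score": result.get("score"),
            "preview": preview,
        })
    return rows
```

With a real response this might be called as `summarize_results(results.results)`.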

Advanced Search Features

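Beyond a plain query, search accepts parameters that control the retrieval mode, pagination, recency weighting, reranking, query expansion, and response format: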

Search Parameters

Parameter                Type                     Description
query                    string                   Natural language search query
search_type              "semantic" | "hybrid"    Semantic (default) or hybrid search
limit                    number                   Max results (default: 100)
offset                   number                   Skip results for pagination
recency_bias             0-1                      Prioritize recent results (0=none, 1=most recent)
enable_reranking         boolean                  AI reranking for better relevance
enable_query_expansion   boolean                  Expand query with variations
response_type            "raw" | "completion"     Raw results or AI-generated answer
top_k                    number                   Internal retrieval count before reranking
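
Checking arguments against the table before sending a request can catch typos early. The following is a sketch only: `validate_search_params` is a hypothetical helper, and the rule that limit, offset, and top_k must be non-negative integers is an assumption rather than documented behavior.

```python
def validate_search_params(**params):
    """Check search keyword arguments against the values listed in the table."""
    errors = []
    if not str(params.get("query", "")).strip():
        errors.append("query must be a non-empty string")

    search_type = params.get("search_type")
    if search_type is not None and search_type not in ("semantic", "hybrid"):
        errors.append("search_type must be 'semantic' or 'hybrid'")

    response_type = params.get("response_type")
    if response_type is not None and response_type not in ("raw", "completion"):
        errors.append("response_type must be 'raw' or 'completion'")

    bias = params.get("recency_bias")
    if bias is not None and (not isinstance(bias, (int, float)) or not 0 <= bias <= 1):
        errors.append("recency_bias must be between 0 and 1")

    # Assumption: counts and offsets are non-negative integers.
    for name in ("limit", "offset", "top_k"):
        value = params.get(name)
        if value is not None and (not isinstance(value, int) or value < 0):
            errors.append(f"{name} must be a non-negative integer")

    if errors:
        raise ValueError("; ".join(errors))
    return params
```

The validated dictionary could then be unpacked into a search call, for example `client.collections.search(readable_id=..., **params)`.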

Example: Advanced Search

results = client.collections.search(
    readable_id=collection.readable_id,
    query="technical documentation",
    search_type="hybrid",
    enable_query_expansion=True,
    enable_reranking=True,
    recency_bias=0.5,
    top_k=50,
    limit=20
)
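
The limit and offset parameters can be combined to page through a large result set. The sketch below assumes a hypothetical `search_fn` wrapper that accepts `query`, `limit`, and `offset` and returns a list of results; a stand-in function is used here so the example runs without network access.

```python
def iter_pages(search_fn, query, page_size=50, max_pages=20):
    """Yield result lists page by page using limit/offset."""
    for page in range(max_pages):
        results = search_fn(query=query, limit=page_size, offset=page * page_size)
        if not results:
            return
        yield results
        if len(results) < page_size:
            return  # a short page means nothing is left


# Stand-in for a real search call (hypothetical data).
FAKE_INDEX = [f"doc-{i}" for i in range(120)]


def fake_search(query, limit, offset):
    return FAKE_INDEX[offset:offset + limit]


pages = list(iter_pages(fake_search, "technical documentation", page_size=50))
print([len(p) for p in pages])  # prints [50, 50, 20]
```

The `max_pages` cap is a safety limit so that a misbehaving backend cannot cause an unbounded loop.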

Example: AI-Generated Answer

answer = client.collections.search(
    readable_id=collection.readable_id,
    query="What are our customer refund policies?",
    response_type="completion",
    enable_reranking=True
)
# Returns a synthesized answer instead of raw results

MCP Integration for AI Agents

Airweave exposes search via MCP (Model Context Protocol) for seamless AI agent integration.

Setup for Claude Desktop / Cursor

Add the following to your MCP configuration file (for Cursor, ~/.cursor/mcp.json):

{
  "mcpServers": {
    "airweave-search": {
      "command": "npx",
      "args": ["airweave-mcp-search"],
      "env": {
        "AIRWEAVE_API_KEY": "your-api-key",
        "AIRWEAVE_COLLECTION": "your-collection-id",
        "AIRWEAVE_BASE_URL": "https://api.airweave.ai"
      }
    }
  }
}
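
If the configuration file already lists other MCP servers, the Airweave entry has to be merged in rather than pasted over them. The sketch below shows one way to do that on an in-memory dictionary; `add_airweave_server` and the `other-server` entry are hypothetical, and reading or writing the actual file is left out.

```python
import json


def add_airweave_server(config, api_key, collection_id,
                        base_url="https://api.airweave.ai"):
    """Return a copy of an MCP config dict with the airweave-search entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # copy so the input is untouched
    servers["airweave-search"] = {
        "command": "npx",
        "args": ["airweave-mcp-search"],
        "env": {
            "AIRWEAVE_API_KEY": api_key,
            "AIRWEAVE_COLLECTION": collection_id,
            "AIRWEAVE_BASE_URL": base_url,
        },
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_airweave_server(existing, "your-api-key", "your-collection-id")
print(json.dumps(updated, indent=2))
```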

Hosted MCP Server

For cloud-based AI platforms like OpenAI Agent Builder:

  • URL: https://mcp.airweave.ai
  • Uses Streamable HTTP transport (MCP 2025-03-26)

See MCP-SETUP.md for detailed configuration.

Common Patterns

Pattern 1: Search with Source Filtering

from airweave import SearchRequest, Filter, FieldCondition, MatchAny

search_request = SearchRequest(
    query="project updates",
    filter=Filter(
        must=[
            FieldCondition(
                key="source_name",
                match=MatchAny(any=["Slack", "GitHub"])
            )
        ]
    )
)

results = client.collections.search_advanced(
    readable_id=collection.readable_id,
    search_request=search_request
)

Pattern 2: Recent Documents First

results = client.collections.search(
    readable_id=collection.readable_id,
    query="critical bugs",
    recency_bias=0.8,  # Strongly prefer recent
    limit=10
)

Pattern 3: High-Quality Results with Reranking

results = client.collections.search(
    readable_id=collection.readable_id,
    query="API documentation",
    enable_reranking=True,
    top_k=30,
    limit=10
)

Troubleshooting

No Results Found

  • Check that sync has completed (can take a few minutes for large sources)
  • Verify the collection ID is correct
  • Try a broader search query
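
Because a first sync can take a few minutes, a script that searches immediately after connecting a source may see no results. One rough way to wait is to poll until something comes back. This is a sketch under assumptions: `search_fn` is a hypothetical zero-argument wrapper around a search call that returns a list, and the document does not describe a sync-status API, so none is used here.

```python
import time


def wait_for_results(search_fn, timeout=300.0, interval=10.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll search_fn until it returns a non-empty list or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        results = search_fn()
        if results:
            return results
        if clock() + interval > deadline:
            return []  # give up: the next poll would pass the deadline
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the helper can be tested without real waiting.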

Authentication Errors

  • Verify your API key is valid
  • Check that the API key has access to the collection
  • For OAuth sources, the token may have expired; reconnect the source in the UI

Rate Limits

  • The API enforces rate limits to protect the service
  • Implement exponential backoff for retries
  • Contact support for higher limits
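
The exponential backoff suggested above can be sketched as a small generic wrapper. Which exception the SDK raises on a rate-limit response is not stated in this document, so `retry_on` defaults to `Exception` as a placeholder and should be narrowed to the actual error type; `retry_with_backoff` itself is a hypothetical helper.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Run call(); on failure wait 1s, 2s, 4s, ... (capped, with jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Jitter spreads out retries from clients that failed together.
            sleep(delay + random.uniform(0, delay / 2))
```

A search call could be wrapped as `retry_with_backoff(lambda: client.collections.search(readable_id=..., query=...))`.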

Additional Resources

For detailed SDK reference, see SDK-REFERENCE.md. For advanced search patterns, see SEARCH-PATTERNS.md.

Source

https://github.com/airweave-ai/claude-plugin/blob/main/skills/airweave-setup/SKILL.md

Overview

Airweave is an open-source platform that makes any app searchable for AI agents by turning connected data sources into a unified knowledge base. It supports cloud and self-hosted deployments and provides SDKs for Python and TypeScript to build searchable collections. This skill guides developers through installing Airweave, creating collections, connecting sources, and configuring the SDK integration in their apps.

How This Skill Works

Install the Airweave SDK (cloud via API key or self-hosted) and create a collection to group data sources. Connect sources (40+ options) to that collection, using API keys or OAuth flows managed in the UI. Use the SDK to search across all connected sources with configurable parameters (semantic or hybrid search, recency, reranking, etc.) to retrieve unified results.

When to Use It

  • You want to install Airweave in your application and start an integration quickly (cloud or self-hosted).
  • You need to create a single, centralized collection that groups multiple data sources into one searchable endpoint.
  • You want to connect 40+ sources (Notion, Google Drive, Slack, Jira, GitHub, Stripe, etc.) to enable cross-source search.
  • You aim to perform searches across all connected data with a single query using semantic or hybrid search modes.
  • You plan to configure MCP servers or customize deployment (cloud signup vs. self-hosted) and set up the SDK for your app.

Quick Start

  1. Decide Cloud (recommended) or Self-Hosted, then sign up at the Airweave dashboard or clone the repo if self-hosting.
  2. Install the SDK: Python users run 'pip install airweave-sdk'; TypeScript users run 'npm install @airweave/sdk'.
  3. Create a collection and add source connections, then start querying across connected sources.

Best Practices

  • Create a dedicated collection per app or knowledge base to keep data organized and manageable.
  • Prefer Airweave Cloud for fastest setup; consider self-hosting when data sovereignty is required.
  • Use OAuth for source connections where available to simplify authentication and renewal.
  • Regularly review and refresh source connections as data sources update or expire credentials.
  • Leverage search parameters (search_type, limit, recency_bias, enable_reranking) to tune relevance and performance.

Example Use Cases

  • A product team connects Notion, Google Drive, and GitHub into one collection to answer customer- and product-related questions with AI agents.
  • An ops team uses OAuth-connected Slack and Gmail sources to enable AI-assisted retrieval across chats and emails.
  • A developer sets up Airweave Cloud, creates a 'Knowledge Base' collection, and uses the Python SDK to run cross-source searches.
  • A data-heavy organization runs a self-hosted Airweave instance to consolidate Stripe and PostgreSQL data for internal AI search.
  • A project management unit links Jira, Trello, and Confluence to provide AI-powered project status and history queries.
