
llamaindex

npx machina-cli add skill G1Joshi/Agent-Skills/llamaindex --openclaw
Files (1)
SKILL.md
1.2 KB

LlamaIndex

LlamaIndex (formerly GPT Index) connects LLMs to your data. The 2025 release introduces Workflows, an event-driven way to build complex RAG pipelines.

When to Use

  • RAG (Retrieval Augmented Generation): Indexing PDFs, Docs, SQL to chat with them.
  • Structured Data: Querying SQL/Pandas with natural language (NLSQL).
  • Agents: Building research agents that browse the web and summarize.

Core Concepts

Workflows

Event-driven architecture for agents. Replace DAGs with event listeners (@step).

Query Engine

High-level API (index.as_query_engine()) to ask questions.

Data Loaders (LlamaHub)

Connectors for Notion, Slack, Discord, PDF, etc.

Best Practices (2025)

Do:

  • Use Workflows: They are harder to learn but easier to debug than monolithic engines.
  • Use Hybrid Search: BM25 (Keyword) + Vector Search for best retrieval accuracy.
  • Use Rerankers: Always rerank retrieved nodes (Cohere/BGE) before sending to LLM.

Don't:

  • Don't dump raw text: Use "Node Parsers" to chunk data intelligently (Markdown, Semantic).

References

Source

git clone https://github.com/G1Joshi/Agent-Skills
(the skill file lives at skills/ai-ml/llamaindex/SKILL.md)

Overview

LlamaIndex connects LLMs to your data to enable Retrieval Augmented Generation and AI agents. The 2025 release introduces Workflows, an event-driven approach to building complex RAG pipelines.

How This Skill Works

Use index.as_query_engine() to interrogate indexed data with natural language. Data Loaders (LlamaHub) connect sources such as Notion, Slack, Discord, and PDFs, while a Workflow coordinates the event-driven pipeline and optional reranking before delivering answers.

When to Use It

  • RAG: index PDFs, docs, and SQL to chat with your data
  • Structured data: natural language queries over SQL or Pandas data (NLSQL)
  • Agents: build research agents that browse the web and summarize findings
  • Data integration: connect Notion, Slack, Discord, and PDFs via LlamaHub
  • End-to-end RAG pipelines: orchestrate data loading, retrieval, and answer generation with Workflows

Quick Start

  1. Identify data sources and load them with LlamaHub connectors (Notion, Slack, PDF, etc.) to create an index
  2. Build a Workflows-based RAG pipeline with event-driven steps (@step) for loading, retrieval, and reranking
  3. Use index.as_query_engine() to ask questions and apply rerankers for higher-quality results

Best Practices

  • Use Workflows: modular, debuggable pipelines are easier to maintain than monolithic engines
  • Use Hybrid Search: combine BM25 keyword search with vector search for best retrieval accuracy
  • Use rerankers before sending results to the LLM to improve answer quality
  • Don't dump raw text: use Node Parsers to chunk data intelligently (Markdown, semantic)
  • Keep data sources connected and up-to-date via LlamaHub connectors for reliable loading

Example Use Cases

  • Index legal PDFs and contracts to enable quick QA and redlining in a corporate wiki
  • Index product docs and chat with them to support customer inquiries
  • Run NLQ over SQL/Pandas dashboards to extract metrics and insights
  • Deploy a research agent that browses the web and summarizes sources for literature reviews
  • Build an onboarding knowledge base in Notion and answer employee questions

