Install with:

npx machina-cli add skill karim-bhalwani/agent-skills-collection/llm-app-patterns --openclaw

LLM Application Patterns

Expert in production LLM application patterns and architectures.

When to Use This Skill

Use when:

  • Building production RAG (Retrieval-Augmented Generation) pipelines
  • Implementing AI agents with tool use and multi-step reasoning
  • Designing prompt engineering strategies and template systems
  • Setting up LLMOps: monitoring, logging, tracing, and evaluation
  • Deploying LLM applications with caching, rate limiting, and fallbacks
  • Choosing between different agent architectures (ReAct, function calling, plan-execute, multi-agent)
  • Optimizing retrieval: chunking strategies, vector databases, hybrid search
  • Building production-ready systems: cost optimization, reliability, observability

Core Capabilities

This skill provides production-proven patterns for:

  1. RAG Pipelines - Document ingestion, chunking, embedding, retrieval, generation
  2. Agent Architectures - ReAct, function calling, plan-execute, multi-agent collaboration
  3. Prompt Engineering - Templates, versioning, A/B testing, chaining
  4. LLMOps & Monitoring - Metrics, logging, tracing, evaluation frameworks
  5. Production Patterns - Caching, rate limiting, retry logic, fallbacks
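
As a taste of the production patterns above, here is a minimal sketch of prompt-level response caching: hash the prompt plus its parameters so identical requests skip the model call entirely. `model_call` is a placeholder for your actual client function, not a real library API.

```python
import hashlib

def cached_completion(model_call, cache=None):
    """Wrap a model call so identical (prompt, params) pairs hit the cache."""
    cache = {} if cache is None else cache

    def wrapper(prompt, **params):
        # Key on the prompt plus sorted params so dict ordering doesn't matter.
        key = hashlib.sha256(
            repr((prompt, sorted(params.items()))).encode()
        ).hexdigest()
        if key not in cache:
            cache[key] = model_call(prompt, **params)
        return cache[key]

    return wrapper
```

In production you would typically swap the in-memory dict for Redis or a similar shared store, and add a TTL so stale answers expire.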

Pattern References

For detailed implementation guidance, see:

RAG Pipelines

Use when: Building search-augmented LLM applications

Covers:

  • Document ingestion and preprocessing
  • Chunking strategies (fixed, semantic, sliding window)
  • Vector database selection and configuration
  • Retrieval patterns (dense, sparse, hybrid, multi-vector)
  • Generation with retrieved context
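
To make the chunking strategies concrete, here is a minimal sketch of a sliding-window chunker: fixed-size chunks with overlap, so sentences that span a boundary appear in both neighbouring chunks. Sizes are illustrative; tune them to your embedding model's context window.

```python
def chunk_text(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks whose edges overlap by `overlap`
    characters, so content near a boundary is retrievable from either side."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
        if start + size >= len(text):
            break
    return chunks
```

Semantic chunking replaces the fixed `step` with boundaries detected at sentence or section breaks, usually at the cost of more preprocessing.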

Agent Architectures

Use when: Building agents that use tools or multi-step reasoning

Covers:

  • ReAct pattern (Reasoning + Acting)
  • Function calling for structured tool use
  • Plan-and-execute for complex tasks
  • Multi-agent collaboration patterns
  • Architecture decision matrix
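
The ReAct pattern can be sketched as a loop that alternates model turns with tool observations. Everything below is illustrative: `call_llm` stands in for your model client, and the `Action:`/`Final:` line format is one simple convention, not a fixed protocol.

```python
def react_loop(question, call_llm, tools, max_steps=5):
    """Run a ReAct-style loop: the model either requests a tool call
    ("Action: name[arg]") or emits a final answer ("Final: ...")."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = call_llm(transcript)
        transcript += reply + "\n"
        if reply.startswith("Final:"):
            return reply.removeprefix("Final:").strip()
        if reply.startswith("Action:"):
            name, _, arg = reply.removeprefix("Action:").strip().partition("[")
            # Feed the tool result back so the next turn can reason over it.
            observation = tools[name](arg.rstrip("]"))
            transcript += f"Observation: {observation}\n"
    return None  # step budget exhausted
```

Function calling replaces the fragile string parsing here with structured JSON arguments validated by the provider, which is why it is usually preferred when the model supports it.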

Prompt Engineering

Use when: Creating reusable prompt systems

Covers:

  • Prompt templates with variables
  • Versioning and A/B testing
  • Prompt chaining for multi-step workflows
  • Few-shot learning patterns
  • Best practices for prompt structure
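
A minimal sketch of a versioned prompt template, using only the standard library; the class and field names are illustrative. Failing loudly on a missing variable is the key design choice: it surfaces template/input mismatches at render time instead of silently sending a prompt with gaps.

```python
from dataclasses import dataclass
from string import Template

@dataclass(frozen=True)
class PromptTemplate:
    name: str
    version: str
    text: str

    def render(self, **variables) -> str:
        # substitute() raises KeyError on a missing variable, rather than
        # leaving an unfilled "$placeholder" in the prompt.
        return Template(self.text).substitute(**variables)

SUMMARIZE_V2 = PromptTemplate(
    name="summarize",
    version="2.0",
    text="Summarize the following text in $style style:\n\n$text",
)
```

Storing `(name, version)` pairs alongside each logged completion is what makes A/B testing and rollbacks of prompts tractable later.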

LLMOps & Observability

Use when: Setting up monitoring and evaluation

Covers:

  • Key metrics to track (performance, quality, cost, reliability)
  • Logging and distributed tracing
  • Evaluation frameworks and benchmarking
  • Caching strategies for cost reduction
  • Rate limiting and retry patterns
  • Fallback strategies for reliability
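
The retry and fallback patterns above can be combined in one small wrapper: exponential backoff on the primary model, then a cheaper or secondary model as a last resort. `primary` and `fallback` are placeholder callables for your model clients.

```python
import time

def call_with_fallback(prompt, primary, fallback, retries=3, base_delay=0.5):
    """Try the primary model with exponential backoff; on repeated failure,
    route the request to the fallback model instead of surfacing an error."""
    for attempt in range(retries):
        try:
            return primary(prompt)
        except Exception:
            if attempt < retries - 1:
                time.sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    return fallback(prompt)
```

In practice you would catch the provider's specific error types (rate limits, timeouts) rather than bare `Exception`, and emit a metric on every fallback so degraded traffic is visible on a dashboard.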

Quick Decision Guide

Goal                              Reference
Answer questions from your docs   RAG Pipelines
Build a tool-using agent          Agent Architectures
Create reusable prompts           Prompt Engineering
Monitor a production system       LLMOps & Observability

Dependencies

  • architect - For overall system design and architecture decisions
  • data-modeler - For data schema design in RAG pipelines
  • ops-manager - For production deployment and operations

Source

git clone https://github.com/karim-bhalwani/agent-skills-collection
(skill file: skills/llm-app-patterns/SKILL.md)

Overview

A practical guide to production-grade LLM apps, covering RAG pipelines, agent architectures, prompt engineering, LLMOps, and deployment patterns. It explains when to use each pattern and how to implement them for reliability, scalability, and observability.

How This Skill Works

This skill distills production-grade patterns into clear components: RAG Pipelines, Agent Architectures, Prompt Engineering, LLMOps & Observability, and Production Patterns. It connects architecture choices to concrete techniques like chunking, vector stores, templates, caching, and monitoring to help you build robust LLM applications.

When to Use It

  • Building production RAG pipelines
  • Implementing AI agents with tool use and multi-step reasoning
  • Designing prompt engineering strategies and templates
  • Setting up LLMOps: monitoring, logging, tracing, and evaluation
  • Deploying LLM apps with caching, rate limiting, and fallbacks

Quick Start

  1. Map requirements to an architecture (RAG vs. agent) and pick the relevant references
  2. Set up document ingestion, chunking, a vector store, and retrieval
  3. Add LLMOps scaffolding (logging, tracing, metrics) and basic deployment guards
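
The steps above can be sketched end to end in a toy RAG pipeline. Keyword-overlap scoring stands in for embedding similarity, and `generate` is a placeholder for the LLM call; in a real system you would swap in an embedding model and a vector store.

```python
def retrieve(query, docs, k=2):
    """Rank docs by word overlap with the query (a stand-in for
    embedding-based similarity search)."""
    q = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def answer(query, docs, generate):
    """Retrieve context, build a grounded prompt, and call the model."""
    context = "\n".join(retrieve(query, docs))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)
```

The structure is the part that carries over to production: retrieval and generation stay decoupled, so you can upgrade chunking, the store, or the model independently.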

Best Practices

  • Define a clear data flow for RAG: ingestion, chunking, embedding, retrieval, generation
  • Choose and justify an agent architecture (ReAct, function calling, plan-execute, multi-agent)
  • Implement robust prompt templates with versioning and A/B testing
  • Establish LLMOps with metrics, logging, tracing, and evaluation
  • Apply production patterns: caching, rate limiting, retry logic, and fallbacks

Example Use Cases

  • RAG-powered document search with hybrid vector search
  • Tool-using agent performing multi-step tasks
  • Versioned prompt templates library with chaining
  • Production deployment with caching and rate limits
  • Observability-driven LLM deployment with monitoring dashboards
