llm-app-patterns
LLM Application Patterns
Install: npx machina-cli add skill karim-bhalwani/agent-skills-collection/llm-app-patterns --openclaw
Expert in production LLM application patterns and architectures.
When to Use This Skill
Use when:
- Building production RAG (Retrieval-Augmented Generation) pipelines
- Implementing AI agents with tool use and multi-step reasoning
- Designing prompt engineering strategies and template systems
- Setting up LLMOps: monitoring, logging, tracing, and evaluation
- Deploying LLM applications with caching, rate limiting, and fallbacks
- Choosing between different agent architectures (ReAct, function calling, plan-execute, multi-agent)
- Optimizing retrieval: chunking strategies, vector databases, hybrid search
- Building production-ready systems: cost optimization, reliability, observability
Core Capabilities
This skill provides production-proven patterns for:
- RAG Pipelines - Document ingestion, chunking, embedding, retrieval, generation
- Agent Architectures - ReAct, function calling, plan-execute, multi-agent collaboration
- Prompt Engineering - Templates, versioning, A/B testing, chaining
- LLMOps & Monitoring - Metrics, logging, tracing, evaluation frameworks
- Production Patterns - Caching, rate limiting, retry logic, fallbacks
Pattern References
For detailed implementation guidance, see:
RAG Pipelines
Use when: Building search-augmented LLM applications
Covers:
- Document ingestion and preprocessing
- Chunking strategies (fixed, semantic, sliding window)
- Vector database selection and configuration
- Retrieval patterns (dense, sparse, hybrid, multi-vector)
- Generation with retrieved context
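The stages above can be sketched end to end. This is a minimal, dependency-free illustration: `embed` is a bag-of-words stand-in for a real embedding model, and the final prompt would be sent to an LLM rather than printed; all names here are illustrative, not part of any library.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    # Dense-style retrieval: rank chunks by similarity to the query vector.
    qv = embed(query)
    ranked = sorted(index, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# Ingestion: each document becomes a (chunk, vector) entry in the index.
docs = [
    "Redis supports in-memory caching with optional persistence.",
    "Postgres offers the pgvector extension for vector similarity search.",
    "Rate limiting protects upstream LLM APIs from overload.",
]
index = [(d, embed(d)) for d in docs]

# Retrieval + generation: retrieved chunks become the prompt context.
context = retrieve("Which database supports vector search?", index)
prompt = "Answer using only this context:\n" + "\n".join(context)
```

In a real pipeline the embedding call, vector store, and generation step are each swappable components; the data flow (ingest, chunk, embed, retrieve, generate) stays the same.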
Agent Architectures
Use when: Building agents that use tools or multi-step reasoning
Covers:
- ReAct pattern (Reasoning + Acting)
- Function calling for structured tool use
- Plan-and-execute for complex tasks
- Multi-agent collaboration patterns
- Architecture decision matrix
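The ReAct loop can be sketched as alternating model steps and tool observations. The "model" here is a scripted stand-in so the example is self-contained; in practice each step is an LLM call that receives the transcript so far as its prompt.

```python
def calculator(expression: str) -> str:
    # Tool: evaluate simple arithmetic. (A real agent would sandbox this.)
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def scripted_model(transcript: list[str]) -> str:
    # Stand-in for an LLM: act first, then answer from the observation.
    if not any(line.startswith("Observation:") for line in transcript):
        return "Thought: I need to compute this.\nAction: calculator[17 * 23]"
    observation = transcript[-1].split(": ", 1)[1]
    return f"Final Answer: {observation}"

def react(question: str, max_steps: int = 5) -> str:
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = scripted_model(transcript)
        transcript.append(step)
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        # Parse "Action: tool[input]", run the tool, record the observation.
        action = step.rsplit("Action: ", 1)[1]
        name, arg = action.split("[", 1)
        transcript.append(f"Observation: {TOOLS[name](arg.rstrip(']'))}")
    return "gave up"

print(react("What is 17 * 23?"))  # 391
```

Function calling replaces the text parsing with structured tool-call objects from the model API, but the loop shape (reason, act, observe, repeat) is the same.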
Prompt Engineering
Use when: Creating reusable prompt systems
Covers:
- Prompt templates with variables
- Versioning and A/B testing
- Prompt chaining for multi-step workflows
- Few-shot learning patterns
- Best practices for prompt structure
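A versioned template registry can be as small as a dict keyed by name and version; the prompt names and versions below are illustrative. Keeping versions side by side is what makes A/B testing and rollback cheap.

```python
from string import Template

# Tiny versioned prompt registry; entries are illustrative.
PROMPTS: dict[tuple[str, str], Template] = {
    ("summarize", "v1"): Template("Summarize the following text:\n$text"),
    ("summarize", "v2"): Template(
        "Summarize the following text in at most $max_words words:\n$text"
    ),
}

def render(name: str, version: str, **variables: str) -> str:
    # substitute() raises KeyError on a missing variable, which catches
    # template/call-site drift at render time rather than in production.
    return PROMPTS[(name, version)].substitute(**variables)

prompt = render("summarize", "v2",
                text="LLM apps need observability.", max_words="20")
```

To A/B test, route a fraction of traffic to each version and log the version identifier alongside quality metrics.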
LLMOps & Observability
Use when: Setting up monitoring and evaluation
Covers:
- Key metrics to track (performance, quality, cost, reliability)
- Logging and distributed tracing
- Evaluation frameworks and benchmarking
- Caching strategies for cost reduction
- Rate limiting and retry patterns
- Fallback strategies for reliability
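Two of these patterns compose naturally: exponential-backoff retry wrapped in an exact-match response cache. This is a sketch with a simulated provider call; `TransientError` and `call_llm` are illustrative names, and a production cache would usually be external (e.g. Redis) rather than in-process.

```python
import functools
import random
import time

class TransientError(Exception):
    """Raised for retryable failures (timeouts, 429s, 5xx)."""

def retry(max_attempts: int = 4, base_delay: float = 0.1):
    # Exponential backoff with jitter: wait base_delay * 2^attempt, scaled
    # randomly so concurrent clients do not retry in lockstep.
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except TransientError:
                    if attempt == max_attempts - 1:
                        raise
                    time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))
        return wrapper
    return decorator

calls = {"n": 0}

@functools.lru_cache(maxsize=1024)   # exact-match response cache
@retry()
def call_llm(prompt: str) -> str:
    # Stand-in for a provider API call; fails twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("rate limited")
    return f"response to: {prompt}"

print(call_llm("hello"))  # retried twice, then succeeds; repeats hit the cache
```

Note the decorator order: the cache sits outside the retry, so only successful responses are cached and cached hits skip the retry path entirely.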
Quick Decision Guide
| Goal | Reference |
|---|---|
| Answer questions from your docs | RAG Pipelines |
| Build tool-using agent | Agent Architectures |
| Create reusable prompts | Prompt Engineering |
| Monitor production system | LLMOps & Observability |
Dependencies
- architect - For overall system design and architecture decisions
- data-modeler - For data schema design in RAG pipelines
- ops-manager - For production deployment and operations
Source
https://github.com/karim-bhalwani/agent-skills-collection/blob/main/skills/llm-app-patterns/SKILL.md
Overview
A practical guide to production-grade LLM apps, covering RAG pipelines, agent architectures, prompt engineering, LLMOps, and deployment patterns. It explains when to use each pattern and how to implement them for reliability, scalability, and observability.
How This Skill Works
This skill distills production-grade patterns into clear components: RAG Pipelines, Agent Architectures, Prompt Engineering, LLMOps & Observability, and Production Patterns. It connects architecture choices to concrete techniques like chunking, vector stores, templates, caching, and monitoring to help you build robust LLM applications.
When to Use It
- Building production RAG pipelines
- Implementing AI agents with tool use and multi-step reasoning
- Designing prompt engineering strategies and templates
- Setting up LLMOps: monitoring, logging, tracing, and evaluation
- Deploying LLM apps with caching, rate limiting, and fallbacks
Quick Start
- Step 1: Map requirements to architecture (RAG vs. agent) and select references
- Step 2: Set up data ingestion, chunking, vector store, and retrieval
- Step 3: Implement LLMOps scaffolding (logging, tracing, metrics) and basic deployment guards
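For Step 2, chunking is the piece most people hand-roll first. A minimal sliding-window chunker over words looks like this; the size and overlap values are illustrative, and production systems often chunk by tokens or semantic boundaries instead.

```python
def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    # Sliding-window chunking by words: each chunk shares `overlap` words
    # with the previous one so context is not cut mid-thought.
    words = text.split()
    step = size - overlap
    return [
        " ".join(words[i:i + size])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]

chunks = chunk(" ".join(str(i) for i in range(50)))
```

Here a 50-word text yields two 40-word-max chunks whose boundary words overlap, which helps retrieval when the answer spans a chunk boundary.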
Best Practices
- Define a clear data flow for RAG: ingestion, chunking, embedding, retrieval, generation
- Choose and justify an agent architecture (ReAct, function calling, plan-execute, multi-agent)
- Implement robust prompt templates with versioning and A/B testing
- Establish LLMOps with metrics, logging, tracing, and evaluation
- Apply production patterns: caching, rate limiting, retry logic, and fallbacks
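The fallback practice amounts to trying providers in order of preference and only failing when the whole chain is exhausted. This sketch uses stub functions in place of real provider SDK calls; `primary` and `fallback` are hypothetical names.

```python
def primary(prompt: str) -> str:
    # Stand-in for the preferred provider; simulated as unavailable.
    raise TimeoutError("primary model unavailable")

def fallback(prompt: str) -> str:
    # Stand-in for a cheaper or more available backup model.
    return f"[fallback] {prompt}"

def generate(prompt: str, providers=(primary, fallback)) -> str:
    errors = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # in production, catch provider-specific errors
            errors.append(f"{provider.__name__}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

print(generate("hi"))  # [fallback] hi
```

Recording which provider served each request is worth doing from day one, since silent fallback traffic can mask a degraded primary.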
Example Use Cases
- RAG-powered document search with hybrid vector search
- Tool-using agent performing multi-step tasks
- Versioned prompt templates library with chaining
- Production deployment with caching and rate limits
- Observability-driven LLM deployment with monitoring dashboards
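For the hybrid-search use case, one common way to combine dense and sparse results is Reciprocal Rank Fusion. The rankings below are hard-coded for illustration; in practice they would come from a vector store and a keyword (BM25) index respectively.

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Each document scores 1 / (k + rank) in every ranking it appears in;
    # k dampens the influence of any single ranker's top positions.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense  = ["doc_a", "doc_b", "doc_c"]   # vector-similarity order
sparse = ["doc_b", "doc_d", "doc_a"]   # keyword-match order
fused = rrf([dense, sparse])           # doc_b first: ranked well by both
```

RRF needs no score normalization across the two retrievers, which is why it is a popular default for hybrid search.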