Get the FREE Ultimate OpenClaw Setup Guide →

implementing-error-handling

npx machina-cli add skill WesleySmits/agent-skills/error-handling-patterns --openclaw
Files (1)
SKILL.md
8.7 KB

Error Handling Patterns

Build resilient applications with robust error handling strategies that gracefully handle failures and provide excellent debugging experiences.

When to Use This Skill

  • Implementing error handling in new features
  • Designing error-resilient APIs
  • Debugging production issues
  • Improving application reliability
  • Creating better error messages for users and developers
  • Implementing retry and circuit breaker patterns
  • Handling async/concurrent errors
  • Building fault-tolerant distributed systems

Core Concepts

1. Error Handling Philosophies

Exceptions vs Result Types:

  • Exceptions: Traditional try-catch, disrupts control flow
  • Result Types: Explicit success/failure, functional approach
  • Error Codes: C-style, requires discipline
  • Option/Maybe Types: For nullable values

When to Use Each:

  • Exceptions: Unexpected errors, exceptional conditions
  • Result Types: Expected errors, validation failures
  • Panics/Crashes: Unrecoverable errors, programming bugs

2. Error Categories

Recoverable Errors:

  • Network timeouts
  • Missing files
  • Invalid user input
  • API rate limits

Unrecoverable Errors:

  • Out of memory
  • Stack overflow
  • Programming bugs (null pointer, etc.)

Language-Specific Patterns

For detailed code examples in Python, TypeScript, Rust, and Go, see: 👉 examples/language-patterns.md

Universal Patterns

Pattern 1: Circuit Breaker

Prevent cascading failures in distributed systems.

from enum import Enum
from datetime import datetime, timedelta
from typing import Callable, TypeVar

T = TypeVar('T')

class CircuitState(Enum):
    CLOSED = "closed"       # Normal operation
    OPEN = "open"          # Failing, reject requests
    HALF_OPEN = "half_open"  # Testing if recovered

class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,
        timeout: timedelta = timedelta(seconds=60),
        success_threshold: int = 2
    ):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.success_threshold = success_threshold
        self.failure_count = 0
        self.success_count = 0
        self.state = CircuitState.CLOSED
        self.last_failure_time = None

    def call(self, func: Callable[[], T]) -> T:
        if self.state == CircuitState.OPEN:
            if datetime.now() - self.last_failure_time > self.timeout:
                self.state = CircuitState.HALF_OPEN
                self.success_count = 0
            else:
                raise Exception("Circuit breaker is OPEN")

        try:
            result = func()
            self.on_success()
            return result
        except Exception as e:
            self.on_failure()
            raise

    def on_success(self):
        self.failure_count = 0
        if self.state == CircuitState.HALF_OPEN:
            self.success_count += 1
            if self.success_count >= self.success_threshold:
                self.state = CircuitState.CLOSED
                self.success_count = 0

    def on_failure(self):
        self.failure_count += 1
        self.last_failure_time = datetime.now()
        if self.failure_count >= self.failure_threshold:
            self.state = CircuitState.OPEN

# Usage
circuit_breaker = CircuitBreaker()

def fetch_data():
    return circuit_breaker.call(lambda: external_api.get_data())

Pattern 2: Error Aggregation

Collect multiple errors instead of failing on first error.

class ErrorCollector {
  private errors: Error[] = [];

  add(error: Error): void {
    this.errors.push(error);
  }

  hasErrors(): boolean {
    return this.errors.length > 0;
  }

  getErrors(): Error[] {
    return [...this.errors];
  }

  throw(): never {
    if (this.errors.length === 1) {
      throw this.errors[0];
    }
    throw new AggregateError(
      this.errors,
      `${this.errors.length} errors occurred`,
    );
  }
}

// Usage: Validate multiple fields
function validateUser(data: any): User {
  const errors = new ErrorCollector();

  if (!data.email) {
    errors.add(new ValidationError("Email is required"));
  } else if (!isValidEmail(data.email)) {
    errors.add(new ValidationError("Email is invalid"));
  }

  if (!data.name || data.name.length < 2) {
    errors.add(new ValidationError("Name must be at least 2 characters"));
  }

  if (!data.age || data.age < 18) {
    errors.add(new ValidationError("Age must be 18 or older"));
  }

  if (errors.hasErrors()) {
    errors.throw();
  }

  return data as User;
}

Pattern 3: Graceful Degradation

Provide fallback functionality when errors occur.

from typing import Optional, Callable, TypeVar

T = TypeVar('T')

def with_fallback(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    log_error: bool = True
) -> T:
    """Try primary function, fall back to fallback on error."""
    try:
        return primary()
    except Exception as e:
        if log_error:
            logger.error(f"Primary function failed: {e}")
        return fallback()

# Usage
def get_user_profile(user_id: str) -> UserProfile:
    return with_fallback(
        primary=lambda: fetch_from_cache(user_id),
        fallback=lambda: fetch_from_database(user_id)
    )

# Multiple fallbacks
def get_exchange_rate(currency: str) -> float:
    return (
        try_function(lambda: api_provider_1.get_rate(currency))
        or try_function(lambda: api_provider_2.get_rate(currency))
        or try_function(lambda: cache.get_rate(currency))
        or DEFAULT_RATE
    )

def try_function(func: Callable[[], Optional[T]]) -> Optional[T]:
    try:
        return func()
    except Exception:
        return None

Best Practices

  1. Fail Fast: Validate input early, fail quickly
  2. Preserve Context: Include stack traces, metadata, timestamps
  3. Meaningful Messages: Explain what happened and how to fix it
  4. Log Appropriately: Error = log, expected failure = don't spam logs
  5. Handle at Right Level: Catch where you can meaningfully handle
  6. Clean Up Resources: Use try-finally, context managers, defer
  7. Don't Swallow Errors: Log or re-throw, don't silently ignore
  8. Type-Safe Errors: Use typed errors when possible
# Good error handling example
def process_order(order_id: str) -> Order:
    """Process order with comprehensive error handling."""
    try:
        # Validate input
        if not order_id:
            raise ValidationError("Order ID is required")

        # Fetch order
        order = db.get_order(order_id)
        if not order:
            raise NotFoundError("Order", order_id)

        # Process payment
        try:
            payment_result = payment_service.charge(order.total)
        except PaymentServiceError as e:
            # Log and wrap external service error
            logger.error(f"Payment failed for order {order_id}: {e}")
            raise ExternalServiceError(
                f"Payment processing failed",
                service="payment_service",
                details={"order_id": order_id, "amount": order.total}
            ) from e

        # Update order
        order.status = "completed"
        order.payment_id = payment_result.id
        db.save(order)

        return order

    except ApplicationError:
        # Re-raise known application errors
        raise
    except Exception as e:
        # Log unexpected errors
        logger.exception(f"Unexpected error processing order {order_id}")
        raise ApplicationError(
            "Order processing failed",
            code="INTERNAL_ERROR"
        ) from e

Common Pitfalls

  • Catching Too Broadly: except Exception hides bugs
  • Empty Catch Blocks: Silently swallowing errors
  • Logging and Re-throwing: Creates duplicate log entries
  • Not Cleaning Up: Forgetting to close files, connections
  • Poor Error Messages: "Error occurred" is not helpful
  • Returning Error Codes: Use exceptions or Result types
  • Ignoring Async Errors: Unhandled promise rejections

Resources

  • references/exception-hierarchy-design.md: Designing error class hierarchies
  • references/error-recovery-strategies.md: Recovery patterns for different scenarios
  • references/async-error-handling.md: Handling errors in concurrent code
  • assets/error-handling-checklist.md: Review checklist for error handling
  • assets/error-message-guide.md: Writing helpful error messages
  • scripts/error-analyzer.py: Analyze error patterns in logs

Source

git clone https://github.com/WesleySmits/agent-skills/blob/main/.agent/skills/error-handling-patterns/SKILL.mdView on GitHub

Overview

Learn error handling patterns across languages—exceptions, Result types, error propagation, and graceful degradation. It also covers universal patterns like circuit breakers and error aggregation to boost reliability and debuggability.

How This Skill Works

This skill starts with core concepts (exceptions vs. result types, error codes, and option/maybe types), then guides you to choose the right pattern for each scenario. It introduces universal patterns such as circuit breakers and error aggregation, complemented by language-specific examples in Python, TypeScript, Rust, and Go.

When to Use It

  • Implementing error handling in new features
  • Designing error-resilient APIs
  • Debugging production issues
  • Implementing retry and circuit breaker patterns
  • Building fault-tolerant distributed systems

Quick Start

  1. Step 1: Identify error scenarios and categorize them as recoverable vs unrecoverable
  2. Step 2: Choose a pattern (e.g., circuit breaker or error aggregation) and define a clear error model
  3. Step 3: Implement, test error paths, and add monitoring/alerts to catch regressions

Best Practices

  • Choose error-handling strategy aligned with error type (exceptions for unexpected errors; Result/Option for expected failures)
  • Define a centralized, explicit error model with meaningful messages and error codes
  • Propagate context-rich errors without leaking internal implementation details
  • Incorporate resilience patterns like retries with backoff and circuit breakers
  • Test error paths with focused unit/integration tests and realistic failure scenarios

Example Use Cases

  • Circuit Breaker pattern applied to external API calls in Python to prevent cascading failures
  • Error Aggregation pattern collecting multiple validation errors in TypeScript
  • Graceful degradation when a microservice becomes unavailable
  • Explicit error propagation using Result types in Rust for safer error handling
  • API error codes and structured messages used consistently across services in Go

Frequently Asked Questions

Add this skill to your agents
Sponsor this space

Reach thousands of developers