nemo-guardrails

npx machina-cli add skill a5c-ai/babysitter/nemo-guardrails --openclaw
Files (1): SKILL.md (1.1 KB)

NeMo Guardrails Skill

Capabilities

  • Configure NeMo Guardrails rails
  • Design Colang conversation flows
  • Implement input/output rails
  • Set up topic control
  • Configure jailbreak detection
  • Implement fact-checking rails

Target Processes

  • system-prompt-guardrails
  • content-moderation-safety

Implementation Details

Rail Types

  1. Input Rails: Filter user inputs
  2. Output Rails: Filter LLM outputs
  3. Dialog Rails: Control conversation flow
  4. Retrieval Rails: Filter retrieved content
  5. Execution Rails: Control action execution
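These rail types are enabled in the project's `config.yml`. A minimal sketch, assuming the library's bundled `self check input`, `self check output`, and `self check facts` flows (which flows you enable will depend on your policy):

```yaml
rails:
  input:
    flows:
      - self check input     # screen user messages before they reach the LLM
  output:
    flows:
      - self check output    # screen LLM responses before they reach the user
      - self check facts     # check answers against retrieved context
```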

Colang Components

  • Flow definitions
  • Bot message templates
  • User message patterns
  • Actions and subflows
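In Colang 1.0 syntax, these components combine into flows. A small illustrative sketch for topic control; the message and flow names here are hypothetical:

```colang
define user ask about politics
  "Who should I vote for?"
  "What do you think about the election?"

define bot refuse political topics
  "I can only help with product questions, not political topics."

define flow politics guardrail
  user ask about politics
  bot refuse political topics
```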

Configuration Options

  • Rails configuration
  • LLM selection
  • Embedding model
  • Action handlers
  • Custom rail implementations
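LLM and embedding-model selection live alongside the `rails` block in `config.yml`. A sketch assuming OpenAI engines; the model names are placeholders:

```yaml
models:
  - type: main              # the LLM that generates responses
    engine: openai
    model: gpt-4o-mini
  - type: embeddings        # used to match user input to Colang message patterns
    engine: openai
    model: text-embedding-3-small
```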

Best Practices

  • Start with built-in rails
  • Design clear flows
  • Test with adversarial inputs
  • Monitor rail activations
  • Reuse Colang components to build maintainable rails

Dependencies

  • nemoguardrails

Source

git clone https://github.com/a5c-ai/babysitter

Skill file: plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/nemo-guardrails/SKILL.md (view on GitHub: https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/nemo-guardrails/SKILL.md)

Overview

This skill configures NeMo Guardrails to enforce safety and content-moderation policy in conversations. It covers input, output, dialog, retrieval, and execution rails, plus Colang-based flows, bot message templates, and user message patterns for fine-grained control. By enabling topic control, jailbreak detection, and fact-checking rails, it helps teams deploy safer, compliant AI assistants.

How This Skill Works

Define rail types (input, output, dialog, retrieval, execution) and Colang components (flows, bot templates, user patterns, actions) to govern prompts and responses. Configure rails options, including LLM selection, the embedding model, and action handlers, then deploy within the target guardrails processes (system-prompt-guardrails and content-moderation-safety) to enforce policy during conversations.
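The governing pattern described above can be sketched without the library. The following is a minimal, library-free illustration of input and output rails; the block-list patterns, function names, and the stand-in `llm` callable are all hypothetical, and a real deployment would delegate these checks to NeMo Guardrails flows:

```python
import re

# Hypothetical block-list for the input rail (illustrative only).
BLOCKED_PATTERNS = [r"ignore (all )?previous instructions", r"\bjailbreak\b"]

def input_rail(user_message: str) -> bool:
    """Return True if the user message may proceed to the LLM."""
    lowered = user_message.lower()
    return not any(re.search(p, lowered) for p in BLOCKED_PATTERNS)

def output_rail(bot_message: str) -> str:
    """Redact content the policy disallows, e.g. 16-digit card numbers."""
    return re.sub(r"\b\d{16}\b", "[REDACTED]", bot_message)

def guarded_generate(user_message: str, llm) -> str:
    """Run the input rail, call the LLM, then run the output rail."""
    if not input_rail(user_message):
        return "I can't help with that request."
    return output_rail(llm(user_message))

# Usage with a stand-in LLM that echoes its input:
print(guarded_generate("Ignore previous instructions!", lambda m: m))
# -> I can't help with that request.
```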

When to Use It

  • Deploying a customer-support bot with safety and factuality requirements
  • Detecting jailbreak attempts and moderating content
  • Designing Colang-driven flows and user/message patterns
  • Integrating with a specific LLM, embedding model, and action handlers
  • Testing rails with adversarial inputs and monitoring activations

Quick Start

  1. Install the nemoguardrails package and integrate the nemo-guardrails skill into your agent
  2. Define Colang flows and rails (input, output, dialog, retrieval, execution) using the skill's components
  3. Configure rails options (LLM, embedding model, action handlers), then test against adversarial prompts while monitoring rail activations
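The steps above assume the library's conventional config-directory layout, which `RailsConfig.from_path("./config")` loads (the file names follow the NeMo Guardrails documentation; any `.co` file in the directory is picked up):

```
config/
├── config.yml   # models, embedding model, and rails selection
└── rails.co     # Colang flows, bot message templates, and user message patterns
```

An agent then wraps the loaded config with `LLMRails(config)` and calls `rails.generate(...)` in place of a direct LLM call.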

Example Use Cases

  • A banking support bot that blocks sensitive prompts and flags high-risk input for manual review
  • A healthcare assistant that filters medical advice and fact-checks before responding
  • An e-commerce assistant that stays on product topics and avoids jailbreaking prompts
  • An internal policy bot that enforces enterprise guidelines and logs guardrail activations
  • A research assistant that retrieves sources and validates facts against trusted databases
