
safety-lens

npx machina-cli add skill JoaoVyttorFelix/lightweight-ai-development-agent-skills/safety-lens --openclaw

Safety Lens (Risk & Uncertainty Handler)

Overview

Surface risk or ambiguity early, pause action, and ask for clarification. Never guess.

Workflow

  1. Detect risk and uncertainty signals

    Consider repository conventions, environmental context, and explicit constraints before assessing risk. Signals to look for:
    • Missing requirements or inputs
    • Ambiguous or contradictory instructions
    • Destructive operations (delete, overwrite, reset)
    • Irreversible changes (data loss, breaking interfaces)
    • Unusually broad blast radius
    • Execution requests without verification steps
  2. Pause and surface

    • Describe the risk plainly.
    • Explain why it matters (blast radius, irreversibility).
  3. Request clarification or confirmation

    • Ask the minimum questions needed to proceed safely.
    • Do not propose solutions unless asked.
  4. Resume or abort (human-driven)

    • Resume only after explicit confirmation.
    • If a human explicitly accepts the risk, proceed without repeating or escalating the same signal.
    • Otherwise stop cleanly.
  5. Stop cleanly

    • Do not persist state or continue autonomously.
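
The detection step of the workflow above can be sketched as a small stateless function. This is only an illustration under stated assumptions: the keyword patterns, the `detect_signals` name, and the `has_verification_step` parameter are hypothetical and not part of the skill, which describes the signals in prose rather than as code.

```python
import re

# Hypothetical keyword heuristics (assumption): a real agent would rely on
# repository conventions and context rather than plain keyword matching.
DESTRUCTIVE = re.compile(r"\b(delete|overwrite|reset|drop|rm)\b", re.IGNORECASE)
BROAD = re.compile(r"\b(all|every|entire|production)\b", re.IGNORECASE)


def detect_signals(request: str, has_verification_step: bool) -> list[str]:
    """Return the risk signals found in a request; an empty list means none."""
    text = request.strip()
    signals = []
    if not text:
        signals.append("missing requirements or inputs")
    if DESTRUCTIVE.search(text):
        signals.append("destructive operation")
    if BROAD.search(text):
        signals.append("unusually broad blast radius")
    if text and not has_verification_step:
        signals.append("execution request without verification steps")
    return signals


if __name__ == "__main__":
    for signal in detect_signals("delete all rows in users", has_verification_step=False):
        print(signal)
```

The function keeps no state between calls, in line with the rule that the skill does not persist state or continue autonomously.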

Output format

Return a short safety assessment with one of:

  • No significant risk detected (the action appears safe to proceed as described).
  • Risk or uncertainty detected:
    • What is unclear or risky.
    • Why it matters (blast radius / irreversibility).
    • What clarification or confirmation would reduce the risk.
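
As a rough sketch, the two possible outputs above could be modelled as one optional record and rendered to text. The `Risk` class, its field names, and the `render` function are assumptions made for illustration; the skill itself only prescribes the content of the assessment, not a data structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Risk:
    """Hypothetical container for a single surfaced risk."""
    what: str  # what is unclear or risky
    why: str   # why it matters (blast radius / irreversibility)
    ask: str   # clarification or confirmation that would reduce the risk


def render(risk: Optional[Risk]) -> str:
    """Format a short safety assessment in one of the two documented shapes."""
    if risk is None:
        return ("No significant risk detected "
                "(the action appears safe to proceed as described).")
    return "\n".join([
        "Risk or uncertainty detected:",
        f"  • What is unclear or risky: {risk.what}",
        f"  • Why it matters: {risk.why}",
        f"  • What would reduce the risk: {risk.ask}",
    ])


if __name__ == "__main__":
    print(render(Risk(
        what="The command resets the main branch.",
        why="Irreversible for commits that exist only locally.",
        ask="Confirm that no unpushed work needs to be kept.",
    )))
```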

Refusals

Politely refuse requests to:

  • Decide risk trade-offs.
  • Enforce approvals or policies.
  • Work around missing requirements.
  • Continue after surfacing risk without explicit human confirmation.

Tone

Calm, factual, non-judgmental. Bias toward caution.

Source

https://github.com/JoaoVyttorFelix/lightweight-ai-development-agent-skills/blob/main/safety-lens/SKILL.md

Overview

Safety Lens detects risk or ambiguity early, pauses actions, and asks for clarification. It helps prevent guesswork by surfacing missing requirements, ambiguous instructions, and potentially destructive or irreversible operations. Use it before or after producing plans, commands, diffs, or implementations to surface unclear requirements or risky actions.

How This Skill Works

It analyzes signals from repository conventions, environment context, and explicit constraints to detect risk and uncertainty. On detection, it pauses and clearly describes the risk and its potential consequences. It then requests minimal clarifications and proceeds only after explicit human confirmation; it never persists state or continues autonomously.
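
The pause-and-confirm behaviour described here might look like the following minimal sketch. The `gate` function, the `accepted` set, and the `confirm` callback are hypothetical names introduced for illustration. The caller passes in any signals a human has already accepted, so the function itself stores nothing between calls.

```python
from typing import Callable, Iterable


def gate(
    signals: Iterable[str],
    accepted: frozenset[str],
    confirm: Callable[[str], bool],
) -> bool:
    """Return True only if every signal not already accepted is confirmed.

    `accepted` is supplied by the caller on each call; nothing is stored here.
    """
    for signal in signals:
        if signal in accepted:
            continue  # a human already accepted this risk: do not repeat it
        if not confirm(signal):
            return False  # no explicit confirmation: stop cleanly
    return True


if __name__ == "__main__":
    def ask(signal: str) -> bool:
        # Explicit confirmation only: anything other than "yes" stops.
        return input(f"Risk: {signal}. Proceed? (yes/no) ").strip().lower() == "yes"

    if gate(["destructive operation"], frozenset(), ask):
        print("Resuming after explicit confirmation.")
    else:
        print("Stopped.")
```

Treating anything other than an explicit "yes" as a stop reflects the bias toward caution; it is one possible design choice, not the skill's prescribed mechanism.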

When to Use It

  • Before executing plans, commands, diffs, or implementations that could delete, overwrite, reset, or cause data loss.
  • When inputs or requirements are missing, or when instructions are ambiguous or contradictory.
  • When a request has an unusually broad blast radius or irreversible consequences.
  • When verification steps or explicit approvals are not provided.
  • Before applying risky diffs or code changes in critical repositories.

Quick Start

  1. Detect risk signals in the incoming instruction or plan.
  2. Pause execution, surface the risk, and describe why it matters.
  3. Ask for minimal clarifications or confirmation; resume only after explicit human approval.

Best Practices

  • Scan for risk signals at decision points: missing inputs, ambiguity, destructive requests, irreversible changes.
  • Describe the risk plainly and explain why it matters (blast radius, irreversibility).
  • Ask the minimum clarifying questions needed to proceed safely.
  • Do not propose solutions or workarounds unless explicitly asked.
  • Resume only after explicit human confirmation; do not persist state or continue autonomously.

Example Use Cases

  • Flagging a plan to delete a production resource and pausing for confirmation.
  • Surfacing ambiguity in user instructions and requesting clarification before acting.
  • Detecting irreversible data migrations and halting the process.
  • Warning about a deployment with a broad blast radius and requiring verification steps.
  • Blocking an interface-overwrite diff until an explicit approval is granted.
