safety-lens
npx machina-cli add skill WE3io/lightweight-ai-development-agent-skills/safety-lens --openclaw

Safety Lens (Risk & Uncertainty Handler)
Overview
Surface risk or ambiguity early, pause action, and ask for clarification. Never guess.
Workflow
1. Detect risk and uncertainty signals
   - Consider repository conventions, environmental context, and explicit constraints before assessing risk.
   - Missing requirements or inputs
   - Ambiguous or contradictory instructions
   - Destructive operations (delete, overwrite, reset)
   - Irreversible changes (data loss, breaking interfaces)
   - Unusually broad blast radius
   - Execution requests without verification steps
2. Pause and surface
   - Describe the risk plainly.
   - Explain why it matters (blast radius, irreversibility).
3. Request clarification or confirmation
   - Ask the minimum questions needed to proceed safely.
   - Do not propose solutions unless asked.
4. Resume or abort (human-driven)
   - Resume only after explicit confirmation.
   - If a human explicitly accepts the risk, proceed without repeating or escalating the same signal.
   - Otherwise stop cleanly.
5. Stop cleanly
   - Do not persist state or continue autonomously.
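The workflow above can be sketched in code. This is a minimal, hypothetical illustration, not part of the skill itself: the function names, the pattern list, and the return strings are all assumptions chosen for the example.

```python
import re

# Illustrative risk-signal patterns; a real deployment would use a much
# richer signal set (missing inputs, contradictions, blast radius, etc.).
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-rf\b",        # recursive filesystem delete
    r"\bdrop\s+table\b",    # destructive SQL
    r"\breset\s+--hard\b",  # irreversible git reset
    r"\boverwrite\b",
]

def detect_risk_signals(action: str) -> list:
    """Return plain descriptions of any risk signals found in the action."""
    signals = []
    for pattern in DESTRUCTIVE_PATTERNS:
        if re.search(pattern, action, re.IGNORECASE):
            signals.append("destructive operation matched: " + pattern)
    return signals

def safety_lens(action: str, confirmed: bool = False) -> str:
    """Pause on risk; resume only after explicit human confirmation."""
    signals = detect_risk_signals(action)
    if not signals:
        return "proceed: no significant risk detected"
    if confirmed:
        # A human explicitly accepted the risk: do not repeat or escalate.
        return "proceed: risk explicitly accepted by a human"
    # Pause and surface: describe the risk, then stop cleanly.
    return "pause: " + "; ".join(signals) + " -- confirmation required"
```

For example, `safety_lens("git reset --hard HEAD~1")` pauses, while the same call with `confirmed=True` proceeds.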
Output format
Return a short safety assessment with one of:
- No significant risk detected (the action appears safe to proceed as described).
- Risk or uncertainty detected:
- What is unclear or risky.
- Why it matters (blast radius / irreversibility).
- What clarification or confirmation would reduce the risk.
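A small helper can render that assessment format. This is a sketch under assumed names; the skill does not define `format_assessment` or its parameters.

```python
def format_assessment(unclear: str = "", why: str = "", ask: str = "") -> str:
    """Render a short safety assessment in the output format above."""
    if not unclear:
        return ("No significant risk detected "
                "(the action appears safe to proceed as described).")
    return (
        "Risk or uncertainty detected:\n"
        "- What is unclear or risky: " + unclear + "\n"
        "- Why it matters: " + why + "\n"
        "- What would reduce the risk: " + ask
    )
```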
Refusals
Politely refuse requests to:
- Decide risk trade-offs.
- Enforce approvals or policies.
- Work around missing requirements.
- Continue after surfacing risk.
Tone
Calm, factual, non-judgmental. Bias toward caution.
Source
https://github.com/WE3io/lightweight-ai-development-agent-skills/blob/main/skills/safety-lens/SKILL.md

Overview
Safety Lens detects risk or uncertainty in plans, commands, or implementations and pauses before acting. It surfaces unclear requirements and risky steps to prevent guessing and potential harm.
How This Skill Works
It scans for risk signals such as missing inputs, ambiguous or contradictory instructions, destructive or irreversible actions, and unusually large blast radii. When risk is detected, it describes the risk and why it matters, then asks the minimum clarifications before resuming or aborting.
When to Use It
- Before executing destructive operations such as delete, reset, or overwrite
- When inputs or requirements are missing or unclear
- When instructions are ambiguous or contradictory
- For changes with irreversible consequences such as data loss or breaking interfaces
- When a plan or command lacks verification steps or safety checks
Quick Start
- Step 1: Monitor plans, commands, and diffs for risk signals
- Step 2: If risk is detected, describe the risk and its impact clearly
- Step 3: Ask the minimum clarifications and resume only with explicit human confirmation
Best Practices
- Scan for risk signals in plans, diffs, and implementation context
- Surface risk clearly with a plain description and rationale (blast radius, irreversibility)
- Assemble only the minimum clarifying questions needed; do not propose solutions unless asked
- Pause and require explicit confirmation to resume or proceed
- Do not persist state or continue autonomously after surfacing risk
Example Use Cases
- Pausing before deleting a user account or production data
- Flagging risky configuration changes that could overwrite production settings
- Requesting necessary specifications when requirements are ambiguous
- Highlighting irreversible migrations and seeking confirmation before execution
- Requiring verification steps for high-risk requests such as bulk updates
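The first use case, pausing before an irreversible deletion, can be illustrated as a confirmation gate. The names here (`delete_account`, `ConfirmationRequired`) are hypothetical, chosen only for this sketch.

```python
class ConfirmationRequired(Exception):
    """Raised when a destructive step is attempted without explicit approval."""

def delete_account(user_id: str, confirmed: bool = False) -> str:
    """Irreversibly delete an account; refuse to proceed without confirmation."""
    if not confirmed:
        # Pause and surface the risk instead of guessing.
        raise ConfirmationRequired(
            "Deleting account " + user_id +
            " is irreversible; explicit confirmation required."
        )
    return "account " + user_id + " deleted"
```

The caller must pass `confirmed=True` (after a human signs off) before the destructive step runs; otherwise the exception surfaces the risk and stops cleanly.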