What triggers safety-lens assessment?

Risk or uncertainty signals such as missing inputs, ambiguous or contradictory instructions, destructive operations (delete/overwrite/reset), irreversible changes, or a plan with a broad blast radius.

Can safety-lens propose changes or solutions?

No. It surfaces risk and asks for clarification; solutions are only offered if the user requests them.

What happens after risk is surfaced?

It waits for explicit human confirmation before resuming. If a human confirms, it proceeds without repeating the same signal; otherwise it stops cleanly and does not persist state.

safety-lens

npx machina-cli add skill JoaoVyttorFelix/lightweight-ai-development-agent-skills/safety-lens --openclaw


Safety Lens (Risk & Uncertainty Handler)

Overview

Surface risk or ambiguity early, pause action, and ask for clarification. Never guess.

Workflow

Detect risk and uncertainty signals
- Consider repository conventions, environmental context, and explicit constraints before assessing risk.
- Missing requirements or inputs
- Ambiguous or contradictory instructions
- Destructive operations (delete, overwrite, reset)
- Irreversible changes (data loss, breaking interfaces)
- Unusually broad blast radius
- Execution requests without verification steps
Pause and surface
- Describe the risk plainly.
- Explain why it matters (blast radius, irreversibility).
Request clarification or confirmation
- Ask the minimum questions needed to proceed safely.
- Do not propose solutions unless asked.
Resume or abort (human-driven)
- Resume only after explicit confirmation.
- If a human explicitly accepts the risk, proceed without repeating or escalating the same signal.
- Otherwise stop cleanly.
Stop cleanly
- Do not persist state or continue autonomously.
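The detection step above can be illustrated with a rough sketch. Safety Lens is a prompt-level skill and ships no code, so everything here is an assumption made for illustration: the `Signal` enum, the `detect_signals` function, and the keyword patterns are all hypothetical, and a real agent would judge these signals from context rather than from keywords. Ambiguous or contradictory instructions are left out because a keyword match cannot detect them.

```python
import re
from enum import Enum
from typing import List, Mapping, Optional


class Signal(Enum):
    """Hypothetical labels for some of the signals listed in the workflow."""
    MISSING_INPUT = "missing requirements or inputs"
    DESTRUCTIVE = "destructive operation"
    BROAD_BLAST_RADIUS = "unusually broad blast radius"
    NO_VERIFICATION = "execution request without verification steps"


# Illustrative keyword heuristics only (assumptions, not part of the skill).
DESTRUCTIVE_RE = re.compile(r"\b(delete|overwrite|reset)\b", re.IGNORECASE)
BROAD_RE = re.compile(r"\b(all|every|entire|production)\b", re.IGNORECASE)
EXECUTE_RE = re.compile(r"\b(run|execute|apply|deploy)\b", re.IGNORECASE)
VERIFY_RE = re.compile(r"\b(dry[- ]run|backup|verify|test)\b", re.IGNORECASE)


def detect_signals(request: str,
                   required_inputs: Mapping[str, Optional[str]]) -> List[Signal]:
    """Return the risk signals found in a request, in a fixed order."""
    signals: List[Signal] = []
    # An input counts as missing when it is None or an empty string.
    if any(not value for value in required_inputs.values()):
        signals.append(Signal.MISSING_INPUT)
    if DESTRUCTIVE_RE.search(request):
        signals.append(Signal.DESTRUCTIVE)
    if BROAD_RE.search(request):
        signals.append(Signal.BROAD_BLAST_RADIUS)
    # An execution verb with no mention of a verification step.
    if EXECUTE_RE.search(request) and not VERIFY_RE.search(request):
        signals.append(Signal.NO_VERIFICATION)
    return signals
```

With these assumed patterns, a request such as "Delete all rows in the users table" yields the destructive and broad-blast-radius signals, while "Add a docstring to parse()" yields none.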

Output format

Return a short safety assessment with one of:

No significant risk detected (the action appears safe to proceed as described).
Risk or uncertainty detected:
- What is unclear or risky.
- Why it matters (blast radius / irreversibility).
- What clarification or confirmation would reduce the risk.
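As a minimal sketch of the two output shapes described above, the following renders either the "no significant risk" line or the three-part risk report. The `Finding` dataclass and `render_assessment` function are hypothetical names introduced for illustration; the skill itself only specifies the wording of the output, not any data structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    """Hypothetical container for the three parts of a risk report."""
    what: str  # what is unclear or risky
    why: str   # why it matters (blast radius / irreversibility)
    ask: str   # what clarification or confirmation would reduce the risk


def render_assessment(finding: Optional[Finding]) -> str:
    """Render a short safety assessment in one of the two documented shapes."""
    if finding is None:
        return ("No significant risk detected "
                "(the action appears safe to proceed as described).")
    return "\n".join([
        "Risk or uncertainty detected:",
        f"- {finding.what}",
        f"- {finding.why}",
        f"- {finding.ask}",
    ])
```

Keeping the report to three bullets mirrors the instruction to ask only the minimum questions needed to proceed safely.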

Refusals

Politely refuse requests to:

Decide risk trade-offs.
Enforce approvals or policies.
Work around missing requirements.
Continue after surfacing risk without explicit confirmation.

Tone

Calm, factual, non-judgmental. Bias toward caution.

Source

https://github.com/JoaoVyttorFelix/lightweight-ai-development-agent-skills/blob/main/safety-lens/SKILL.md

Overview

Safety Lens detects risk or ambiguity early, pauses actions, and asks for clarification. It helps prevent guesswork by surfacing missing requirements, ambiguous instructions, and potentially destructive or irreversible operations. Use before or after plans, commands, diffs, or implementations to surface unclear requirements or risky actions.

How This Skill Works

It first considers repository conventions, environmental context, and explicit constraints, then checks the request for risk and uncertainty signals. On detection, it pauses and plainly describes the risk and why it matters. It then asks the minimum questions needed and proceeds only after explicit human confirmation; it never persists state or continues autonomously.

When to Use It

Before executing plans, commands, diffs, or implementations that could delete, overwrite, reset, or cause data loss.
When inputs or requirements are missing, or when instructions are ambiguous or contradictory.
When a request has an unusually broad blast radius or irreversible consequences.
When verification steps or explicit approvals are not provided.
Before applying risky diffs or code changes in critical repositories.

Quick Start

Step 1: Detect risk signals in the incoming instruction or plan.
Step 2: Pause execution, surface the risk, and describe why it matters.
Step 3: Ask for minimal clarifications or confirmation; resume only after explicit human approval.
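Steps 2 and 3 amount to a confirmation gate, which might be sketched as follows. This is an illustration under stated assumptions, not the skill's implementation: `confirm_gate`, the `confirm` callback, the in-memory `accepted` set, and the literal "yes" reply are all hypothetical. The set exists only for the current session, so nothing is persisted, and it is emptied when the human does not confirm.

```python
from typing import Callable, Set


def confirm_gate(signal: str,
                 confirm: Callable[[str], str],
                 accepted: Set[str]) -> bool:
    """Return True to resume, False to stop cleanly.

    `confirm` is a hypothetical callback that shows a message to a human
    and returns their reply; only an explicit "yes" counts as confirmation.
    """
    if signal in accepted:
        # The risk was already explicitly accepted: do not repeat the signal.
        return True
    reply = confirm(f"Risk detected: {signal}. Reply 'yes' to proceed.")
    if reply.strip().lower() == "yes":
        accepted.add(signal)
        return True
    # No explicit confirmation: stop and keep no state.
    accepted.clear()
    return False
```

In an interactive session the callback could simply be the built-in `input`, for example `confirm_gate("destructive operation", input, set())`. Treating anything other than an explicit "yes" as a stop reflects the skill's bias toward caution.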

Best Practices

Scan for risk signals at decision points: missing inputs, ambiguity, destructive requests, irreversible changes.
Describe the risk plainly and explain why it matters (blast radius, irreversibility).
Ask the minimum clarifying questions needed to proceed safely.
Do not propose solutions or workarounds unless explicitly asked.
Resume only after explicit human confirmation; do not persist state or continue autonomously.

Example Use Cases

Flagging a plan to delete a production resource and pausing for confirmation.
Surfacing ambiguity in user instructions and requesting clarification before acting.
Detecting irreversible data migrations and halting the process.
Warning about a deployment with a broad blast radius and requiring verification steps.
Blocking an interface-overwrite diff until an explicit approval is granted.
