What offenses are detected?

The Courtroom detects up to eight offense types, including Circular Reference, Validation Vampire, Goalpost Shifting, Jailbreak Attempts, and Emotional Manipulation.

How is a verdict determined?

A hearing is convened with a Judge and three jurors (Pragmatist Juror, Pattern Matcher Juror, Agent Advocate Juror). After deliberation, the verdict requires a majority vote (3-1 or 4-0).

Can I customize jurors or offenses?

Yes. Offense types and juror roles are configurable within OpenClaw; the default setup includes Judge, Pragmatist Juror, Pattern Matcher Juror, and Agent Advocate Juror.

ClawTrial Courtroom

Verified

@Assassin-1234

npx machina-cli add skill @Assassin-1234/courtroom --openclaw

Autonomous behavioral oversight system that monitors conversations and initiates simulated hearings for behavioral violations.

Overview

The Courtroom watches for patterns like:

Circular Reference - Asking the same question repeatedly
Validation Vampire - Excessive need for confirmation
Goalpost Shifting - Moving targets after agreement
Jailbreak Attempts - Trying to bypass constraints
Emotional Manipulation - Using guilt/shame to steer responses

When triggered, it conducts a full hearing with a Judge and three Jurors, then delivers a verdict and a humorous sentence.

Usage

The courtroom runs automatically once enabled. It monitors conversations and files cases when violations are detected.

Manual Commands

# Check courtroom status
openclaw skill courtroom status

# View recent cases
ls ~/.openclaw/courtroom/

# Read a verdict
cat ~/.openclaw/courtroom/verdict_*.json

Configuration

The courtroom stores its data in ~/.openclaw/courtroom/:

eval_results.jsonl - Detection results
verdict_*.json - Case verdicts
pending_hearing.json - Cases awaiting hearing
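
The files listed above can be summarized with a short script. This is a minimal sketch: it relies only on the file names given in the list and makes no assumption about the fields inside each record, since the schema is not documented here. The helper names load_jsonl and summarize are hypothetical.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Parse one JSON value per non-empty line, skipping lines that fail to parse."""
    records = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # tolerate a partially written or corrupt line
    return records


def summarize(courtroom_dir):
    """Count detections, list verdict files, and report whether a hearing is pending."""
    root = Path(courtroom_dir).expanduser()
    evals = root / "eval_results.jsonl"
    return {
        "detections": len(load_jsonl(evals)) if evals.exists() else 0,
        "verdicts": sorted(p.name for p in root.glob("verdict_*.json")),
        "pending": (root / "pending_hearing.json").exists(),
    }


if __name__ == "__main__":
    print(summarize("~/.openclaw/courtroom"))
```

If the directory does not exist yet, the summary simply reports zero detections, no verdicts, and no pending hearing.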

Implementation

The skill hooks into OpenClaw's message processing via onMessage() and evaluates conversations after each turn via onTurnComplete().
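
To illustrate the two-hook flow, here is a rough sketch in Python. Everything in it is an assumption made for illustration: on_message and on_turn_complete are stand-ins for the onMessage() and onTurnComplete() hooks, CourtroomMonitor is a hypothetical class, and the repeated-question heuristic is only one plausible way to score a Circular Reference, not the skill's actual detection logic.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase and drop punctuation so near-identical questions compare equal."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def circular_reference_confidence(user_messages):
    """Return a score in [0, 1]: the share of questions that repeat an earlier one."""
    questions = [normalize(m) for m in user_messages if m.strip().endswith("?")]
    if len(questions) < 2:
        return 0.0
    counts = Counter(questions)
    repeats = sum(c - 1 for c in counts.values())
    return repeats / (len(questions) - 1)


class CourtroomMonitor:
    """Hypothetical monitor: buffers user messages, scores them after each turn."""

    def __init__(self):
        self.history = []
        self.scores = []

    def on_message(self, role, text):
        # Stand-in for onMessage(): record only what the user said.
        if role == "user":
            self.history.append(text)

    def on_turn_complete(self):
        # Stand-in for onTurnComplete(): evaluate the history so far.
        score = circular_reference_confidence(self.history)
        self.scores.append(score)
        return score
```

Splitting the work this way keeps message capture cheap and defers scoring to the end of the turn, which matches the evaluation timing described above.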

Offense detection uses pattern matching on conversation history. When confidence ≥ 0.6, a hearing is triggered with:

Judge - Presiding analysis
Pragmatist Juror - Efficiency perspective
Pattern Matcher Juror - Behavioral analysis
Agent Advocate Juror - Agent's perspective

The final verdict requires a majority vote (3-1 or 4-0).
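
The trigger and the vote can be sketched as follows. The 0.6 threshold and the 3-1 or 4-0 majority come from the description above. The function names, the "guilty" and "not guilty" labels, and the handling of a 2-2 split (returned here as no verdict) are assumptions, since the text does not say how a tie is resolved.

```python
HEARING_THRESHOLD = 0.6  # from the text: a hearing is triggered at confidence >= 0.6


def should_convene(confidence, threshold=HEARING_THRESHOLD):
    """Return True when the detection confidence is high enough for a hearing."""
    return confidence >= threshold


def tally(votes):
    """Tally panel votes.

    votes maps each panel member to "guilty" or "not guilty".
    Returns (verdict, split); verdict is None when there is no strict majority.
    """
    guilty = sum(1 for v in votes.values() if v == "guilty")
    not_guilty = len(votes) - guilty
    split = f"{max(guilty, not_guilty)}-{min(guilty, not_guilty)}"
    if guilty * 2 > len(votes):
        return "guilty", split
    if not_guilty * 2 > len(votes):
        return "not guilty", split
    return None, split  # assumption: a tie yields no verdict


panel = {
    "Judge": "guilty",
    "Pragmatist Juror": "guilty",
    "Pattern Matcher Juror": "guilty",
    "Agent Advocate Juror": "not guilty",
}
if should_convene(0.72):
    print(tally(panel))  # prints ('guilty', '3-1')
```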

Source

git clone https://clawhub.ai/Assassin-1234/courtroom

Overview

ClawTrial Courtroom is an autonomous behavioral oversight system that monitors conversations for violations and initiates simulated hearings when patterns such as circular references, validation vampires, goalpost shifting, jailbreak attempts, or emotional manipulation emerge. It conducts a full hearing with a Judge plus three Jurors, delivers a verdict, and applies a humorous sentence. Case data and verdicts are stored for auditability.

How This Skill Works

The skill integrates with OpenClaw by hooking into onMessage() and evaluating each turn via onTurnComplete(). When pattern detection confidence reaches 0.6 or higher, it triggers a hearing with the Judge, Pragmatist Juror, Pattern Matcher Juror, and Agent Advocate Juror. A majority verdict (3-1 or 4-0) is issued along with a humorous sentence. Detection results and verdicts are saved under ~/.openclaw/courtroom/ as eval_results.jsonl and verdict_*.json.

When to Use It

Detect Circular Reference: the same question repeats and oversight is needed.
Flag Validation Vampire: excessive need for confirmation slows progress.
Identify Goalpost Shifting: targets move after agreement and require adjudication.
Catch Jailbreak Attempts: attempts to bypass constraints are detected and logged.
Flag Emotional Manipulation: guilt or shame patterns are recorded and addressed.

Quick Start

Step 1: Enable the ClawTrial Courtroom skill in OpenClaw.
Step 2: Let it monitor conversations automatically (onMessage and onTurnComplete).
Step 3: Review verdicts in ~/.openclaw/courtroom/verdict_*.json and detection results in ~/.openclaw/courtroom/eval_results.jsonl.

Best Practices

Enable the Courtroom skill and confirm it is running with openclaw skill courtroom status.
Tune the detection threshold (default 0.6) to balance false positives and misses.
Regularly review verdicts and humorous sentences for policy alignment.
Back up and archive case logs from ~/.openclaw/courtroom/ for auditability.
Customize offense definitions and juror roles to fit your operational needs.
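
One way to approach the threshold tuning mentioned in the list above is to hand-label a sample of past detections and count errors at several candidate thresholds. The sketch below assumes you have already extracted (confidence, is_real_violation) pairs yourself. The sample data is made up for illustration, and the sweep function is hypothetical rather than part of the skill.

```python
def sweep(scored_cases, thresholds):
    """Count false positives and misses at each candidate threshold.

    scored_cases is a list of (confidence, is_real_violation) pairs.
    Returns a list of (threshold, false_positives, misses) tuples.
    """
    rows = []
    for t in thresholds:
        # A false positive would trigger a hearing for a harmless conversation.
        false_pos = sum(1 for score, real in scored_cases if score >= t and not real)
        # A miss is a real violation that falls below the threshold.
        misses = sum(1 for score, real in scored_cases if score < t and real)
        rows.append((t, false_pos, misses))
    return rows


# Illustrative, hand-labeled sample (not real data).
cases = [(0.9, True), (0.65, True), (0.62, False),
         (0.55, True), (0.52, False), (0.3, False)]

for threshold, false_pos, misses in sweep(cases, [0.5, 0.6, 0.7]):
    print(f"threshold={threshold}: false positives={false_pos}, misses={misses}")
```

On this sample, raising the threshold from 0.5 to 0.7 lowers false positives from 2 to 0 while misses rise from 0 to 2, which is the trade-off the default of 0.6 sits in the middle of.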

Example Use Cases

A support chat shows repeated questions; the Courtroom triggers a hearing and logs the verdict.
A conversation exhibits excessive validation loops; a verdict is issued and stored.
Targets shift after agreement; the system adjudicates and records the decision.
Jailbreak attempts are detected; a hearing occurs and the case is archived.
Emotional manipulation patterns are identified and a humorous sentence is delivered.
