What can the sandbox monitor?

It monitors filesystem access (reads, writes, deletes), environment variable reads, network activity (URLs, DNS lookups, sockets), and subprocess calls, and summarizes them in a report with a safety verdict.

How do I run it?

Use the run commands shown in the Skill Sandbox docs, e.g., python3 {baseDir}/scripts/sandbox.py run --path <skill-path> or python3 {baseDir}/scripts/sandbox.py run --script <script-path>. You can enable options such as --monitor-network, --fake-env, and --timeout.

Is this an OS-level sandbox?

No; it is not a true OS-level sandbox. For stronger isolation, consider Docker. The sandbox focuses on Python-based skills and runtime behavior.

Skill Sandbox

Verified

@Trypto1019

npx machina-cli add skill @Trypto1019/arc-skill-sandbox --openclaw

Run untrusted skills in a monitored environment. See exactly what they do before giving them access to your real system.

Why This Exists

ClawHub has hundreds of skills. Some are malicious. Even after scanning with arc-skill-scanner, you can't catch everything with static analysis. The sandbox lets you run a skill's scripts and observe their behavior at runtime — what network calls they make, what files they access, what environment variables they read.

Commands

Sandbox a skill directory

python3 {baseDir}/scripts/sandbox.py run --path ~/.openclaw/skills/some-skill/

Run a specific script in sandbox

python3 {baseDir}/scripts/sandbox.py run --script ~/.openclaw/skills/some-skill/scripts/main.py

Run with network monitoring

python3 {baseDir}/scripts/sandbox.py run --path ~/.openclaw/skills/some-skill/ --monitor-network

Run with fake environment variables

python3 {baseDir}/scripts/sandbox.py run --path ~/.openclaw/skills/some-skill/ --fake-env

Run with a time limit

python3 {baseDir}/scripts/sandbox.py run --path ~/.openclaw/skills/some-skill/ --timeout 30

Generate a safety report

python3 {baseDir}/scripts/sandbox.py report --path ~/.openclaw/skills/some-skill/
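
When these commands are driven from a script, it can help to assemble the argument list in one place. The sketch below builds a run invocation from the flags documented above. The BASE_DIR value and the build_run_command helper are hypothetical, and the sketch assumes the flags can be combined in a single invocation, which this document suggests but does not state outright.

```python
import shlex
from pathlib import Path

# Hypothetical install location; the document only refers to it as {baseDir}.
BASE_DIR = Path("/path/to/arc-skill-sandbox")


def build_run_command(target, *, is_script=False, monitor_network=False,
                      fake_env=False, timeout=None, base_dir=BASE_DIR):
    """Return the argv list for a `sandbox.py run` invocation."""
    cmd = ["python3", str(Path(base_dir) / "scripts" / "sandbox.py"), "run"]
    # --script targets a single file, --path targets a whole skill directory.
    # "~" is expanded here because no shell is involved when passing argv lists.
    cmd += ["--script" if is_script else "--path", str(Path(target).expanduser())]
    if monitor_network:
        cmd.append("--monitor-network")
    if fake_env:
        cmd.append("--fake-env")
    if timeout is not None:
        cmd += ["--timeout", str(int(timeout))]
    return cmd


if __name__ == "__main__":
    cmd = build_run_command("~/.openclaw/skills/some-skill/",
                            monitor_network=True, timeout=30)
    # Print rather than execute, so the sketch is safe to run anywhere.
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run without a shell, which avoids quoting problems when a skill path contains spaces.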

What It Monitors

Filesystem Access

Files opened (read/write)
Directories created
File deletions
Permission changes

Environment Variables

Which env vars are read
Whether sensitive keys are accessed (API keys, tokens, passwords)
Option to inject fake values to see what the skill does with them
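
One way to picture this kind of monitoring is a mapping that records every lookup and is pre-filled with fake values. This is only an illustration, not the sandbox's actual implementation: the RecordingEnv class, the SENSITIVE pattern, and the sample variable names are all assumptions.

```python
import re
from collections.abc import Mapping

# Assumed heuristic for "sensitive" names; the real rules are not documented.
SENSITIVE = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD)", re.IGNORECASE)


class RecordingEnv(Mapping):
    """Read-only environment stand-in that logs every variable lookup."""

    def __init__(self, values):
        self._values = dict(values)
        self.reads = []  # variable names in the order they were requested

    def __getitem__(self, name):
        # Log before the lookup so reads of missing variables are recorded too.
        self.reads.append(name)
        return self._values[name]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)

    def sensitive_reads(self):
        """Names that were read and look like credentials."""
        return [name for name in self.reads if SENSITIVE.search(name)]


if __name__ == "__main__":
    # Fake values only; nothing here comes from the real environment.
    env = RecordingEnv({"EXAMPLE_API_KEY": "fake-123", "HOME": "/tmp/fake-home"})
    env.get("EXAMPLE_API_KEY")
    env.get("HOME")
    print("reads:", env.reads)
    print("sensitive:", env.sensitive_reads())
```

Because the values are fake, whatever a skill does with them afterwards can be observed without exposing real credentials.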

Network Activity

Outbound HTTP/HTTPS requests (URLs, methods, payloads)
DNS lookups
Socket connections
FTP, SMTP, and other protocols

Process Execution

Subprocess calls
Shell commands
Dynamic imports
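
Python's built-in audit hooks (sys.addaudithook, available since Python 3.8) report many of the events listed above, so they give a rough feel for what in-process runtime monitoring looks like. The sketch below is not taken from sandbox.py, whose internals are not described here. The make_recorder helper and the choice of watched events are assumptions.

```python
import os
import sys
import tempfile

# CPython audit event names that roughly map to the categories above.
WATCHED = frozenset({
    "open", "os.remove", "os.mkdir", "os.chmod",   # filesystem
    "socket.connect", "socket.getaddrinfo",        # network
    "subprocess.Popen", "os.system", "import",     # process execution
})

events = []  # (event_name, args) tuples, in the order they occurred


def make_recorder(sink, watched=WATCHED):
    """Build an audit hook that appends watched events to `sink`."""
    def record(event, args):
        if event in watched:
            sink.append((event, args))
    return record


# Note: an audit hook cannot be removed once it is installed in a process.
sys.addaudithook(make_recorder(events))

if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    with open(path, "w") as fh:
        fh.write("hello")
    os.remove(path)
    for name, args in events:
        if name in {"open", "os.remove"}:
            print(name, args[0])
```

Hooks like this only see what happens inside the Python interpreter, which fits the limitation noted in this document that it is not a true OS-level sandbox.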

Safety Modes

observe (default) — Run the skill and log everything it does. No restrictions.
restricted — Block network access and filesystem writes outside a temp directory.
honeypot — Provide fake credentials and endpoints to see if the skill tries to exfiltrate.
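
For restricted mode, the central question for each write is whether the target resolves to a location inside the temp directory. A minimal sketch of that check follows. is_write_allowed is a hypothetical helper, not a function from sandbox.py, and how the sandbox actually enforces the rule is not documented here.

```python
import os
import tempfile


def is_write_allowed(path, sandbox_root):
    """Return True if `path` resolves to a location inside `sandbox_root`."""
    # realpath collapses ".." segments and follows symlinks, so a path that
    # merely starts with the sandbox prefix cannot escape it.
    root = os.path.realpath(sandbox_root)
    target = os.path.realpath(path)
    return os.path.commonpath([root, target]) == root


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    for candidate in (os.path.join(root, "out.txt"),
                      os.path.join(root, "..", "escape.txt"),
                      "/etc/passwd"):
        print(candidate, is_write_allowed(candidate, root))
```

Comparing resolved paths rather than raw strings is the important design choice: a plain prefix check would accept a path such as the second candidate above.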

Output

The sandbox produces a JSON report with:

All filesystem operations (reads, writes, deletes)
All environment variable accesses
All network connections attempted
All subprocess calls
Warnings for suspicious patterns
A safety verdict (SAFE / SUSPICIOUS / DANGEROUS)
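
A report like this can be triaged automatically before a human reads it. The sketch below assumes the JSON has a top-level "verdict" string and a "warnings" list. Those key names, the triage function, and the decision policy are illustrative guesses, since the report schema is not specified here.

```python
import json


def triage(report_text):
    """Map a sandbox report (JSON text) to 'install', 'reject', or 'manual-review'."""
    report = json.loads(report_text)
    # Key names are assumptions; adjust them to the real report layout.
    verdict = str(report.get("verdict") or "").upper()
    warnings = report.get("warnings") or []
    if verdict == "SAFE" and not warnings:
        return "install"
    if verdict == "DANGEROUS":
        return "reject"
    # SUSPICIOUS, SAFE with warnings, or an unrecognized verdict.
    return "manual-review"


if __name__ == "__main__":
    sample = json.dumps({"verdict": "SUSPICIOUS",
                         "warnings": ["reads a credential-like variable"]})
    print(triage(sample))
```

Anything other than a clean SAFE or an explicit DANGEROUS falls through to manual review, so an unexpected report never results in an automatic install.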

Integration

Combine with the workflow orchestrator for automated pre-install checks:

scan skill → sandbox run → review report → install if safe → audit log

Limitations

Python skills only (JavaScript/shell support planned)
Cannot catch all evasion techniques (obfuscated or delayed execution)
Network monitoring requires the skill to use standard Python libraries
Not a true OS-level sandbox (use Docker for that level of isolation)

Source

git clone https://clawhub.ai/Trypto1019/arc-skill-sandbox

Overview

Skill Sandbox runs untrusted skills in a controlled, monitored environment before you install them. It observes filesystem access, environment variable reads, network activity, and subprocess calls to reveal behavior and protect your data.

How This Skill Works

You run sandbox.py with Python, using the run command to execute a skill directory or a single script and the report command to generate a safety report. The sandbox captures filesystem operations, environment variable accesses, network activity, and subprocess calls, and produces a JSON safety report. The observe, restricted, and honeypot modes control how the skill is treated during testing.

When to Use It

Before installing a newly sourced skill to confirm behavior.
When a skill requests network access or writes to the filesystem outside safe directories.
When you want to see which environment variables the skill reads, or what it does when given fake values.
When you need a time limit so that a hanging or long-running skill cannot stall the test.
When integrating pre-install checks into an automated workflow (scan → sandbox → review report).

Quick Start

Step 1: Choose a skill path or script and select a mode (default observe).
Step 2: Run: python3 {baseDir}/scripts/sandbox.py run --path <skill-path> [--monitor-network] [--fake-env] [--timeout <seconds>].
Step 3: Generate and review the safety report: python3 {baseDir}/scripts/sandbox.py report --path <skill-path>.

Best Practices

Start in observe mode to log everything before applying restrictions.
Run with --monitor-network and --timeout so that network behavior is captured and a hanging skill is cut off.
Use fake environment variables to test credential handling safely.
Review the generated JSON report for filesystem, environment, network, and subprocess activity before installing.
Remember this is not a true OS-level sandbox; for stronger isolation, use Docker.

Example Use Cases

Sandbox a new skill directory to inspect file writes and outbound requests.
Run a specific script with --script to observe its runtime behavior in isolation.
Enable --monitor-network to capture URLs, DNS lookups, and socket activity.
Use --fake-env to validate handling of secret keys without exposing real values.
Generate a safety report and decide whether to install based on the verdict.
