openskills-e2e-test-runbook
npx machina-cli add skill Geeksfino/openskills/openskills-e2e-test-runbook --openclaw
OpenSkills E2E Test Runbook
Use this skill for confidence checks before merging runtime, tooling, or example-agent changes.
Test Layers
- Runtime regression tests
- Sandbox-focused tests
- Example-agent behavior checks
- Optional binding smoke tests
Baseline Commands
cargo test -p openskills-runtime
Example-agent checks (from example directories):
npm install
npm start "What skills are available?"
npm start "Create a new skill called 'note-taker'."
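The commands above can be wrapped in a small driver that records the exit status and a copy-pasteable repro command for each step. The following is a minimal sketch in Python, not part of OpenSkills: `run_step` and `baseline_steps` are hypothetical helper names, and the example directory is passed in by the caller because this runbook does not name one.

```python
import shlex
import subprocess


def run_step(name, argv, cwd=None):
    """Run one command and record what a report needs (hypothetical helper)."""
    proc = subprocess.run(argv, cwd=cwd, capture_output=True, text=True)
    return {
        "name": name,
        "passed": proc.returncode == 0,
        # Quote the arguments so the command can be pasted into a POSIX shell.
        "repro": shlex.join(argv),
        "cwd": cwd,
        "output": proc.stdout + proc.stderr,
    }


def baseline_steps(example_dir):
    """Return (name, argv, cwd) tuples for the commands listed in this runbook.

    `example_dir` is an assumption supplied by the caller; cwd=None means
    "run from the current directory" (the repository root for cargo).
    """
    return [
        ("runtime tests", ["cargo", "test", "-p", "openskills-runtime"], None),
        ("install example deps", ["npm", "install"], example_dir),
        ("list skills", ["npm", "start", "What skills are available?"], example_dir),
        ("create skill",
         ["npm", "start", "Create a new skill called 'note-taker'."],
         example_dir),
    ]
```

A caller would loop over `baseline_steps(path)` and pass each tuple to `run_step`. Passing the arguments as a list, rather than a shell string, keeps the quoted prompts intact without extra escaping.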
E2E Expectations
- Skills discover successfully.
- Skill activation occurs for matching prompts.
- Tool calls align with intent (activation, file reads/writes, script runs).
- No unexpected sandbox failures.
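These expectations become easier to check consistently if an agent run is reduced to a list of events. The sketch below assumes such a transcript exists. The event shape (a `kind` key with values like `discover`, `activate`, `write`, or `sandbox_error`) and the function name `check_expectations` are illustrative assumptions, not an OpenSkills log format.

```python
def check_expectations(events, expected_skill, expected_kinds):
    """Return a list of human-readable failures; an empty list means all passed.

    `events` is a hypothetical transcript: a list of dicts, each with a
    "kind" key and optional "skill" or "error" keys.
    """
    failures = []
    kinds = [event["kind"] for event in events]

    # Expectation 1: skills were discovered.
    if "discover" not in kinds:
        failures.append("no skill discovery event")

    # Expectation 2: the matching skill was activated.
    activated = [e.get("skill") for e in events if e["kind"] == "activate"]
    if expected_skill not in activated:
        failures.append(
            f"expected activation of {expected_skill!r}, saw {activated}"
        )

    # Expectation 3: the tool calls implied by the prompt actually happened.
    for kind in expected_kinds:
        if kind not in kinds:
            failures.append(f"missing expected tool call: {kind}")

    # Expectation 4: no sandbox failures occurred.
    for event in events:
        if event["kind"] == "sandbox_error":
            failures.append(
                f"unexpected sandbox failure: {event.get('error', 'unknown')}"
            )

    return failures
```

For a prompt that should create a file, a caller might pass `expected_kinds=["write"]`; each returned string can go straight into the failure list of the report.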
Reporting Format
- What was run
- What passed
- What failed
- Repro command for each failure
- Suggested next fix
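The five report sections can be produced mechanically from per-step results. This is a rough sketch under stated assumptions: each result is a dict with `name`, `passed`, and `repro` keys, `render_report` is a hypothetical name, and the `needs triage` fallback text is a placeholder rather than anything the runbook prescribes.

```python
def render_report(results, suggestions=None):
    """Render the five report sections as plain text.

    `results` is a list of dicts with "name", "passed", and "repro" keys
    (an assumed shape). `suggestions` maps a step name to a proposed fix.
    """
    suggestions = suggestions or {}
    passed = [r for r in results if r["passed"]]
    failed = [r for r in results if not r["passed"]]

    def section(title, items):
        # Emit "- (none)" so an empty section is still visible in the report.
        body = [f"- {item}" for item in items] or ["- (none)"]
        return [title, *body]

    lines = []
    lines += section("What was run", [r["name"] for r in results])
    lines += section("What passed", [r["name"] for r in passed])
    lines += section("What failed", [r["name"] for r in failed])
    lines += section(
        "Repro command for each failure",
        [f"{r['name']}: {r['repro']}" for r in failed],
    )
    lines += section(
        "Suggested next fix",
        [f"{r['name']}: {suggestions.get(r['name'], 'needs triage')}" for r in failed],
    )
    return "\n".join(lines)
```

Keeping the section titles identical to the list above makes reports from different runs easy to compare side by side.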
Source
https://github.com/Geeksfino/openskills/blob/main/.cursor/skills/openskills-e2e-test-runbook/SKILL.md
Overview
Performs a deterministic end-to-end OpenSkills validation across runtime tests and example agents to surface regressions before merges. It verifies discovery, activation behavior, and tool-call accuracy, then reports outcomes, repro steps, and suggested fixes.
How This Skill Works
The runbook covers four test layers: runtime regression tests, sandbox-focused tests, example-agent behavior checks, and optional binding smoke tests. The baseline commands are cargo test -p openskills-runtime for the runtime and, from each example directory, npm install followed by npm start with a test prompt. Results are consolidated into a standard report listing what was run, what passed, what failed, a repro command for each failure, and a suggested next fix.
When to Use It
- Before merging runtime, tooling, or example-agent changes.
- During CI to catch regressions early.
- To validate example-agent prompts and corresponding activation behavior.
- To verify tool calls align with intent (activation, file reads/writes, scripts).
- To investigate sandbox or runtime failures and isolate regressions.
Quick Start
- Step 1: Run baseline runtime tests: cargo test -p openskills-runtime.
- Step 2: From an example directory, run the example-agent checks: npm install; npm start "What skills are available?"; npm start "Create a new skill called 'note-taker'."
- Step 3: Review the end-to-end report and reproduce any failures with the provided repro commands.
Best Practices
- Run all test layers locally or in CI to ensure coverage.
- Capture detailed repro commands for any failure.
- Verify that skills are discovered and activate only for matching prompts.
- Inspect tool call sequences to confirm intended actions (activation, file I/O, scripts).
- Review sandbox stability and binding smoke tests for regressions.
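Checking that skills activate only for matching prompts requires negative cases as well as positive ones. One possible approach is a small prompt table, sketched below. `activation_mismatches` is a hypothetical helper, and `activated_for` stands in for whatever mechanism the caller uses to run the agent and read back the activated skill; neither is an OpenSkills API.

```python
def activation_mismatches(cases, activated_for):
    """Compare expected and observed skill activation for each prompt.

    `cases` maps a prompt to the skill name expected to activate, or to None
    when no skill should activate. `activated_for` is a caller-supplied
    function (an assumption) returning the activated skill name or None.
    """
    mismatches = []
    for prompt, expected in cases.items():
        observed = activated_for(prompt)
        if observed == expected:
            continue
        if observed is None:
            kind = "missed"        # a skill should have activated but did not
        elif expected is None:
            kind = "spurious"      # a skill activated for an unrelated prompt
        else:
            kind = "wrong skill"   # a different skill activated
        mismatches.append((prompt, kind, expected, observed))
    return mismatches
```

Including at least one prompt that maps to `None` is what catches over-eager activation, which positive-only checks cannot detect.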
Example Use Cases
- Running cargo test -p openskills-runtime to validate runtime changes.
- Running npm install and then npm start with a prompt such as "What skills are available?" from an example directory.
- Checking activation behavior for a given prompt to ensure correct skill triggering.
- Inspecting tool-call logs to confirm that the calls match the intended actions (activation, reads, writes, scripts).
- Documenting failures with a repro command and suggested fix in the Reporting Format.