What is simulation-validator?

A three-stage validation protocol for materials simulations that runs preflight checks, monitors runtime behavior, and validates post-run results.

What outputs are produced?

Each script emits structured JSON fields such as report.*, alerts, residual_stats, dt_stats, checks, and failed_checks to guide decisions.

How do I integrate it into a pipeline?

Run the preflight, then the runtime monitor during execution, and finally the result validator after completion; if needed, run the failure diagnoser to identify fixes and iterate.

simulation-validator

npx machina-cli add skill HeshamFS/materials-simulation-skills/simulation-validator --openclaw

Simulation Validator

Goal

Provide a three-stage validation protocol: pre-flight checks, runtime monitoring, and post-flight validation for materials simulations.

Requirements

Python 3.8+
No external dependencies (uses Python standard library only)
Works on Linux, macOS, and Windows

Inputs to Gather

Before running validation scripts, collect from the user:

Input	Description	Example
Config file	Simulation configuration (JSON/YAML)	`simulation.json`
Log file	Runtime output log	`simulation.log`
Metrics file	Post-run metrics (JSON)	`results.json`
Required params	Parameters that must exist	`dt,dx,kappa`
Valid ranges	Parameter bounds	`dt:1e-6:1e-2`
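As a rough illustration of how the "Required params" and "Valid ranges" inputs fit together, the sketch below parses a `name:min:max` spec and checks a config dictionary against it. The helper names `parse_ranges` and `check_config` are hypothetical, and the logic is only an assumption about what such a check could look like; it is not the code in `preflight_checker.py`.

```python
from typing import Dict, List, Tuple


def parse_ranges(spec: str) -> Dict[str, Tuple[float, float]]:
    """Parse 'name:min:max,name:min:max' into {name: (min, max)}."""
    ranges = {}
    for item in filter(None, (part.strip() for part in spec.split(","))):
        name, low, high = item.split(":")
        ranges[name] = (float(low), float(high))
    return ranges


def check_config(
    config: dict,
    required: List[str],
    ranges: Dict[str, Tuple[float, float]],
) -> List[str]:
    """Return a list of problems; an empty list means the inputs look sane."""
    problems = []
    for name in required:
        if name not in config:
            problems.append(f"missing required parameter: {name}")
    for name, (low, high) in ranges.items():
        if name not in config:
            continue  # absence is only a problem if the name is also required
        value = config[name]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"non-numeric value for {name}: {value!r}")
        elif not low <= value <= high:
            problems.append(f"{name}={value} out of range [{low}, {high}]")
    return problems


# Illustrative config: kappa is missing and dx is above its upper bound.
config = {"dt": 1e-3, "dx": 0.5}
problems = check_config(
    config,
    required=["dt", "dx", "kappa"],
    ranges=parse_ranges("dt:1e-6:1e-2,dx:1e-4:1e-1"),
)
for problem in problems:
    print(problem)
```

Collecting every problem before reporting, rather than stopping at the first, lets the user fix the config in one pass.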

Decision Guidance

When to Run Each Stage

Is simulation about to start?
├── YES → Run Stage 1: preflight_checker.py
│         └── BLOCK status? → Fix issues, do NOT run simulation
│         └── WARN status? → Review warnings, document if accepted
│         └── PASS status? → Proceed to run simulation
│
Is simulation running?
├── YES → Run Stage 2: runtime_monitor.py (periodically)
│         └── Alerts? → Consider stopping, check parameters
│
Has simulation finished?
├── YES → Run Stage 3: result_validator.py
│         └── Failed checks? → Do NOT use results
│                            → Run failure_diagnoser.py
│         └── All passed? → Results are valid

Choosing Validation Thresholds

Metric	Conservative	Standard	Relaxed
Mass tolerance	1e-6	1e-3	1e-2
Residual growth	2x	10x	100x
dt reduction	10x	100x	1000x
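One way to keep these presets consistent across runs is to store them once and build the command lines from them. The sketch below assumes that "Residual growth" corresponds to `--residual-growth`, "dt reduction" to `--dt-drop`, and "Mass tolerance" to `--mass-tol`, which matches the Standard values used in the commands elsewhere in this document. The `PRESETS` dictionary and both helper functions are hypothetical.

```python
from typing import List

# Values copied from the threshold table; the key names are illustrative.
PRESETS = {
    "conservative": {"mass_tol": 1e-6, "residual_growth": 2.0, "dt_drop": 10.0},
    "standard": {"mass_tol": 1e-3, "residual_growth": 10.0, "dt_drop": 100.0},
    "relaxed": {"mass_tol": 1e-2, "residual_growth": 100.0, "dt_drop": 1000.0},
}


def monitor_args(preset: str, log: str = "simulation.log") -> List[str]:
    """Build a runtime_monitor.py command line for the chosen preset."""
    t = PRESETS[preset]
    return [
        "python3", "scripts/runtime_monitor.py",
        "--log", log,
        "--residual-growth", str(t["residual_growth"]),
        "--dt-drop", str(t["dt_drop"]),
        "--json",
    ]


def validator_args(preset: str, metrics: str = "results.json") -> List[str]:
    """Build a result_validator.py command line for the chosen preset.

    Bounds (--bound-min / --bound-max) are problem-specific and left out here.
    """
    t = PRESETS[preset]
    return [
        "python3", "scripts/result_validator.py",
        "--metrics", metrics,
        "--mass-tol", str(t["mass_tol"]),
        "--json",
    ]


print(" ".join(monitor_args("standard")))
print(" ".join(validator_args("conservative")))
```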

Script Outputs (JSON Fields)

Script	Output Fields
`scripts/preflight_checker.py`	`report.status`, `report.blockers`, `report.warnings`
`scripts/runtime_monitor.py`	`alerts`, `residual_stats`, `dt_stats`
`scripts/result_validator.py`	`checks`, `confidence_score`, `failed_checks`
`scripts/failure_diagnoser.py`	`probable_causes`, `recommended_fixes`
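The field names above suggest how a caller might turn each script's JSON into a decision. The following is a minimal sketch, assuming `report` is a nested object with a `status` key and that `alerts` and `failed_checks` are lists; the exact JSON layout is not specified here, and the `decide` function and its return values are hypothetical.

```python
import json


def decide(stage: str, payload: dict) -> str:
    """Map one stage's parsed JSON to 'proceed', 'review', or 'stop'."""
    if stage == "preflight":
        status = payload.get("report", {}).get("status")
        # Fail closed: anything other than PASS or WARN is treated as a stop.
        return {"PASS": "proceed", "WARN": "review"}.get(status, "stop")
    if stage == "runtime":
        return "review" if payload.get("alerts") else "proceed"
    if stage == "result":
        return "stop" if payload.get("failed_checks") else "proceed"
    raise ValueError(f"unknown stage: {stage}")


# Illustrative payload; a real one would come from the script's --json output.
sample = json.loads(
    '{"report": {"status": "BLOCK", "blockers": ["kappa missing"], "warnings": []}}'
)
print(decide("preflight", sample))
```

Treating a missing or unrecognized status as a stop keeps a malformed report from being mistaken for a pass.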

Three-Stage Validation Protocol

Stage 1: Pre-flight (Before Simulation)

Run scripts/preflight_checker.py --config simulation.json
BLOCK status: Stop immediately, fix all blocker issues
WARN status: Review warnings, document accepted risks
PASS status: Proceed to simulation

python3 scripts/preflight_checker.py \
    --config simulation.json \
    --required dt,dx,kappa \
    --ranges "dt:1e-6:1e-2,dx:1e-4:1e-1" \
    --min-free-gb 1.0 \
    --json
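In an automated pipeline, the command above could be wrapped so that a BLOCK status prevents the simulation from launching. The sketch below is one possible wrapper, assuming the script prints a single JSON document to standard output with a nested `report` object; `run_json` and `preflight_gate` are hypothetical helpers, and a stand-in command is used so the example runs without the skill installed.

```python
import json
import subprocess
import sys
from typing import List


def run_json(cmd: List[str]) -> dict:
    """Run a command and parse its standard output as one JSON document."""
    completed = subprocess.run(cmd, capture_output=True, text=True, check=False)
    try:
        return json.loads(completed.stdout)
    except json.JSONDecodeError as exc:
        raise RuntimeError(
            f"no JSON on stdout (exit code {completed.returncode}): "
            f"{completed.stderr.strip()}"
        ) from exc


def preflight_gate(payload: dict) -> int:
    """Return 0 if the simulation may start (PASS or WARN), otherwise 1."""
    report = payload.get("report", {})
    for item in report.get("blockers", []):
        print(f"BLOCK: {item}", file=sys.stderr)
    for item in report.get("warnings", []):
        print(f"WARN: {item}", file=sys.stderr)
    return 0 if report.get("status") in ("PASS", "WARN") else 1


# Stand-in for the real command so the sketch runs anywhere; in a pipeline,
# cmd would be the preflight_checker.py invocation shown above.
fake_output = {"report": {"status": "WARN", "blockers": [], "warnings": ["low disk space"]}}
cmd = [sys.executable, "-c", f"print({json.dumps(json.dumps(fake_output))})"]

payload = run_json(cmd)
exit_code = preflight_gate(payload)
print("exit code:", exit_code)
```

Whether WARN should let the run continue is a policy choice; here it does, on the assumption that accepted warnings are documented separately.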

Stage 2: Runtime (During Simulation)

Run scripts/runtime_monitor.py --log simulation.log periodically
Configure alert thresholds based on problem type
Stop simulation if critical alerts appear

python3 scripts/runtime_monitor.py \
    --log simulation.log \
    --residual-growth 10.0 \
    --dt-drop 100.0 \
    --json
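To make the two thresholds concrete, here is a small sketch of how a residual-growth factor and a dt-drop factor could be computed from values parsed out of a log. The definitions used (latest residual over the smallest seen, largest dt over the latest) are assumptions chosen for illustration and may differ from what `runtime_monitor.py` computes; all function names are hypothetical.

```python
import math
from typing import List, Sequence


def growth_factor(residuals: Sequence[float]) -> float:
    """Latest residual divided by the smallest residual seen so far."""
    lowest = min(residuals)
    if lowest <= 0:
        return math.inf if residuals[-1] > 0 else 1.0
    return residuals[-1] / lowest


def drop_factor(dts: Sequence[float]) -> float:
    """Largest time step seen divided by the latest time step."""
    latest = dts[-1]
    if latest <= 0:
        return math.inf
    return max(dts) / latest


def find_alerts(
    residuals: Sequence[float],
    dts: Sequence[float],
    residual_growth: float = 10.0,
    dt_drop: float = 100.0,
) -> List[str]:
    """Return alert messages for histories that exceed the given thresholds."""
    if any(not math.isfinite(x) for x in list(residuals) + list(dts)):
        return ["non-finite value in residual or dt history"]
    alerts = []
    if residuals and growth_factor(residuals) > residual_growth:
        alerts.append(f"residual grew {growth_factor(residuals):.1f}x")
    if dts and drop_factor(dts) > dt_drop:
        alerts.append(f"dt dropped {drop_factor(dts):.1f}x")
    return alerts


# Illustrative histories: the residual jumps 40x while dt only halves.
alerts_found = find_alerts([1e-3, 5e-4, 2e-2], [1e-3, 1e-3, 5e-4])
print(alerts_found)
```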

Stage 3: Post-flight (After Simulation)

Run scripts/result_validator.py --metrics results.json
All checks PASS: Results are valid for analysis
Any check FAIL: Do NOT use results, diagnose failure

python3 scripts/result_validator.py \
    --metrics results.json \
    --bound-min 0.0 \
    --bound-max 1.0 \
    --mass-tol 1e-3 \
    --json
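The flags above suggest two kinds of post-run checks: field values staying inside physical bounds, and mass being conserved within a tolerance. The sketch below shows one way such checks could be expressed, assuming the tolerance is relative to the initial mass; that convention, the `validate` function, and its argument names are assumptions and not the actual logic of `result_validator.py`.

```python
import math
from typing import Dict, List, Sequence, Union


def validate(
    values: Sequence[float],
    mass_initial: float,
    mass_final: float,
    bound_min: float = 0.0,
    bound_max: float = 1.0,
    mass_tol: float = 1e-3,
) -> Dict[str, Union[Dict[str, bool], List[str]]]:
    """Run three illustrative checks and list the ones that failed."""
    checks = {
        # NaN or Inf anywhere in the field invalidates the run.
        "finite": all(math.isfinite(v) for v in values),
        # Every value must lie inside [bound_min, bound_max].
        "bounds": all(bound_min <= v <= bound_max for v in values),
        # Assumed convention: drift is measured relative to the initial mass.
        "mass": abs(mass_final - mass_initial) <= mass_tol * abs(mass_initial),
    }
    failed = [name for name, ok in checks.items() if not ok]
    return {"checks": checks, "failed_checks": failed}


# Illustrative data: one value overshoots the upper bound; mass drift is small.
result = validate([0.0, 0.4, 1.02], mass_initial=100.0, mass_final=100.05)
print(result["failed_checks"])
```

With a relative tolerance, an initial mass of zero only passes when the final mass is exactly zero; an absolute tolerance may suit that case better.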

Failure Diagnosis

When validation fails:

python3 scripts/failure_diagnoser.py --log simulation.log --json

Conversational Workflow Example

User: My phase field simulation crashed after 1000 steps. Can you help me figure out why?

Agent workflow:

First, check the log for obvious errors:

python3 scripts/failure_diagnoser.py --log simulation.log --json

If diagnosis suggests numerical blow-up, check runtime stats:

python3 scripts/runtime_monitor.py --log simulation.log --json

Recommend fixes based on findings:
- If residual grew rapidly → reduce time step
- If dt collapsed → check stability conditions
- If NaN detected → check initial conditions

Error Handling

Error	Cause	Resolution
`Config not found`	File path invalid	Verify config path exists
`Non-numeric value`	Parameter is not a number	Fix config file format
`out of range`	Parameter outside bounds	Adjust parameter or bounds
`Output directory not writable`	Permission issue	Check directory permissions
`Insufficient disk space`	Disk nearly full	Free up space or reduce output

Interpretation Guidance

Status Meanings

Status	Meaning	Action
PASS	All checks passed	Proceed with confidence
WARN	Non-critical issues found	Review and document
BLOCK	Critical issues found	Must fix before proceeding

Confidence Score Interpretation

Score	Meaning
1.0	All validation checks passed
0.75 to < 1.0	Most checks passed, minor issues
0.5 to < 0.75	Significant issues, review carefully
< 0.5	Major problems, do not trust results
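The score bands can be applied programmatically when `confidence_score` is read from the validator output. This is a minimal sketch, assuming the score is a number between 0 and 1 and that a score of exactly 0.75 falls in the "minor issues" band; the `interpret` function is hypothetical.

```python
def interpret(score: float) -> str:
    """Translate a confidence score into the guidance from the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0, 1], got {score!r}")
    if score == 1.0:
        return "All validation checks passed"
    if score >= 0.75:
        return "Most checks passed, minor issues"
    if score >= 0.5:
        return "Significant issues, review carefully"
    return "Major problems, do not trust results"


for score in (1.0, 0.8, 0.6, 0.2):
    print(score, "->", interpret(score))
```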

Common Failure Patterns

Pattern in Log	Likely Cause	Recommended Fix
NaN, Inf, overflow	Numerical instability	Reduce dt, increase damping
max iterations, did not converge	Solver failure	Tune preconditioner, tolerances
out of memory	Memory exhaustion	Reduce mesh, enable out-of-core
dt reduced	Adaptive stepping triggered	May be okay if controlled
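The table maps naturally onto a list of regular expressions, each paired with a cause and a fix. The sketch below is a simplified illustration of that idea; the patterns are guesses written for this example, not the ones in `references/log_patterns.md`, and `diagnose` is a hypothetical function that borrows the `probable_causes` and `recommended_fixes` field names for its output.

```python
import re
from typing import Dict, List

# (pattern, likely cause, recommended fix), in the order given by the table.
PATTERNS = [
    (re.compile(r"\b(nan|inf|overflow)\b", re.IGNORECASE),
     "Numerical instability", "Reduce dt, increase damping"),
    (re.compile(r"max(imum)? iterations|did not converge", re.IGNORECASE),
     "Solver failure", "Tune preconditioner, tolerances"),
    (re.compile(r"out of memory", re.IGNORECASE),
     "Memory exhaustion", "Reduce mesh, enable out-of-core"),
    (re.compile(r"dt reduced", re.IGNORECASE),
     "Adaptive stepping triggered", "May be okay if controlled"),
]


def diagnose(log_text: str) -> Dict[str, List[str]]:
    """Scan log text and collect each matching cause and fix once."""
    causes, fixes = [], []
    for pattern, cause, fix in PATTERNS:
        if pattern.search(log_text):
            causes.append(cause)
            fixes.append(fix)
    return {"probable_causes": causes, "recommended_fixes": fixes}


# Illustrative log excerpt; real logs will use solver-specific wording.
sample_log = (
    "INFO step 998 residual=3.2e+04\n"
    "INFO step 999 dt reduced to 1e-9\n"
    "ERROR step 1000 residual=nan\n"
)
diagnosis = diagnose(sample_log)
print(diagnosis["probable_causes"])
```

The word boundaries around `inf` keep log levels such as `INFO` from being mistaken for an infinite value.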

Limitations

Not a real-time monitor: Scripts analyze logs after the fact, so an alert reflects the log as of the most recent run of runtime_monitor.py
Regex-based: Log parsing depends on pattern matching; may miss unusual formats
No automatic fixes: Scripts diagnose but don't modify simulations

References

references/validation_protocol.md - Detailed checklist and criteria
references/log_patterns.md - Common failure signatures and regex patterns

Version History

v1.1.0 (2024-12-24): Enhanced documentation, decision guidance, Windows compatibility
v1.0.0: Initial release with 4 validation scripts

Source

git clone https://github.com/HeshamFS/materials-simulation-skills.git

SKILL.md: https://github.com/HeshamFS/materials-simulation-skills/blob/main/skills/simulation-workflow/simulation-validator/SKILL.md

Overview

The simulation-validator provides a three-stage protocol (pre-flight checks, runtime monitoring, and post-flight validation) to help ensure that simulations run correctly and that results are trustworthy. It helps detect NaN/Inf values, verify mass and energy conservation, diagnose failures, and guide fixes.

How This Skill Works

It runs on Python 3.8+ using only the standard library. The workflow uses preflight_checker.py to check inputs, runtime_monitor.py to monitor execution, result_validator.py to validate outcomes, and failure_diagnoser.py to diagnose issues. Each script produces structured JSON output.

When to Use It

Before starting a simulation to catch blockers with preflight_checker.py
During simulation to monitor thresholds and trigger alerts with runtime_monitor.py
After completion to validate results with result_validator.py
If checks fail, run failure_diagnoser.py to identify probable causes and fixes
Periodically during long or complex runs, to confirm convergence and mass/energy conservation

Quick Start

Step 1: python3 scripts/preflight_checker.py --config simulation.json --required dt,dx,kappa --ranges "dt:1e-6:1e-2,dx:1e-4:1e-1" --min-free-gb 1.0 --json
Step 2: python3 scripts/runtime_monitor.py --log simulation.log --residual-growth 10.0 --dt-drop 100.0 --json
Step 3: python3 scripts/result_validator.py --metrics results.json --bound-min 0.0 --bound-max 1.0 --mass-tol 1e-3 --json

Best Practices

Define required inputs and valid ranges in the config; run preflight with explicit --required and --ranges
Treat BLOCK status as a hard stop and fix every blocker before proceeding; treat WARN status as a risk to review, and document any warning you accept
Tune thresholds (Conservative, Standard, Relaxed) to match problem sensitivity and run reliability
Use the JSON outputs (report, alerts, residual_stats, checks, etc.) to drive debugging and improvements
Automate the three-script workflow in CI to validate each simulation run

Example Use Cases

Preflight detects invalid dt/dx ranges and halts the run before allocation occurs
Runtime monitor triggers an alert for excessive residual growth, prompting parameter adjustment
Post-run validation flags mass conservation violation and stops downstream analysis
Converged simulations with residuals within thresholds are accepted as valid results
Failure diagnosis identifies NaN/Inf root causes from log analysis and suggests fixes
