simulation-validator

npx machina-cli add skill HeshamFS/materials-simulation-skills/simulation-validator --openclaw
Simulation Validator

Goal

Provide a three-stage validation protocol: pre-flight checks, runtime monitoring, and post-flight validation for materials simulations.

Requirements

  • Python 3.8+
  • No external dependencies (uses Python standard library only)
  • Works on Linux, macOS, and Windows

Inputs to Gather

Before running validation scripts, collect from the user:

| Input | Description | Example |
| --- | --- | --- |
| Config file | Simulation configuration (JSON/YAML) | simulation.json |
| Log file | Runtime output log | simulation.log |
| Metrics file | Post-run metrics (JSON) | results.json |
| Required params | Parameters that must exist | dt,dx,kappa |
| Valid ranges | Parameter bounds | dt:1e-6:1e-2 |
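
To make the "Required params" and "Valid ranges" inputs concrete, here is a minimal sketch of how a `name:min:max` spec could be parsed and applied to a loaded config. The helper names `parse_ranges` and `check_config` are hypothetical and are not taken from preflight_checker.py; the sample config values are made up for illustration.

```python
import json


def parse_ranges(spec):
    """Parse 'name:min:max,name:min:max' into {name: (min, max)}."""
    ranges = {}
    for item in spec.split(","):
        name, low, high = item.strip().split(":")
        ranges[name] = (float(low), float(high))
    return ranges


def check_config(config, required, ranges):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    for name in required:
        if name not in config:
            problems.append(f"missing required parameter: {name}")
    for name, (low, high) in ranges.items():
        if name not in config:
            continue  # absence is only reported for required parameters
        value = config[name]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"non-numeric value for {name}: {value!r}")
        elif not low <= value <= high:
            problems.append(f"{name}={value} out of range [{low}, {high}]")
    return problems


# Hypothetical config: kappa is missing and dt exceeds its upper bound.
config = json.loads('{"dt": 0.05, "dx": 0.01}')
problems = check_config(
    config,
    required=["dt", "dx", "kappa"],
    ranges=parse_ranges("dt:1e-6:1e-2,dx:1e-4:1e-1"),
)
for line in problems:
    print(line)
```

The real script may report these conditions differently; the sketch only shows what the two inputs are meant to express.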

Decision Guidance

When to Run Each Stage

Is simulation about to start?
├── YES → Run Stage 1: preflight_checker.py
│         └── BLOCK status? → Fix issues, do NOT run simulation
│         └── WARN status? → Review warnings, document if accepted
│         └── PASS status? → Proceed to run simulation
│
Is simulation running?
├── YES → Run Stage 2: runtime_monitor.py (periodically)
│         └── Alerts? → Consider stopping, check parameters
│
Has simulation finished?
├── YES → Run Stage 3: result_validator.py
│         └── Failed checks? → Do NOT use results
│                            → Run failure_diagnoser.py
│         └── All passed? → Results are valid

Choosing Validation Thresholds

| Metric | Conservative | Standard | Relaxed |
| --- | --- | --- | --- |
| Mass tolerance | 1e-6 | 1e-3 | 1e-2 |
| Residual growth | 2x | 10x | 100x |
| dt reduction | 10x | 100x | 1000x |
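
One way to keep these presets consistent across runs is to store them once and generate the command lines from them. The sketch below assumes the three rows correspond to the `--mass-tol`, `--residual-growth`, and `--dt-drop` flags, which is an inference from the Standard column matching the values used in the stage commands in this document. The names `PRESETS`, `monitor_args`, and `validator_args` are hypothetical.

```python
# Threshold presets copied from the table above.
PRESETS = {
    "conservative": {"mass_tol": 1e-6, "residual_growth": 2.0, "dt_drop": 10.0},
    "standard": {"mass_tol": 1e-3, "residual_growth": 10.0, "dt_drop": 100.0},
    "relaxed": {"mass_tol": 1e-2, "residual_growth": 100.0, "dt_drop": 1000.0},
}


def monitor_args(preset, log="simulation.log"):
    """Build the Stage 2 command line for a named preset."""
    p = PRESETS[preset]
    return [
        "python3", "scripts/runtime_monitor.py",
        "--log", log,
        "--residual-growth", str(p["residual_growth"]),
        "--dt-drop", str(p["dt_drop"]),
        "--json",
    ]


def validator_args(preset, metrics="results.json"):
    """Build the Stage 3 command line for a named preset."""
    p = PRESETS[preset]
    return [
        "python3", "scripts/result_validator.py",
        "--metrics", metrics,
        "--mass-tol", str(p["mass_tol"]),
        "--json",
    ]


print(" ".join(monitor_args("standard")))
print(" ".join(validator_args("conservative")))
```

The resulting lists can be passed directly to `subprocess.run`, which avoids shell quoting issues on Windows.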

Script Outputs (JSON Fields)

| Script | Output Fields |
| --- | --- |
| scripts/preflight_checker.py | report.status, report.blockers, report.warnings |
| scripts/runtime_monitor.py | alerts, residual_stats, dt_stats |
| scripts/result_validator.py | checks, confidence_score, failed_checks |
| scripts/failure_diagnoser.py | probable_causes, recommended_fixes |
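
As an illustration of consuming these fields, the sketch below turns a pre-flight report into a go/no-go decision. It assumes `status`, `blockers`, and `warnings` are nested under a top-level `report` key, as the dotted names suggest; the exact JSON shape is not confirmed here, and the sample payload is hypothetical. The function name `gate_on_preflight` is also made up for this example.

```python
import json


def gate_on_preflight(raw_json):
    """Return (proceed, notes) for the JSON text printed by the pre-flight stage."""
    report = json.loads(raw_json).get("report", {})
    # A missing status is treated as blocking, to fail safe.
    status = report.get("status", "BLOCK")
    if status == "BLOCK":
        return False, list(report.get("blockers", []))
    if status == "WARN":
        # Proceeding is allowed, but the warnings should be reviewed and documented.
        return True, list(report.get("warnings", []))
    return status == "PASS", []


# Hypothetical payload, not real script output.
sample = '{"report": {"status": "WARN", "blockers": [], "warnings": ["low disk space"]}}'
proceed, notes = gate_on_preflight(sample)
print(proceed, notes)
```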

Three-Stage Validation Protocol

Stage 1: Pre-flight (Before Simulation)

  1. Run scripts/preflight_checker.py --config simulation.json
  2. BLOCK status: Stop immediately, fix all blocker issues
  3. WARN status: Review warnings, document accepted risks
  4. PASS status: Proceed to simulation
python3 scripts/preflight_checker.py \
    --config simulation.json \
    --required dt,dx,kappa \
    --ranges "dt:1e-6:1e-2,dx:1e-4:1e-1" \
    --min-free-gb 1.0 \
    --json

Stage 2: Runtime (During Simulation)

  1. Run scripts/runtime_monitor.py --log simulation.log periodically
  2. Configure alert thresholds based on problem type
  3. Stop simulation if critical alerts appear
python3 scripts/runtime_monitor.py \
    --log simulation.log \
    --residual-growth 10.0 \
    --dt-drop 100.0 \
    --json
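
The thresholds above are multiplicative factors. How runtime_monitor.py computes them is not stated here, so the following is only one plausible reading: residual growth as the latest residual relative to the smallest seen so far, and dt drop as the largest time step seen relative to the latest. All function names and sample values are hypothetical.

```python
def growth_factor(residuals):
    """Latest residual divided by the smallest residual seen so far."""
    return residuals[-1] / min(residuals)


def drop_factor(dts):
    """Largest time step seen so far divided by the latest one."""
    return max(dts) / dts[-1]


def check_history(residuals, dts, residual_growth=10.0, dt_drop=100.0):
    """Return alert strings for histories of positive residuals and time steps."""
    alerts = []
    if len(residuals) > 1 and growth_factor(residuals) > residual_growth:
        alerts.append(f"residual grew {growth_factor(residuals):.1f}x")
    if len(dts) > 1 and drop_factor(dts) > dt_drop:
        alerts.append(f"dt dropped {drop_factor(dts):.1f}x")
    return alerts


# Hypothetical histories, not taken from a real log.
alerts = check_history([1e-3, 5e-4, 2e-2], [1e-3, 1e-3, 5e-4])
print(alerts)
```

Under this reading, a residual that rises from 5e-4 to 2e-2 exceeds the Standard 10x threshold, while a halved time step stays well within the 100x limit.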

Stage 3: Post-flight (After Simulation)

  1. Run scripts/result_validator.py --metrics results.json
  2. All checks PASS: Results are valid for analysis
  3. Any check FAIL: Do NOT use results, diagnose failure
python3 scripts/result_validator.py \
    --metrics results.json \
    --bound-min 0.0 \
    --bound-max 1.0 \
    --mass-tol 1e-3 \
    --json
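
For intuition about what `--bound-min`, `--bound-max`, and `--mass-tol` guard against, here is a small sketch of comparable checks. The metric keys (`mass_initial`, `mass_final`, `field_min`, `field_max`) and the confidence formula (fraction of checks passed) are assumptions made for illustration; the actual schema of results.json and the scoring in result_validator.py may differ.

```python
import math


def validate_metrics(metrics, bound_min=0.0, bound_max=1.0, mass_tol=1e-3):
    """Run three illustrative checks and summarize them."""
    m0, m1 = metrics["mass_initial"], metrics["mass_final"]
    # Relative mass drift; assumes a nonzero initial mass.
    drift = abs(m1 - m0) / abs(m0)
    checks = {
        "finite": all(math.isfinite(v) for v in metrics.values()),
        "within_bounds": bound_min <= metrics["field_min"]
        and metrics["field_max"] <= bound_max,
        "mass_conserved": drift <= mass_tol,
    }
    failed = [name for name, ok in checks.items() if not ok]
    return {
        "checks": checks,
        "failed_checks": failed,
        # Assumption: confidence is the fraction of checks that passed.
        "confidence_score": 1.0 - len(failed) / len(checks),
    }


# Hypothetical metrics: the field stays in [0, 1] but mass drifts by 0.5%.
result = validate_metrics(
    {"mass_initial": 100.0, "mass_final": 100.5, "field_min": 0.0, "field_max": 1.0}
)
print(result["failed_checks"], round(result["confidence_score"], 2))
```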

Failure Diagnosis

When validation fails:

python3 scripts/failure_diagnoser.py --log simulation.log --json

Conversational Workflow Example

User: My phase field simulation crashed after 1000 steps. Can you help me figure out why?

Agent workflow:

  1. First, check the log for obvious errors:
    python3 scripts/failure_diagnoser.py --log simulation.log --json
    
  2. If diagnosis suggests numerical blow-up, check runtime stats:
    python3 scripts/runtime_monitor.py --log simulation.log --json
    
  3. Recommend fixes based on findings:
    • If residual grew rapidly → reduce time step
    • If dt collapsed → check stability conditions
    • If NaN detected → check initial conditions

Error Handling

| Error | Cause | Resolution |
| --- | --- | --- |
| Config not found | File path invalid | Verify config path exists |
| Non-numeric value | Parameter is not a number | Fix config file format |
| out of range | Parameter outside bounds | Adjust parameter or bounds |
| Output directory not writable | Permission issue | Check directory permissions |
| Insufficient disk space | Disk nearly full | Free up space or reduce output |

Interpretation Guidance

Status Meanings

| Status | Meaning | Action |
| --- | --- | --- |
| PASS | All checks passed | Proceed with confidence |
| WARN | Non-critical issues found | Review and document |
| BLOCK | Critical issues found | Must fix before proceeding |

Confidence Score Interpretation

| Score | Meaning |
| --- | --- |
| 1.0 | All validation checks passed |
| 0.75+ | Most checks passed, minor issues |
| 0.5-0.75 | Significant issues, review carefully |
| < 0.5 | Major problems, do not trust results |
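
The bands above share the boundary value 0.75. A small helper makes the choice explicit; this sketch assigns exactly 0.75 to the "most checks passed" band, which is an assumption rather than documented behavior. The function name `interpret_confidence` is hypothetical.

```python
def interpret_confidence(score):
    """Map a confidence_score in [0, 1] to the meaning listed in the table."""
    if score >= 1.0:
        return "All validation checks passed"
    if score >= 0.75:
        return "Most checks passed, minor issues"
    if score >= 0.5:
        return "Significant issues, review carefully"
    return "Major problems, do not trust results"


for value in (1.0, 0.75, 0.6, 0.2):
    print(value, "->", interpret_confidence(value))
```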

Common Failure Patterns

| Pattern in Log | Likely Cause | Recommended Fix |
| --- | --- | --- |
| NaN, Inf, overflow | Numerical instability | Reduce dt, increase damping |
| max iterations, did not converge | Solver failure | Tune preconditioner, tolerances |
| out of memory | Memory exhaustion | Reduce mesh, enable out-of-core |
| dt reduced | Adaptive stepping triggered | May be okay if controlled |
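
The table can be read as a set of pattern rules. The sketch below encodes it as regular expressions and scans a log string. The regexes are illustrative guesses, not the patterns shipped in references/log_patterns.md or used by failure_diagnoser.py, and the sample log lines are hypothetical.

```python
import re

# (pattern, likely cause, recommended fix), one rule per table row.
RULES = [
    (re.compile(r"\b(nan|inf|overflow)\b", re.IGNORECASE),
     "Numerical instability", "Reduce dt, increase damping"),
    (re.compile(r"max iterations|did not converge", re.IGNORECASE),
     "Solver failure", "Tune preconditioner, tolerances"),
    (re.compile(r"out of memory", re.IGNORECASE),
     "Memory exhaustion", "Reduce mesh, enable out-of-core"),
    (re.compile(r"dt reduced", re.IGNORECASE),
     "Adaptive stepping triggered", "May be okay if controlled"),
]


def diagnose(log_text):
    """Return the causes and fixes whose pattern appears anywhere in the log."""
    causes, fixes = [], []
    for pattern, cause, fix in RULES:
        if pattern.search(log_text):
            causes.append(cause)
            fixes.append(fix)
    return {"probable_causes": causes, "recommended_fixes": fixes}


# Hypothetical log excerpt.
log_text = (
    "step 998 residual=3.2e+04\n"
    "step 999 dt reduced to 1e-9\n"
    "step 1000 residual=nan\n"
)
print(diagnose(log_text))
```

The word boundaries keep `inf` from matching inside words such as "info", which is one of the ways a regex-based scan can otherwise misfire.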

Limitations

  • Not a real-time monitor: Scripts analyze log files after the fact, so runtime monitoring means re-running runtime_monitor.py periodically
  • Regex-based: Log parsing depends on pattern matching; may miss unusual formats
  • No automatic fixes: Scripts diagnose but don't modify simulations

References

  • references/validation_protocol.md - Detailed checklist and criteria
  • references/log_patterns.md - Common failure signatures and regex patterns

Version History

  • v1.1.0 (2024-12-24): Enhanced documentation, decision guidance, Windows compatibility
  • v1.0.0: Initial release with 4 validation scripts

Source

View on GitHub: https://github.com/HeshamFS/materials-simulation-skills/blob/main/skills/simulation-workflow/simulation-validator/SKILL.md

Overview

The simulation-validator provides a three-stage protocol—pre-flight checks, runtime monitoring, and post-flight validation—to ensure simulations run correctly and results are trustworthy. It helps detect NaN/Inf, verify mass and energy conservation, diagnose failures, and guide fixes.

How This Skill Works

It runs on Python 3.8+ using only the standard library. The workflow uses preflight_checker.py, runtime_monitor.py, result_validator.py, and failure_diagnoser.py to collect inputs, monitor execution, validate outcomes, and diagnose issues, producing structured JSON outputs for each stage.

When to Use It

  • Before starting a simulation to catch blockers with preflight_checker.py
  • During simulation to monitor thresholds and trigger alerts with runtime_monitor.py
  • After completion to validate results with result_validator.py
  • If checks fail, run failure_diagnoser.py to identify probable causes and fixes
  • Periodically or for complex problems to ensure convergence and conservation (mass/energy)

Quick Start

  1. python3 scripts/preflight_checker.py --config simulation.json --required dt,dx,kappa --ranges "dt:1e-6:1e-2,dx:1e-4:1e-1" --min-free-gb 1.0 --json
  2. python3 scripts/runtime_monitor.py --log simulation.log --residual-growth 10.0 --dt-drop 100.0 --json
  3. python3 scripts/result_validator.py --metrics results.json --bound-min 0.0 --bound-max 1.0 --mass-tol 1e-3 --json

Best Practices

  • Define required inputs and valid ranges in the config; run preflight with explicit --required and --ranges
  • Treat BLOCK status as a hard stop and fix every blocker before proceeding; for WARN status, review each warning and document any risk you accept
  • Tune thresholds (Conservative, Standard, Relaxed) to match problem sensitivity and run reliability
  • Use the JSON outputs (report, alerts, residual_stats, checks, etc.) to drive debugging and improvements
  • Automate the three-script workflow in CI to validate each simulation run
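
A CI job could chain the three scripts roughly as follows. This is a sketch under several assumptions: that `--json` makes each script print a single JSON document on stdout, that the pre-flight fields sit under a `report` key, and that any runtime alert should fail the job (which may be stricter than you want). The names `run_json` and `validate_run` are hypothetical, and the demo at the end uses a stand-in runner so nothing is actually executed.

```python
import json
import subprocess

PREFLIGHT = ["python3", "scripts/preflight_checker.py", "--config", "simulation.json",
             "--required", "dt,dx,kappa", "--ranges", "dt:1e-6:1e-2,dx:1e-4:1e-1",
             "--min-free-gb", "1.0", "--json"]
MONITOR = ["python3", "scripts/runtime_monitor.py", "--log", "simulation.log",
           "--residual-growth", "10.0", "--dt-drop", "100.0", "--json"]
VALIDATE = ["python3", "scripts/result_validator.py", "--metrics", "results.json",
            "--bound-min", "0.0", "--bound-max", "1.0", "--mass-tol", "1e-3", "--json"]


def run_json(argv):
    """Run one script and parse its stdout as JSON (assumed output format)."""
    done = subprocess.run(argv, capture_output=True, text=True, check=False)
    return json.loads(done.stdout)


def validate_run(simulate, run=run_json):
    """Gate a simulation on pre-flight, then monitor and validate its output."""
    pre = run(PREFLIGHT)
    if pre.get("report", {}).get("status") == "BLOCK":
        return {"ok": False, "stage": "preflight", "details": pre}
    simulate()  # caller-supplied function that launches the simulation
    monitor = run(MONITOR)
    post = run(VALIDATE)
    ok = not monitor.get("alerts") and not post.get("failed_checks")
    return {"ok": ok, "stage": "postflight",
            "details": {"monitor": monitor, "post": post}}


# Dry run with a stand-in runner that always reports a blocker.
def fake_run(argv):
    return {"report": {"status": "BLOCK", "blockers": ["dt out of range"], "warnings": []}}


outcome = validate_run(lambda: None, run=fake_run)
print(outcome["ok"], outcome["stage"])
```

Passing the runner as a parameter keeps the gating logic testable without the scripts or a real simulation present.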

Example Use Cases

  • Preflight detects invalid dt/dx ranges and halts the run before allocation occurs
  • Runtime monitor triggers an alert for excessive residual growth, prompting parameter adjustment
  • Post-run validation flags mass conservation violation and stops downstream analysis
  • Converged simulations with residuals within thresholds are accepted as valid results
  • Failure diagnosis identifies NaN/Inf root causes from log analysis and suggests fixes
