
run-pipeline

npx machina-cli add skill xvirobotics/metaskill/run-pipeline --openclaw
Files (1)
SKILL.md
4.4 KB

You are executing the full data science pipeline for this project. Run each stage sequentially, verifying success before proceeding to the next stage. Stop immediately if any stage fails and report the error clearly.

Dynamic Context

Current branch: !git branch --show-current
Data directory contents: !ls data/ 2>/dev/null || echo "No data/ directory found"
Available configs: !ls configs/*.yaml 2>/dev/null || ls configs/*.toml 2>/dev/null || echo "No config files found"
Python environment: !which python3 && python3 --version 2>/dev/null || echo "Python not found"
Recent changes: !git diff --stat HEAD~3 2>/dev/null || echo "No recent commits"

Configuration

If the user provided a config file as an argument, use it: $ARGUMENTS
Otherwise, look for the default config at configs/experiment.yaml or configs/experiment.toml.

Pipeline Stages

Execute each stage in order. After each stage, check for errors and verify outputs exist before proceeding.
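The run-and-verify loop described above can be sketched roughly as follows. The stage names and commands here are placeholders, not the project's actual modules; a real runner would substitute the commands from the stages below.

```python
import subprocess
import sys

# Illustrative stage list; real entries would be the stage commands below.
STAGES = [
    ("environment check", [sys.executable, "-c", "import json"]),
    ("data validation", [sys.executable, "-c", "print('ok')"]),
]

def run_stage(name, cmd):
    """Run one stage, capturing output; return True on success."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Stage '{name}' FAILED:\n{result.stderr}")
        return False
    return True

for name, cmd in STAGES:
    if not run_stage(name, cmd):
        break  # stop immediately; never run later stages after a failure
```

Capturing output per stage (rather than streaming) keeps the failure report self-contained when a stage aborts.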

Stage 1: Environment Check

Verify the Python environment is ready:

python3 -c "import torch; import pandas; import numpy; print(f'PyTorch {torch.__version__}, pandas {pandas.__version__}, NumPy {numpy.__version__}')"

If imports fail, report which packages are missing and suggest pip install -r requirements.txt.
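One way to produce that missing-package report without triggering a full import crash is to probe each package with `importlib` (the package list is an assumption based on the imports above; the real list lives in requirements.txt):

```python
import importlib.util

# Packages the pipeline is assumed to need; sync with requirements.txt.
REQUIRED = ["torch", "pandas", "numpy"]

def missing_packages(packages):
    """Return the subset of packages that cannot be imported."""
    return [p for p in packages if importlib.util.find_spec(p) is None]

missing = missing_packages(REQUIRED)
if missing:
    print(f"Missing packages: {', '.join(missing)}")
    print("Try: pip install -r requirements.txt")
```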

Stage 2: Data Validation

Run data validation on the raw data:

python3 -m src.data.validate --data-dir data/raw/

If the validation script does not exist, look for alternative patterns:

  • python3 src/data/validate.py
  • python3 -m pytest tests/test_data/ -v --tb=short
  • Check for pandera schemas in src/data/ and report their status

Verify: validation passes with no critical errors. Log any warnings.

Stage 3: Preprocessing

Run the preprocessing pipeline:

python3 -m src.data.preprocess --config $CONFIG_FILE

Alternative patterns:

  • python3 src/data/preprocess.py --config $CONFIG_FILE
  • dvc repro preprocess (if DVC pipeline is configured)

Verify: processed data files exist in data/processed/ (check for .parquet or .csv files).

Stage 4: Feature Engineering

Run feature engineering:

python3 -m src.features.build_features --config $CONFIG_FILE

Alternative patterns:

  • python3 src/features/build_features.py
  • dvc repro features

Verify: feature files exist in data/features/ with expected columns.

Stage 5: Model Training

Run model training:

python3 -m src.models.training.trainer --config $CONFIG_FILE

Alternative patterns:

  • python3 src/models/train.py --config $CONFIG_FILE
  • python3 train.py --config $CONFIG_FILE

Monitor output for:

  • Loss values (should decrease over epochs)
  • Validation metrics at each epoch
  • Any NaN or Inf values (indicates numerical instability)
  • Out-of-memory errors

Verify: model checkpoint exists in checkpoints/ directory.
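Scanning training output for the NaN/Inf instability mentioned above could look like this sketch; the `loss: <value>` log format is an assumption, so the regex would need adjusting to the trainer's actual output.

```python
import math
import re

def unstable_lines(log_text):
    """Return log lines whose reported loss is NaN or infinite."""
    bad = []
    for line in log_text.splitlines():
        m = re.search(r"loss[:=]\s*([^\s,]+)", line, re.IGNORECASE)
        if m:
            try:
                value = float(m.group(1))
            except ValueError:
                continue  # not a numeric loss value
            if not math.isfinite(value):
                bad.append(line)
    return bad
```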

Stage 6: Evaluation

Run model evaluation on the test set:

python3 -m src.models.evaluation.evaluate --checkpoint checkpoints/best_model.pt --config $CONFIG_FILE

Alternative patterns:

  • python3 src/evaluation/evaluate.py
  • python3 evaluate.py --checkpoint checkpoints/best_model.pt

Verify: metrics JSON file exists in reports/ or experiments/.

Stage 7: Summary

After all stages complete, produce a summary:

  1. Report which stages succeeded and which failed
  2. Print the final evaluation metrics (read from the metrics JSON)
  3. List all generated artifacts (checkpoints, processed data, feature files, metrics)
  4. If any stage failed, provide the error message and suggest a fix
  5. Report total pipeline execution time

Error Handling

  • If a stage fails, do NOT proceed to the next stage (except for validation warnings, which are non-blocking)
  • Capture stderr and stdout from each command
  • For Python errors, read the traceback and identify the root cause
  • For file-not-found errors, check if the expected directory structure exists
  • For import errors, report the missing package
  • For CUDA out-of-memory, suggest reducing batch size in the config

Source

git clone https://github.com/xvirobotics/metaskill

The skill file lives at examples/data-science/.claude/skills/run-pipeline/SKILL.md in the repository.

Overview

This skill runs the complete pipeline end-to-end, from validating raw data through preprocessing, feature engineering, training, and evaluation. It stops on any failure and reports clear errors, making it easy to re-run after data or code changes. Use a provided config file or the default configs to reproduce experiments.

How This Skill Works

The tool determines the config (argument or defaults), then executes each stage with Python modules (validate, preprocess, build_features, trainer, evaluate). After each stage it checks for required outputs (data/processed, data/features, checkpoints, reports) and aborts with a clear message if a stage fails.

When to Use It

  • You need to run the full end-to-end ML pipeline from raw data to evaluation.
  • You’ve updated data or code and want to re-run the entire workflow.
  • You want to reproduce an experiment using a specific config file.
  • You must validate data integrity before training to catch issues early.
  • You want a complete audit of artifacts (checkpoints, processed data, features, metrics) after a run.

Quick Start

  1. Pass a config file path as an argument, or rely on configs/experiment.yaml.
  2. Run the pipeline: run-pipeline [config-file].
  3. Validate outputs in data/processed, data/features, checkpoints, and reports or experiments.

Best Practices

  • Pin and version-control the config file (configs/experiment.yaml) for reproducibility.
  • Use a clean Python environment and install dependencies via requirements.txt.
  • Verify outputs after each stage before proceeding to the next.
  • Watch for common failures: NaN/Inf, OOM, or missing outputs; address before continuing.
  • Store artifacts in stable locations (data/processed, data/features, checkpoints, reports) to simplify tracking.

Example Use Cases

  • CI/CD pipeline that automatically validates new data, retrains, and reports metrics.
  • Reproducing a paper experiment by rerunning with an updated feature set.
  • Hyperparameter exploration by sequentially running the pipeline for different configs.
  • Data drift remediation by validating new data, retraining, and reevaluating.
  • On-demand retraining after code changes or dependency updates.
