bio-prefect-dask-nextflow
Bio Prefect + Dask + Nextflow

npx machina-cli add skill fmschulz/omics-skills/bio-prefect-dask-nextflow --openclaw
Choose and scaffold the right workflow engine for local, distributed, or HPC bioinformatics pipelines.
Instructions
- Collect requirements (scheduler, container policy, data location, scale).
- Choose engine: Prefect+Dask, Nextflow, or Hybrid.
- Generate a runnable scaffold with clear data layout and resources.
- Validate with a small test and resume/retry checks.
Quick Reference
| Task | Action |
|---|---|
| Engine choice | See decision-matrix.md |
| Prefect+Dask scaffold | See prefect-dask.md |
| Prefect on Slurm | See prefect-hpc-slurm.md |
| Nextflow on HPC | See nextflow-hpc.md |
| Examples | See examples.md |
Input Requirements
- Workflow requirements and steps
- Target environment (local, cluster, cloud)
- Scheduler and container constraints
- Data locations and expected volumes
Output
- Engine recommendation with rationale
- Runnable scaffold (files + commands)
- Resource plan per step
- Validation plan and checkpoints
Quality Gates
- Tiny test run completes end-to-end
- Resume/retry behavior verified
- Resource plan matches cluster limits
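The last gate above can be checked mechanically. A minimal sketch, assuming an illustrative per-step resource plan and cluster limits (step names and numbers are hypothetical, not part of the skill's interface):

```python
# Sketch of the "resource plan matches cluster limits" gate.
# Step names and resource figures are illustrative examples only.
plan = {
    "qc":    {"cpus": 4,  "mem_gb": 8},
    "align": {"cpus": 16, "mem_gb": 64},
    "call":  {"cpus": 8,  "mem_gb": 32},
}
limits = {"cpus": 32, "mem_gb": 128}  # hypothetical per-node limits

# Collect every step that requests more than the cluster allows.
violations = [
    step for step, res in plan.items()
    if res["cpus"] > limits["cpus"] or res["mem_gb"] > limits["mem_gb"]
]
print(violations)  # → [] when every step fits the cluster limits
```

An empty `violations` list means the plan passes the gate; any entries name the steps whose requests must be trimmed or split.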
Examples
Example 1: Engine recommendation
Choice: Nextflow
Why: CLI-heavy pipeline, HPC scheduler required, reproducible cache/resume needed.
Troubleshooting
Issue: Workflow fails on HPC due to environment mismatch
Solution: Pin container/conda versions and validate with a minimal test dataset.
Source
https://github.com/fmschulz/omics-skills/blob/main/skills/bio-prefect-dask-nextflow/SKILL.md
Overview
This skill helps you design and scaffold bioinformatics workflows by selecting the right engine for your environment. It guides requirements gathering, generates runnable scaffolds with clear data layouts, and validates end-to-end with test runs. It covers local/distributed execution using Prefect+Dask and HPC-oriented Nextflow.
How This Skill Works
Start by collecting requirements (scheduler, container policy, data location, scale), then choose Prefect+Dask, Nextflow, or Hybrid, and generate a runnable scaffold with explicit files, data layout, and resource specs. Finally, run a small end-to-end test and verify resume/retry behavior to confirm reliability.
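The engine-choice step can be sketched as a small helper. This is a hypothetical illustration, not the skill's actual interface; the `Requirements` fields and the decision rules are simplified stand-ins for the full decision matrix:

```python
# Hypothetical engine-selection helper mirroring the decision step above.
# Field names and rules are illustrative; see decision-matrix.md for the
# skill's actual criteria.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Requirements:
    scheduler: Optional[str]  # e.g. "slurm" or "pbs"; None means local
    cli_heavy: bool           # pipeline dominated by CLI bioinformatics tools
    python_native: bool       # steps implemented as Python functions

def choose_engine(req: Requirements) -> str:
    """Recommend an engine from a simplified view of the requirements."""
    if req.scheduler and req.cli_heavy and req.python_native:
        return "Hybrid"        # Prefect+Dask orchestration, Nextflow for HPC steps
    if req.scheduler and req.cli_heavy:
        return "Nextflow"      # scheduler-aware, cache/resume built in
    return "Prefect+Dask"      # local or distributed Python workloads

print(choose_engine(Requirements(scheduler="slurm", cli_heavy=True,
                                 python_native=False)))  # → Nextflow
```

A real decision would also weigh container policy and data volume; this sketch only shows how the recommendation falls out of a few requirement flags.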
When to Use It
- Setting up a local workstation or small cluster for development
- Distributing workloads across nodes in a shared file system
- HPC environments where Slurm, PBS, or similar schedulers are used
- Need for reproducible cache and resume/retry behavior
- Hybrid scenarios where you mix Prefect+Dask for orchestration with Nextflow for HPC
Quick Start
- Step 1: Collect requirements (scheduler, container policy, data location, scale)
- Step 2: Choose engine: Prefect+Dask, Nextflow, or Hybrid
- Step 3: Generate runnable scaffold, run a tiny test, and verify resume/retry
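The retry check in Step 3 can be exercised with a plain-Python sketch. This is a stand-in for engine-level retries (e.g. Prefect task retries or Nextflow's error strategies), useful for verifying that a transient failure is absorbed rather than propagated:

```python
# Minimal retry sketch for the "verify resume/retry" step; a stdlib stand-in
# for the retry behavior a workflow engine would provide.
import time

def retry(times=3, delay=0.0):
    """Retry a function up to `times` attempts, re-raising on the last one."""
    def wrap(fn):
        def inner(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == times:
                        raise
                    time.sleep(delay)
        return inner
    return wrap

calls = {"n": 0}

@retry(times=3)
def flaky_step():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(flaky_step())  # succeeds on the third attempt → ok
```

In a tiny test run, a deliberately flaky step like this confirms that retries are configured and bounded before the full pipeline is launched.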
Best Practices
- Collect requirements before scaffolding (scheduler, container policy, data location, scale)
- Choose the engine based on environment: Prefect+Dask for local/distributed, Nextflow for HPC, or Hybrid
- Produce a runnable scaffold with clear data layout and per-step resources
- Define a validation plan with a tiny end-to-end test and resume/retry checks
- Pin container/conda versions and validate environment compatibility
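The resume/retry validation in the practices above can be demonstrated with a file-based checkpoint sketch. This approximates, in plain Python, the cache/resume semantics that Nextflow provides via `-resume`; the step name and output layout are illustrative:

```python
# File-based resume sketch: a step is skipped when its output already exists,
# approximating workflow-engine cache/resume behavior. Names are illustrative.
import tempfile
from pathlib import Path

def run_step(name, outdir, compute):
    """Run `compute` unless a checkpoint for `name` already exists."""
    out = Path(outdir) / f"{name}.done"
    if out.exists():                 # resume: reuse the previous result
        return out.read_text(), True
    result = compute()
    out.write_text(result)           # checkpoint for later resumes
    return result, False

with tempfile.TemporaryDirectory() as d:
    r1, skipped1 = run_step("align", d, lambda: "aligned")
    r2, skipped2 = run_step("align", d, lambda: "aligned")
    print(skipped1, skipped2)  # first run computes, second resumes
```

A validation plan can assert exactly this: rerunning the tiny test must skip completed steps and produce identical outputs.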
Example Use Cases
- Example: Engine recommendation selecting Nextflow for HPC with reproducible cache and resume
- Example: Prefect+Dask scaffold designed for local development or a small cluster
- Example: Prefect on Slurm to orchestrate tasks on an HPC scheduler
- Example: Nextflow on HPC workflow demonstrating resource-aware scheduling
- Example: Hybrid workflow blending Prefect+Dask orchestration with Nextflow HPC components