
bio-prefect-dask-nextflow

npx machina-cli add skill fmschulz/omics-skills/bio-prefect-dask-nextflow --openclaw
Files (1)
SKILL.md
1.6 KB

Bio Prefect + Dask + Nextflow

Choose and scaffold the right workflow engine for local, distributed, or HPC bioinformatics pipelines.

Instructions

  1. Collect requirements (scheduler, container policy, data location, scale).
  2. Choose engine: Prefect+Dask, Nextflow, or Hybrid.
  3. Generate a runnable scaffold with clear data layout and resources.
  4. Validate with a small test and resume/retry checks.
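The engine-choice step above can be sketched as a small decision helper. This is a hypothetical illustration of the kind of logic a decision matrix encodes; the rules and scheduler names below are illustrative defaults, not the skill's actual decision-matrix.md contents.

```python
# Hypothetical sketch of step 2 (engine choice); rules are illustrative,
# not the skill's real decision matrix.
def choose_engine(scheduler=None, cli_heavy=False):
    """Map gathered requirements to a workflow engine name."""
    hpc = scheduler in {"slurm", "pbs", "lsf"}
    if hpc and cli_heavy:
        return "Nextflow"       # HPC scheduler + CLI-tool pipeline favors Nextflow
    if hpc:
        return "Hybrid"         # Prefect orchestration, Nextflow for the HPC parts
    return "Prefect+Dask"       # local or small-cluster Python pipelines
```

For example, `choose_engine("slurm", cli_heavy=True)` returns `"Nextflow"`, matching the worked example later in this page.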

Quick Reference

  • Engine choice: see decision-matrix.md
  • Prefect+Dask scaffold: see prefect-dask.md
  • Prefect on Slurm: see prefect-hpc-slurm.md
  • Nextflow on HPC: see nextflow-hpc.md
  • Examples: see examples.md

Input Requirements

  • Workflow requirements and steps
  • Target environment (local, cluster, cloud)
  • Scheduler and container constraints
  • Data locations and expected volumes

Output

  • Engine recommendation with rationale
  • Runnable scaffold (files + commands)
  • Resource plan per step
  • Validation plan and checkpoints

Quality Gates

  • Tiny test run completes end-to-end
  • Resume/retry behavior verified
  • Resource plan matches cluster limits
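The last gate above can be automated with a simple check. This is a minimal sketch assuming a dict-based resource plan; the limit and per-step request values are made-up examples, not real cluster figures.

```python
# Sketch of the "resource plan matches cluster limits" gate.
# All numbers here are made-up example values.
CLUSTER_LIMITS = {"cpus": 64, "mem_gb": 256, "time_h": 48}

resource_plan = {
    "qc":       {"cpus": 4,  "mem_gb": 8,   "time_h": 1},
    "assembly": {"cpus": 32, "mem_gb": 128, "time_h": 24},
}

def violations(plan, limits):
    """Return (step, resource) pairs that exceed cluster limits."""
    return [(step, res)
            for step, req in plan.items()
            for res, val in req.items()
            if val > limits[res]]
```

An empty `violations(...)` result means every step's request fits within the cluster's limits; any returned pair names the step and resource to scale down.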

Examples

Example 1: Engine recommendation

Choice: Nextflow
Why: CLI-heavy pipeline, HPC scheduler required, reproducible cache/resume needed.

Troubleshooting

Issue: Workflow fails on HPC due to environment mismatch.
Solution: Pin container/conda versions and validate with a minimal test dataset.

Source

git clone https://github.com/fmschulz/omics-skills

The skill file is at skills/bio-prefect-dask-nextflow/SKILL.md in the repository.

Overview

This skill helps you design and scaffold bioinformatics workflows by selecting the right engine for your environment. It guides requirements gathering, generates runnable scaffolds with clear data layouts, and validates end-to-end with test runs. It covers local/distributed execution using Prefect+Dask and HPC-oriented Nextflow.

How This Skill Works

Start by collecting requirements (scheduler, container policy, data location, scale), then choose Prefect+Dask, Nextflow, or Hybrid, and generate a runnable scaffold with explicit files, data layout, and resource specs. Finally perform a small test and implement resume/retry checks to ensure reliability.

When to Use It

  • Setting up a local workstation or small cluster for development
  • Distributing workloads across nodes in a shared file system
  • HPC environments where Slurm, PBS, or similar schedulers are used
  • Need for reproducible cache and resume/retry behavior
  • Hybrid scenarios where you mix Prefect+Dask for orchestration with Nextflow for HPC

Quick Start

  1. Collect requirements (scheduler, container policy, data location, scale)
  2. Choose an engine: Prefect+Dask, Nextflow, or Hybrid
  3. Generate a runnable scaffold, run a tiny test, and verify resume/retry

Best Practices

  • Collect requirements before scaffolding (scheduler, container policy, data location, scale)
  • Choose the engine based on environment: Prefect+Dask for local/distributed, Nextflow for HPC, or Hybrid
  • Produce a runnable scaffold with clear data layout and per-step resources
  • Define a validation plan with a tiny end-to-end test and resume/retry checks
  • Pin container/conda versions and validate environment compatibility

Example Use Cases

  • Engine recommendation selecting Nextflow for HPC with reproducible cache and resume
  • Prefect+Dask scaffold designed for local development or a small cluster
  • Prefect on Slurm to orchestrate tasks on an HPC scheduler
  • Nextflow on HPC workflow demonstrating resource-aware scheduling
  • Hybrid workflow blending Prefect+Dask orchestration with Nextflow HPC components
