
bio-assembly-qc

npx machina-cli add skill fmschulz/omics-skills/bio-assembly-qc --openclaw

Bio Assembly QC

Assemble genomes/metagenomes and produce assembly QC artifacts.

Instructions

  1. Select assembler based on read type and genome size.
  2. Run assembly with resource-aware settings.
  3. Run QUAST/MetaQUAST and summarize metrics.
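The three steps above can be sketched as command construction. This is a minimal sketch under stated assumptions, not the skill's actual implementation: the function names, the `long_reads` and `metagenome` parameters, and the choice of `--nano-raw` as the Flye read-type flag are hypothetical. Genome size is not modeled here. The options themselves (`spades.py -1/-2/-s/-o/-t/-m/--meta`, `flye --out-dir/--threads/--meta`, `quast.py`/`metaquast.py -o/-t`) are standard options of those tools.

```python
def choose_assembler(long_reads):
    """Pick an assembler from the read type (hypothetical helper)."""
    return "flye" if long_reads else "spades"


def assembly_cmd(assembler, reads, outdir, threads=8, mem_gb=32, metagenome=False):
    """Build the assembler command line as an argument list."""
    reads = [str(r) for r in reads]
    if assembler == "spades":
        if metagenome and len(reads) != 2:
            # Assumption: metaSPAdes is run on one paired-end library.
            raise ValueError("metagenome mode expects a read pair")
        if len(reads) == 2:
            cmd = ["spades.py", "-1", reads[0], "-2", reads[1]]
        elif len(reads) == 1:
            cmd = ["spades.py", "-s", reads[0]]
        else:
            raise ValueError("expected one single-end file or one read pair")
        # -t sets threads, -m caps memory in GB (resource-aware settings).
        cmd += ["-o", str(outdir), "-t", str(threads), "-m", str(mem_gb)]
    elif assembler == "flye":
        # Assumption: raw Nanopore reads; other data needs a different flag.
        cmd = ["flye", "--nano-raw", *reads,
               "--out-dir", str(outdir), "--threads", str(threads)]
    else:
        raise ValueError(f"unknown assembler: {assembler!r}")
    if metagenome:
        cmd.append("--meta")
    return cmd


def quast_cmd(contigs, outdir, threads=8, metagenome=False):
    """Build the QUAST (or MetaQUAST) command line."""
    tool = "metaquast.py" if metagenome else "quast.py"
    return [tool, str(contigs), "-o", str(outdir), "-t", str(threads)]
```

The returned lists can be passed to `subprocess.run(cmd, check=True)`. Note that SPAdes writes `contigs.fasta` into its output directory while Flye writes `assembly.fasta`, so a Flye run would need a copy or rename step to produce the `contigs.fasta` listed under Output.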

Quick Reference

Task | Action
Run workflow | Follow the steps in this skill and capture outputs.
Validate inputs | Confirm required inputs and reference data exist.
Review outputs | Inspect reports and QC gates before proceeding.
Tool docs | See docs/README.md.
References | See ../bio-skills-references.md

Input Requirements

Prerequisites:

  • Tools available in the active environment (Pixi/conda/system). See docs/README.md for expected tools.
  • Sufficient disk and RAM for the chosen assembler.

Inputs:

  • reads/*.fastq.gz (raw reads).
  • assembler choice (spades | flye).
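A minimal sketch of how the two inputs could be validated before anything runs. The function name `collect_inputs` and the error types are assumptions for illustration; only the `reads/*.fastq.gz` pattern and the two assembler names come from this skill.

```python
from pathlib import Path

ASSEMBLERS = ("spades", "flye")


def collect_inputs(reads_dir, assembler):
    """Return the sorted read files, or raise if the inputs are unusable."""
    if assembler not in ASSEMBLERS:
        raise ValueError(f"assembler must be one of {ASSEMBLERS}, got {assembler!r}")
    reads = sorted(Path(reads_dir).glob("*.fastq.gz"))
    if not reads:
        raise FileNotFoundError(f"no *.fastq.gz files found in {reads_dir}")
    return reads
```

Failing early here is cheaper than discovering a missing file after the assembler has already reserved memory and disk.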

Output

  • results/bio-assembly-qc/contigs.fasta
  • results/bio-assembly-qc/assembly_metrics.tsv
  • results/bio-assembly-qc/qc_report.html
  • results/bio-assembly-qc/logs/
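One way `assembly_metrics.tsv` could be derived is from the `report.tsv` file that QUAST writes next to its HTML report. The sketch below assumes a single-assembly run, where that file has two tab-separated columns (metric name, value). The `metric`/`value` header, the selection in `KEEP`, and both function names are hypothetical; the skill does not specify the layout of its metrics file.

```python
import csv
from pathlib import Path

# Hypothetical selection of QUAST metric names to carry over.
KEEP = ("# contigs", "Total length", "Largest contig", "N50", "GC (%)")


def parse_quast_report(report_tsv):
    """Read a two-column QUAST report.tsv into a dict of strings."""
    metrics = {}
    with open(report_tsv, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) >= 2:
                metrics[row[0]] = row[1]
    return metrics


def write_assembly_metrics(metrics, out_tsv, keep=KEEP):
    """Write the selected metrics as a two-column TSV with a header row."""
    out_tsv = Path(out_tsv)
    out_tsv.parent.mkdir(parents=True, exist_ok=True)
    with open(out_tsv, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["metric", "value"])
        for name in keep:
            if name in metrics:
                writer.writerow([name, metrics[name]])
```

Metrics missing from the QUAST report are skipped, not written as empty values, so downstream checks can tell "absent" from "zero".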

Quality Gates

  • Assembly size range and N50 distribution meet project thresholds.
  • On failure: retry with alternative parameters; if still failing, record in report and exit non-zero.
  • Verify reads are present and gzip-readable.
  • Check available disk space before assembly.
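The two pre-flight gates (gzip-readable reads and free disk space) can be sketched with the Python standard library. The function names, the 1 MiB probe size, and the idea of passing a required byte count are assumptions; how much space an assembly needs depends on the assembler and dataset and is not estimated here.

```python
import gzip
import shutil
import zlib


def is_gzip_readable(path, probe_bytes=1024 * 1024):
    """Return True if the start of the file decompresses and contains data."""
    try:
        with gzip.open(path, "rb") as handle:
            data = handle.read(probe_bytes)
    except (OSError, EOFError, zlib.error):
        # Missing file, not a gzip file, truncated stream, or corrupt data.
        return False
    return len(data) > 0


def has_free_space(path, required_bytes):
    """Check free bytes on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free >= required_bytes
```

The probe only decompresses the first chunk, which keeps the check fast on large read sets. A full integrity check would need to read the whole file, for example with `gzip -t`.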

Examples

Example 1: Expected input layout

reads/*.fastq.gz (raw reads).
assembler choice (spades | flye).

Troubleshooting

Issue: Missing inputs or reference databases
Solution: Verify paths and permissions before running the workflow.

Issue: Low-quality results or failed QC gates
Solution: Review reports, adjust parameters, and re-run the affected step.
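The re-run policy (try alternative parameters, then record the failure and exit non-zero) can be sketched as a small retry loop. Everything named here is hypothetical: `attempt` stands for any callable that runs one assembly plus its QC gates and returns True on success, and the report is treated as a plain text file.

```python
def run_with_retries(param_sets, attempt):
    """Try each parameter set in order until attempt(params) returns True.

    Returns (winning_params_or_None, list_of_failures).
    """
    failures = []
    for params in param_sets:
        try:
            if attempt(params):
                return params, failures
            failures.append((params, "QC gate not met"))
        except Exception as exc:  # e.g. the assembler crashed
            failures.append((params, f"error: {exc}"))
    return None, failures


def finish(winner, failures, report_path):
    """Append failures to the report and return the process exit code."""
    with open(report_path, "a") as report:
        for params, reason in failures:
            report.write(f"FAILED {params}: {reason}\n")
    return 0 if winner is not None else 1
```

A caller would end with `sys.exit(finish(winner, failures, report_path))`, so that a pipeline wrapping this skill sees a non-zero status when no parameter set passes.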

Source

https://github.com/fmschulz/omics-skills/blob/main/skills/bio-assembly-qc/SKILL.md

Overview

This skill guides assembling genomes or metagenomes and producing QC artifacts. It emphasizes selecting an assembler based on read type and genome size, running with resource-aware settings, and generating QC metrics via QUAST/MetaQUAST.

How This Skill Works

Verify inputs and choose an appropriate assembler: SPAdes for short reads, or Flye for long reads (its --meta mode handles long-read metagenomes). Run the assembler with resource-aware settings, then run QUAST or MetaQUAST to produce a metrics summary and a QC report (qc_report.html) alongside the contigs and logs.

When to Use It

  • Starting a genome or metagenome assembly and needing QC artifacts
  • Choosing between SPAdes (short reads) and Flye (long reads/metagenomes)
  • Verifying inputs and ensuring enough disk space and RAM before assembly
  • Generating QUAST/MetaQUAST reports to gate quality
  • Re-running with adjusted parameters after QC gates fail

Quick Start

  1. Validate inputs (reads/*.fastq.gz) and available tools (Pixi/conda/system).
  2. Choose SPAdes for short reads or Flye for long reads/metagenomes; confirm that sufficient disk and RAM are available.
  3. Run the assembler, then run QUAST/MetaQUAST and collect qc_report.html and metrics.

Best Practices

  • Validate inputs exist and are gzip-readable
  • Select assembler based on read type and genome size (SPAdes for short reads, Flye for long reads/metagenomes)
  • Ensure sufficient disk and RAM; monitor resource usage during assembly
  • Run QUAST/MetaQUAST and review metrics (size, N50, contig counts) before proceeding
  • Check QC gates and retry with parameter tweaks if needed
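The metrics named above (size, N50, contig count) are reported by QUAST, but a gate can also be evaluated directly from `contigs.fasta`. The following is a minimal sketch: the function names and the threshold parameters are hypothetical, and the actual thresholds are project-specific.

```python
def contig_lengths(fasta_path):
    """Return the sequence length of every record in a FASTA file."""
    lengths = []
    current = 0
    seen_header = False
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if seen_header:
                    lengths.append(current)
                seen_header = True
                current = 0
            elif seen_header:
                current += len(line)
    if seen_header:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Largest L such that contigs of length >= L cover half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


def passes_gates(lengths, min_size, max_size, min_n50):
    """Check total assembly size and N50 against project thresholds."""
    return min_size <= sum(lengths) <= max_size and n50(lengths) >= min_n50
```

For contigs of length 10, 20, 30 and 40, the total is 100 and the two longest contigs already cover 70 of it, so `n50` returns 30.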

Example Use Cases

  • Assembling a bacterial genome from short reads with SPAdes and generating qc_report.html
  • Metagenome assembly from mixed reads using Flye and validating with MetaQUAST
  • Comparing assembly metrics across parameter sets and recording results in reports
  • Validating that reads are present and gzipped before launch
  • Troubleshooting QC failures by adjusting assembler parameters and re-running
