bio-prefect-dask-nextflow

Designs and scaffolds bioinformatics pipelines using Prefect (Python) with Dask for local/distributed task execution and Nextflow for HPC scheduler-native execution. Use when an agent must choose between Prefect + Dask and Nextflow, generate runnable project skeletons, or adapt workflows for laptops, workstations, and HPC clusters (e.g., Slurm/PBS) with reproducibility, caching/resume, and resource-aware configuration.

Install this agent skill in your project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/bio-prefect-dask-nextflow

SKILL.md

Bio Prefect + Dask + Nextflow

This skill helps an agent design, scaffold, and harden bioinformatics pipelines across:

Local workstation/laptop (Prefect + Dask LocalCluster)
HPC clusters (Nextflow executors; or Prefect → Slurm worker patterns)
Hybrid patterns (Prefect orchestrates metadata/approvals/notifications; Nextflow runs heavy compute)
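
As a rough illustration of the hybrid pattern, the outer loop can be plain Python that assembles and launches one Nextflow command per batch. This is a sketch under assumptions, not part of the skill itself: `build_nextflow_command`, `run_batch`, the `main.nf` pipeline, and the `slurm` profile name are hypothetical, while `nextflow run`, `-profile`, `-work-dir`, and `-resume` are standard Nextflow CLI options. In a Prefect deployment a function like `run_batch` could be wrapped in a task; that wiring is left out so the example has no third-party dependencies.

```python
import shlex
import subprocess


def build_nextflow_command(pipeline, profile, work_dir, params=None, resume=True):
    """Assemble a `nextflow run` argument list (nothing is executed here)."""
    cmd = ["nextflow", "run", str(pipeline),
           "-profile", profile,
           "-work-dir", str(work_dir)]
    if resume:
        cmd.append("-resume")  # reuse cached task results from earlier runs
    # Pipeline parameters take a double dash, e.g. --input samples.csv
    for key, value in sorted((params or {}).items()):
        cmd += [f"--{key}", str(value)]
    return cmd


def run_batch(cmd, dry_run=True):
    """Return the shell-quoted command; launch it first unless dry_run is set."""
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return shlex.join(cmd)


# Hypothetical batch: the outer loop decides inputs, Nextflow does the compute.
cmd = build_nextflow_command("main.nf", "slurm", "/scratch/work",
                             params={"input": "samples.csv"})
print(run_batch(cmd))
```

Keeping command construction separate from execution lets the outer loop log or review exactly what will be submitted before any cluster time is spent.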

Use this skill when

The user mentions Prefect, Dask, Nextflow, HPC, Slurm, PBS, nf-core, or “bioinformatics pipeline”.
The user needs parallelism, distributed execution, retries, scheduling, reproducible runs, or “local prototype then scale”.

Outputs this skill should produce

When activated, the agent should return (and/or generate in a repo):

Engine choice: prefect+dask, nextflow, or hybrid, with rationale.
Runnable scaffold (files + commands) for the chosen engine.
Resource plan per step (cpus/mem/time) + I/O layout (scratch vs shared).
Validation plan: tiny test run + failure/retry + resume test.
Pitfalls & mitigations: what will likely break on HPC and why.
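
One possible way to record the per-step resource plan is a small data structure that can also be rendered into scheduler flags. The sketch below is illustrative only: the `StepResources` class and the step names are hypothetical, and it assumes a Slurm target, using the standard `sbatch` options `--job-name`, `--cpus-per-task`, `--mem`, and `--time`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepResources:
    """Resource request for one pipeline step (hypothetical structure)."""
    name: str
    cpus: int
    mem_gb: int
    time_minutes: int
    use_scratch: bool = False  # True if intermediates belong on node-local scratch

    def sbatch_flags(self):
        """Render the request as sbatch options (Slurm assumed)."""
        hours, minutes = divmod(self.time_minutes, 60)
        return [
            f"--job-name={self.name}",
            f"--cpus-per-task={self.cpus}",
            f"--mem={self.mem_gb}G",
            f"--time={hours:02d}:{minutes:02d}:00",
        ]


# Example plan with made-up steps and numbers; real values come from profiling.
PLAN = [
    StepResources("align", cpus=8, mem_gb=32, time_minutes=240, use_scratch=True),
    StepResources("call_variants", cpus=4, mem_gb=16, time_minutes=90),
]

for step in PLAN:
    print(step.name, " ".join(step.sbatch_flags()))
```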

2-minute decision

Use the decision matrix for nuance: decision-matrix.md

Default heuristics:

Choose Nextflow if the pipeline is mainly CLI tools over files, must run on HPC schedulers, and reproducibility/caching are top priorities.
Choose Prefect + Dask if the pipeline is mainly Python functions, needs dynamic branching, API/DB integration, or interactive development.
Choose Hybrid if Prefect should own the “outer loop” (metadata, batching, approvals, notifications) while Nextflow owns “inner loop” compute.
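
The heuristics above could be encoded roughly as follows. This is a simplified sketch, not the decision matrix itself: the `PipelineTraits` fields and the `choose_engine` function are hypothetical names, and the rule that a Nextflow-shaped pipeline with Prefect-style needs becomes `hybrid` is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass
class PipelineTraits:
    """Yes/no answers gathered during requirements intake (hypothetical fields)."""
    mostly_cli_over_files: bool = False
    needs_hpc_scheduler: bool = False
    mostly_python_functions: bool = False
    dynamic_branching: bool = False
    api_or_db_integration: bool = False
    needs_outer_loop: bool = False  # metadata, batching, approvals, notifications


def choose_engine(t):
    """Return 'nextflow', 'prefect+dask', 'hybrid', or 'undecided'."""
    nextflow_fit = t.mostly_cli_over_files and t.needs_hpc_scheduler
    prefect_fit = (t.mostly_python_functions or t.dynamic_branching
                   or t.api_or_db_integration)
    if nextflow_fit and (t.needs_outer_loop or prefect_fit):
        return "hybrid"
    if nextflow_fit:
        return "nextflow"
    if prefect_fit:
        return "prefect+dask"
    return "undecided"  # fall back to the full decision matrix


print(choose_engine(PipelineTraits(mostly_cli_over_files=True,
                                   needs_hpc_scheduler=True)))
```

A result of `undecided` is a signal to consult decision-matrix.md and state assumptions explicitly rather than to guess.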

Standard workflow (agent playbook)

1. Requirements intake
   - Scheduler type (Slurm/PBS/LSF/etc), queue/partition rules, walltime limits, node topology.
   - Container policy (Docker vs Singularity/Apptainer vs no containers) and module/conda availability.
   - Data location and throughput constraints (shared FS vs scratch, object storage).
   - Parallelism shape (many independent samples? big distributed arrays? long single jobs?).
2. Choose engine using decision-matrix.md; state assumptions.
3. Scaffold the project
   - Prefect path → prefect-dask.md and (if needed) prefect-hpc-slurm.md
   - Nextflow path → nextflow-hpc.md
4. Implement steps with replayable boundaries
   - Each step idempotent; deterministic output paths.
   - Pass around paths/URIs, not giant in-memory objects.
5. Add operational glue
   - Logging, retries/timeouts, resource hints, output manifest.
6. Validate locally
   - Tiny dataset run + forced failure + resume/retry test.
7. Scale to HPC
   - Confirm filesystem layout, job submission permissions, and environment bootstrap.
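
The "replayable boundaries" and "forced failure + resume" ideas from the playbook can be sketched in plain Python, independent of either engine. Everything named here (`output_path`, `run_step`, the `count` step, the `.partial` suffix) is a hypothetical illustration: outputs get a deterministic path derived from step, sample, and parameters, a finished output is reused on rerun, and the result is written atomically so a crash does not leave a half-written file.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def output_path(results_dir, step, sample_id, params):
    """Same step + sample + params always maps to the same file."""
    digest = hashlib.sha256(
        json.dumps(params, sort_keys=True).encode()).hexdigest()[:12]
    return Path(results_dir) / step / f"{sample_id}.{digest}.txt"


def run_step(results_dir, step, sample_id, params, compute):
    """Run `compute` only if the output is missing; return (path, was_executed)."""
    out = output_path(results_dir, step, sample_id, params)
    if out.exists():
        return out, False  # resume: reuse the finished output
    out.parent.mkdir(parents=True, exist_ok=True)
    tmp = out.parent / (out.name + ".partial")
    tmp.write_text(compute(sample_id, params))
    os.replace(tmp, out)  # atomic rename within the same directory
    return out, True


def failing(sample_id, params):
    raise RuntimeError("forced failure")


# Tiny validation loop: forced failure, successful retry, then a resume that skips work.
with tempfile.TemporaryDirectory() as results:
    try:
        run_step(results, "count", "s1", {"k": 21}, failing)
    except RuntimeError:
        print("failed as expected; no output was written")
    out, ran = run_step(results, "count", "s1", {"k": 21},
                        lambda s, p: f"{s}:{p['k']}")
    print("retry executed:", ran)
    out, ran = run_step(results, "count", "s1", {"k": 21}, failing)
    print("resume executed:", ran)
```

Because only the path is returned, downstream steps receive a small reference rather than the data itself, which matches the advice to pass around paths/URIs.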

Response template

Use this template in your final answer to the user:

markdown

# Pipeline plan: [name]

## Recommended engine
- Choice: [prefect+dask | nextflow | hybrid]
- Why: [3–6 bullet rationale]

## Project scaffold
- Files to create:
  - ...
- Commands to run:
  - ...

## Execution model
- Parallelism strategy:
- Resource plan (per step):
- Data layout (work/results/cache):

## Pitfalls & mitigations
- ...

## Validation checklist
- ...

Deep references (read as needed)

Engine comparison and “when to use what”: decision-matrix.md
Prefect + Dask local patterns: prefect-dask.md
Prefect on Slurm + Dask-on-HPC options: prefect-hpc-slurm.md
Nextflow on HPC (executors, modules, resume/cache): nextflow-hpc.md
Examples (Prefect-only, Nextflow-only, Hybrid): examples.md
Validation loop + common failure modes: validation-checklist.md

Maintainer

majiayu000 Core maintainer

Source details

Full Name: majiayu000/claude-skill-registry
Branch: main
Path in repo: skills/data/bio-prefect-dask-nextflow
License: MIT License
