bio-longread-alignment
Align long reads using minimap2 for Oxford Nanopore and PacBio data. Supports various presets for different read types and applications. Use when aligning ONT or PacBio reads to a reference genome for variant calling, SV detection, or coverage analysis.
Version Compatibility
Reference examples tested with: minimap2 2.26+, samtools 1.19+
Before using code patterns, verify installed versions match. If versions differ:
- CLI: <tool> --version, then <tool> --help to confirm flags
If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
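The version check described above could be scripted roughly as follows. This is a minimal sketch under stated assumptions: version_ge is a hypothetical helper, it relies on a sort that supports -V (GNU coreutils), and it assumes minimap2 --version prints a string such as 2.26-r1175 and that the first line of samtools --version reads like "samtools 1.19".

```shell
# version_ge INSTALLED MINIMUM: succeeds when INSTALLED >= MINIMUM.
# Hypothetical helper; relies on `sort -V` (version sort) being available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v minimap2 >/dev/null 2>&1; then
    mm2=$(minimap2 --version | cut -d- -f1)    # assumed form: "2.26-r1175" -> "2.26"
    version_ge "$mm2" 2.26 || echo "minimap2 $mm2 is older than 2.26; confirm flags with --help" >&2
fi

if command -v samtools >/dev/null 2>&1; then
    st=$(samtools --version | awk 'NR == 1 { print $2 }')    # assumed first line: "samtools 1.19"
    version_ge "$st" 1.19 || echo "samtools $st is older than 1.19; confirm flags with --help" >&2
fi
```

If either tool is missing, the corresponding check is skipped silently; a stricter script might treat that as an error instead.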
Long-Read Alignment with minimap2
"Align my long reads to the reference" → Map ONT or PacBio reads using minimap2 with technology-specific presets for optimal sensitivity and accuracy.
- CLI: minimap2 -ax map-ont ref.fa reads.fq | samtools sort -o aligned.bam (ONT), minimap2 -ax map-hifi (PacBio HiFi)
Oxford Nanopore Alignment
# Basic ONT alignment
minimap2 -ax map-ont reference.fa reads.fastq.gz | \
samtools sort -o aligned.bam
samtools index aligned.bam
PacBio HiFi Alignment
# PacBio HiFi reads (high accuracy)
minimap2 -ax map-hifi reference.fa reads.fastq.gz | \
samtools sort -o aligned.bam
samtools index aligned.bam
PacBio CLR Alignment
# PacBio CLR (continuous long reads, lower accuracy)
minimap2 -ax map-pb reference.fa reads.fastq.gz | \
samtools sort -o aligned.bam
samtools index aligned.bam
Pre-Build Index for Multiple Runs
# Build index once
minimap2 -d reference.mmi reference.fa
# Use index for alignment
minimap2 -ax map-ont reference.mmi reads.fastq.gz | samtools sort -o aligned.bam
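One caveat when reusing an index: the .mmi file stores the indexing parameters (k-mer size and minimizer window), and these cannot be changed at mapping time. An index built with default settings therefore does not match a preset that uses different values, such as map-hifi. A minimal sketch of keeping one index per preset follows; index_path and build_index are hypothetical helpers, and the <reference>.<preset>.mmi naming scheme is an assumption for this example, not a minimap2 convention.

```shell
# index_path REF PRESET: derive a preset-specific index name
# (hypothetical naming scheme: strip the last extension, append PRESET.mmi).
index_path() {
    printf '%s.%s.mmi\n' "${1%.*}" "$2"
}

# build_index REF PRESET: build the index with the preset's k/w values,
# unless a non-empty index file already exists.
build_index() {
    idx=$(index_path "$1" "$2")
    [ -s "$idx" ] || minimap2 -x "$2" -d "$idx" "$1"
}

# Usage (requires minimap2 and samtools on PATH):
#   build_index reference.fa map-hifi
#   minimap2 -ax map-hifi "$(index_path reference.fa map-hifi)" reads.fastq.gz | \
#       samtools sort -o aligned.bam
```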
Common Options
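# A comment after a line-continuation backslash breaks the command,
# so the options are described here instead:
# -t 8            alignment threads
# -R              read group header line
# --secondary=no  suppress secondary alignments
# --MD            emit the MD tag (needed by some variant callers)
# -Y              soft-clip supplementary alignments instead of hard-clipping
minimap2 -ax map-ont \
-t 8 \
-R '@RG\tID:sample\tSM:sample' \
--secondary=no \
--MD \
-Y \
reference.fa reads.fastq.gz | \
samtools sort -@ 4 -o aligned.bam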
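The read group string passed to -R can be assembled per sample rather than typed by hand. A minimal sketch, assuming the sample name doubles as the read group ID and adding a PL (platform) field, for which the SAM specification lists ONT and PACBIO among the accepted values; rg_line is a hypothetical helper.

```shell
# rg_line SAMPLE PLATFORM: print an @RG header line with literal "\t" escapes,
# the form shown for -R in the minimap2 documentation. Hypothetical helper.
rg_line() {
    printf '@RG\\tID:%s\\tSM:%s\\tPL:%s\n' "$1" "$1" "$2"
}

# Usage (requires minimap2 and samtools on PATH):
#   minimap2 -ax map-ont -t 8 -R "$(rg_line sample01 ONT)" reference.fa reads.fastq.gz | \
#       samtools sort -@ 4 -o aligned.bam
```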
Splice-Aware Alignment (RNA)
# For direct RNA or cDNA sequencing
minimap2 -ax splice reference.fa reads.fastq.gz | \
samtools sort -o aligned.bam
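The plain splice preset is a reasonable default, but the minimap2 README suggests slightly different options depending on the RNA read type: -uf -k14 for noisy Nanopore direct RNA reads, and splice:hq with -uf for PacBio Iso-Seq or other high-quality cDNA. A minimal sketch of selecting between them; splice_args and the read-type labels (ont-cdna, ont-drna, pb-isoseq) are hypothetical names for this example, and the flags should be confirmed against the installed version's --help.

```shell
# splice_args READ_TYPE: print minimap2 options for a long RNA read type.
# Option sets follow the examples in the minimap2 README; the read-type
# labels are hypothetical.
splice_args() {
    case "$1" in
        ont-cdna)  echo "-ax splice" ;;            # reads may come from either strand
        ont-drna)  echo "-ax splice -uf -k14" ;;   # transcript strand only, smaller k for noisy reads
        pb-isoseq) echo "-ax splice:hq -uf" ;;     # high-quality full-length transcripts
        *)         echo "unknown read type: $1" >&2; return 1 ;;
    esac
}

# Usage (requires minimap2 and samtools on PATH). The expansion is left
# unquoted on purpose so the options split into separate words:
#   minimap2 $(splice_args ont-drna) reference.fa reads.fastq.gz | \
#       samtools sort -o aligned.bam
```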
With Junction BED (Known Splice Sites)
# Provide known splice junctions
minimap2 -ax splice --junc-bed junctions.bed \
reference.fa reads.fastq.gz | samtools sort -o aligned.bam
Assembly to Reference Alignment
# Assembly with ~0.1% divergence
minimap2 -ax asm5 reference.fa assembly.fa > aligned.sam
# Assembly with higher divergence (~5%)
minimap2 -ax asm20 reference.fa assembly.fa > aligned.sam
Output PAF (Faster, No BAM)
# PAF format (faster, for quick analysis)
minimap2 -x map-ont reference.fa reads.fastq.gz > alignments.paf
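PAF is tab-delimited with twelve mandatory columns, which makes quick summaries possible without samtools. Column 10 holds the number of matching bases and column 11 the alignment block length, so their ratio gives a rough per-alignment identity. A minimal sketch; paf_identity is a hypothetical helper.

```shell
# paf_identity: read PAF on stdin and print query name, target name, and
# matches / alignment block length (columns 10 and 11) for each record.
paf_identity() {
    awk 'BEGIN { FS = "\t" } $11 > 0 { printf "%s\t%s\t%.4f\n", $1, $6, $10 / $11 }'
}

# Usage:
#   paf_identity < alignments.paf | head
```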
Keep Secondary and Supplementary
# Keep all alignments (for SV calling); -N caps secondary alignments per read
minimap2 -ax map-ont \
--secondary=yes \
-N 5 \
reference.fa reads.fastq.gz | samtools sort -o aligned.bam
Filter Alignments
# During alignment pipeline: keep alignments with mapping quality >= 10
minimap2 -ax map-ont reference.fa reads.fastq.gz | \
samtools view -b -q 10 | \
samtools sort -o aligned.bam
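Mapping quality alone does not separate primary records from secondary or supplementary ones; SAM flag bits do. A minimal sketch of combining both filters, using flag values from the SAM specification (4 unmapped, 256 secondary, 2048 supplementary). The variable names are illustrative only.

```shell
# Flag bits from the SAM specification.
UNMAPPED=4            # 0x4
SECONDARY=256         # 0x100
SUPPLEMENTARY=2048    # 0x800

# Primary mapped alignments only: exclude all three classes.
EXCLUDE_PRIMARY_ONLY=$((UNMAPPED | SECONDARY | SUPPLEMENTARY))

# Keep supplementary (split) alignments, e.g. for SV calling:
# exclude only unmapped and secondary records.
EXCLUDE_KEEP_SUPP=$((UNMAPPED | SECONDARY))

# Usage (requires minimap2 and samtools on PATH):
#   minimap2 -ax map-ont reference.fa reads.fastq.gz | \
#       samtools view -b -q 10 -F "$EXCLUDE_PRIMARY_ONLY" | \
#       samtools sort -o aligned.bam
```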
Multiple FASTQ Files
# Pass several read files in one command
minimap2 -ax map-ont reference.fa reads1.fastq.gz reads2.fastq.gz | \
samtools sort -o aligned.bam
# Or read the file names from a list (one per line, no spaces in names)
xargs minimap2 -ax map-ont reference.fa < file_list.txt | \
samtools sort -o aligned.bam
Output Statistics
# Get alignment statistics
samtools flagstat aligned.bam
# Detailed stats
samtools stats aligned.bam | grep ^SN
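Individual metrics can be pulled out of the summary-number (SN) section for logging or QC thresholds. A minimal sketch, assuming the usual tab-separated layout of SN lines (the literal SN, a key ending in a colon, then the value); sn_value is a hypothetical helper.

```shell
# sn_value KEY: read `samtools stats` output on stdin and print the value of
# one SN line, e.g. "reads mapped" or "error rate".
sn_value() {
    awk -F '\t' -v key="$1:" '$1 == "SN" && $2 == key { print $3 }'
}

# Usage (requires samtools on PATH):
#   samtools stats aligned.bam > aligned.stats
#   sn_value "reads mapped" < aligned.stats
#   sn_value "error rate"   < aligned.stats
```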
Convert PAF to BED
# Convert PAF records to BED6: target name, start, end, query name, MAPQ, strand
awk 'BEGIN { OFS = "\t" } { print $6, $8, $9, $1, $12, $5 }' alignments.paf > alignments.bed
Key Presets
| Preset | Description | Best For |
|---|---|---|
| map-ont | ONT reads | Nanopore genomic |
| map-hifi | PacBio HiFi | PacBio genomic |
| map-pb | PacBio CLR | PacBio CLR |
| splice | Long RNA reads | cDNA, direct RNA |
| asm5 | Low divergence | Same species assembly |
| asm20 | High divergence | Cross-species assembly |
| sr | Short reads | Illumina (basic) |
Key Parameters
| Parameter | Default | Description |
|---|---|---|
| -t | 3 | CPU threads |
| -k | 15 | K-mer size |
| -w | 10 | Minimizer window |
| -a | off | Output SAM |
| -x | none | Preset |
| --secondary | yes | Output secondary |
| -N | 5 | Max secondary alignments |
| --MD | off | Generate MD tag |
| -R | none | Read group header |
| -Y | off | Soft clipping for supplementary |
Output Formats
| Format | Flag | Description |
|---|---|---|
| PAF | (default) | Pairwise Alignment Format |
| SAM | -a | Sequence Alignment Map |
| BAM | -a, piped to samtools | Binary SAM |
Related Skills
- medaka-polishing - Polish consensus with medaka
- structural-variants - Call SVs from alignments
- alignment-files/sam-bam-basics - BAM manipulation