Agent skill

bio-sam-bam-basics

Stars 2,009
Forks 275

Install this agent skill to your Project

npx add-skill https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-sam-bam-basics

SKILL.md


name: bio-sam-bam-basics description: View, convert, and understand SAM/BAM/CRAM alignment files using samtools and pysam. Use when inspecting alignments, converting between formats, or understanding alignment file structure. tool_type: cli primary_tool: samtools measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes. allowed-tools:

  • read_file
  • run_shell_command

SAM/BAM/CRAM Basics

View and convert alignment files using samtools and pysam.

Format Overview

Format Description Use Case
SAM Text format, human-readable Debugging, small files
BAM Binary compressed SAM Standard storage format
CRAM Reference-based compression Long-term archival, smaller than BAM

SAM Format Structure

@HD VN:1.6 SO:coordinate
@SQ SN:chr1 LN:248956422
@RG ID:sample1 SM:sample1
@PG ID:bwa PN:bwa VN:0.7.17
read1  0   chr1  100  60  50M  *  0  0  ACGT...  FFFF...  NM:i:0

Header lines start with @:

  • @HD - Header metadata (version, sort order)
  • @SQ - Reference sequence dictionary
  • @RG - Read group information
  • @PG - Program used to create file

Alignment fields (tab-separated):

  1. QNAME - Read name
  2. FLAG - Bitwise flag
  3. RNAME - Reference name
  4. POS - 1-based position
  5. MAPQ - Mapping quality
  6. CIGAR - Alignment description
  7. RNEXT - Mate reference
  8. PNEXT - Mate position
  9. TLEN - Template length
  10. SEQ - Read sequence
  11. QUAL - Base qualities
  12. Optional tags (NM:i:0, MD:Z:50, etc.)

samtools view

View BAM as SAM

bash
samtools view input.bam | head

View with Header

bash
samtools view -h input.bam | head -100

View Header Only

bash
samtools view -H input.bam

View Specific Region

bash
samtools view input.bam chr1:1000-2000

Count Alignments

bash
samtools view -c input.bam

Format Conversion

BAM to SAM

bash
samtools view -h -o output.sam input.bam

SAM to BAM

bash
samtools view -b -o output.bam input.sam

BAM to CRAM

bash
samtools view -C -T reference.fa -o output.cram input.bam

CRAM to BAM

bash
samtools view -b -T reference.fa -o output.bam input.cram

Pipe Conversion

bash
samtools view -b input.sam > output.bam

Common Flags

Flag Decimal Meaning
0x1 1 Paired
0x2 2 Proper pair
0x4 4 Unmapped
0x8 8 Mate unmapped
0x10 16 Reverse strand
0x20 32 Mate reverse strand
0x40 64 First in pair
0x80 128 Second in pair
0x100 256 Secondary alignment
0x200 512 Failed QC
0x400 1024 PCR duplicate
0x800 2048 Supplementary

Decode Flags

bash
samtools flags 147
# 0x93 147 PAIRED,PROPER_PAIR,REVERSE,READ2

CIGAR Operations

Op Description
M Alignment match (can be mismatch)
I Insertion to reference
D Deletion from reference
N Skipped region (introns in RNA-seq)
S Soft clipping
H Hard clipping
= Sequence match
X Sequence mismatch

Example: 50M2I30M = 50 bases match, 2 base insertion, 30 bases match

pysam Python Alternative

Open and Iterate

python
import pysam

with pysam.AlignmentFile('input.bam', 'rb') as bam:
    for read in bam:
        print(f'{read.query_name}\t{read.reference_name}:{read.reference_start}')

Access Header

python
with pysam.AlignmentFile('input.bam', 'rb') as bam:
    for sq in bam.header['SQ']:
        print(f'{sq["SN"]}: {sq["LN"]} bp')

Read Alignment Properties

python
with pysam.AlignmentFile('input.bam', 'rb') as bam:
    for read in bam:
        print(f'Name: {read.query_name}')
        print(f'Flag: {read.flag}')
        print(f'Chrom: {read.reference_name}')
        print(f'Pos: {read.reference_start}')  # 0-based
        print(f'MAPQ: {read.mapping_quality}')
        print(f'CIGAR: {read.cigarstring}')
        print(f'Seq: {read.query_sequence}')
        print(f'Qual: {read.query_qualities}')
        break

Check Flag Properties

python
with pysam.AlignmentFile('input.bam', 'rb') as bam:
    for read in bam:
        if read.is_paired and read.is_proper_pair:
            if read.is_reverse:
                strand = '-'
            else:
                strand = '+'
            print(f'{read.query_name} on {strand} strand')

Fetch Region

python
with pysam.AlignmentFile('input.bam', 'rb') as bam:
    for read in bam.fetch('chr1', 1000, 2000):
        print(read.query_name)

Convert BAM to SAM

python
with pysam.AlignmentFile('input.bam', 'rb') as infile:
    with pysam.AlignmentFile('output.sam', 'w', header=infile.header) as outfile:
        for read in infile:
            outfile.write(read)

Convert to CRAM

python
with pysam.AlignmentFile('input.bam', 'rb') as infile:
    with pysam.AlignmentFile('output.cram', 'wc', reference_filename='reference.fa', header=infile.header) as outfile:
        for read in infile:
            outfile.write(read)

Quick Reference

Task samtools pysam
View BAM samtools view file.bam AlignmentFile('file.bam', 'rb')
View header samtools view -H file.bam bam.header
Count reads samtools view -c file.bam sum(1 for _ in bam)
Get region samtools view file.bam chr1:1-1000 bam.fetch('chr1', 0, 1000)
BAM to SAM samtools view -h -o out.sam in.bam Open with 'w' mode
SAM to BAM samtools view -b -o out.bam in.sam Open with 'wb' mode
BAM to CRAM samtools view -C -T ref.fa -o out.cram in.bam Open with 'wc' mode

Related Skills

  • alignment-indexing - Create indices for random access (required for fetch/region queries)
  • alignment-sorting - Sort alignments by coordinate or name
  • alignment-filtering - Filter alignments by flags, quality, regions
  • bam-statistics - Generate statistics from alignment files
  • sequence-io/read-sequences - Parse FASTA/FASTQ input files

Expand your agent's capabilities with these related and highly-rated skills.

FreedomIntelligence/OpenClaw-Medical-Skills

vcf-annotator

Annotate VCF variants with VEP, ClinVar, gnomAD frequencies, and ancestry-aware context. Generates prioritised variant reports.

2,009 275
Explore
FreedomIntelligence/OpenClaw-Medical-Skills

chemist-analyst

Analyzes events through chemistry lens using molecular structure, reaction mechanisms, thermodynamics, kinetics, and analytical techniques (spectroscopy, chromatography, mass spectrometry). Provides insights on chemical processes, material properties, reaction pathways, synthesis, and analytical methods. Use when: Chemical reactions, material analysis, synthesis planning, process optimization, environmental chemistry. Evaluates: Molecular structure, reaction mechanisms, yield, selectivity, safety, environmental impact.

2,009 275
Explore
FreedomIntelligence/OpenClaw-Medical-Skills

bio-alignment-io

Read, write, and convert multiple sequence alignment files using Biopython Bio.AlignIO. Supports Clustal, PHYLIP, Stockholm, FASTA, Nexus, and other alignment formats for phylogenetics and conservation analysis. Use when reading, writing, or converting alignment file formats.

2,009 275
Explore
FreedomIntelligence/OpenClaw-Medical-Skills

sleep-analyzer

分析睡眠数据、识别睡眠模式、评估睡眠质量,并提供个性化睡眠改善建议。支持与其他健康数据的关联分析。

2,009 275
Explore
FreedomIntelligence/OpenClaw-Medical-Skills

metabolomics-workbench-database

Access NIH Metabolomics Workbench via REST API (4,200+ studies). Query metabolites, RefMet nomenclature, MS/NMR data, m/z searches, study metadata, for metabolomics and biomarker discovery.

2,009 275
Explore
FreedomIntelligence/OpenClaw-Medical-Skills

bio-hi-c-analysis-matrix-operations

Balance, normalize, and transform Hi-C contact matrices using cooler and cooltools. Apply iterative correction (ICE), compute expected values, and generate observed/expected matrices. Use when normalizing or transforming Hi-C matrices.

2,009 275
Explore

Didn't find tool you were looking for?

Be as detailed as possible for better results