Agent skill
bio-methylation-calling
Extract methylation calls from Bismark BAM files using bismark_methylation_extractor. Generates per-cytosine reports for CpG, CHG, and CHH contexts. Use when extracting methylation levels from aligned bisulfite sequencing data for downstream analysis.
Version Compatibility
Reference examples tested with: pandas 2.2+
Before using code patterns, verify installed versions match. If versions differ:
- Python: pip show <package>, then help(module.function) to check signatures
- CLI: <tool> --version, then <tool> --help to confirm flags
If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
Methylation Calling
"Extract methylation calls from my Bismark BAM" → Generate per-cytosine methylation reports (CpG, CHG, CHH contexts) from aligned bisulfite sequencing data.
- CLI: bismark_methylation_extractor --bedGraph --cytosine_report --genome_folder /path/to/genome/ sample.bam
Basic Extraction
# Extract methylation calls from Bismark BAM
bismark_methylation_extractor --gzip --bedGraph \
sample_bismark_bt2.bam
Paired-End Extraction
bismark_methylation_extractor --paired-end --gzip --bedGraph \
sample_bismark_bt2_pe.bam
Common Options
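# A comment placed after a line-continuation backslash breaks the command,
# so the options are described here instead:
# --paired-end: paired-end data; --gzip: compress output; --bedGraph: generate bedGraph file
# --cytosine_report: genome-wide cytosine report (requires --genome_folder)
# --buffer_size: memory buffer for sorting; --parallel: parallel extraction
bismark_methylation_extractor \
  --paired-end \
  --gzip \
  --bedGraph \
  --cytosine_report \
  --genome_folder /path/to/genome/ \
  --buffer_size 10G \
  --parallel 4 \
  -o output_dir/ \
  sample.bam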
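When the extractor is launched from a script, building the argument list in Python avoids shell-continuation pitfalls. The following is a minimal sketch, not part of Bismark: build_extractor_command and its keyword arguments are hypothetical names, and it only uses the flags listed above.

```python
import shlex


def build_extractor_command(bam, paired_end=False, genome_folder=None,
                            output_dir=None, parallel=None):
    """Assemble an argument list for bismark_methylation_extractor.

    Hypothetical helper for illustration; it only combines flags
    that appear in the Common Options example.
    """
    cmd = ["bismark_methylation_extractor", "--gzip", "--bedGraph"]
    if paired_end:
        cmd.append("--paired-end")
    if genome_folder is not None:
        # --cytosine_report needs the genome folder, so the two are added together
        cmd += ["--cytosine_report", "--genome_folder", genome_folder]
    if parallel is not None:
        cmd += ["--parallel", str(parallel)]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    cmd.append(bam)
    return cmd


cmd = build_extractor_command("sample.bam", paired_end=True,
                              genome_folder="/path/to/genome/",
                              parallel=4, output_dir="output_dir/")
# Show the command as a single shell-safe string for inspection
print(shlex.join(cmd))
```

To execute it, pass the list to subprocess.run(cmd, check=True) once Bismark is on the PATH.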
CpG Context Only
# Most common: by default the bedGraph and coverage output cover CpG context only
# --no_overlap avoids double counting where paired reads overlap
# Add --CX only if the bedGraph/coverage output should also include CHG/CHH
bismark_methylation_extractor \
  --paired-end \
  --no_overlap \
  --gzip \
  --bedGraph \
  sample.bam
Genome-Wide Cytosine Report
# Comprehensive report with all CpGs in genome
bismark_methylation_extractor \
--paired-end \
--gzip \
--bedGraph \
--cytosine_report \
--genome_folder /path/to/genome/ \
sample.bam
Strand-Specific Output
# Default: strand-specific output
# CpG_OT_sample.txt - Original Top strand
# CpG_OB_sample.txt - Original Bottom strand
# CpG_CTOT_sample.txt - Complementary to OT
# CpG_CTOB_sample.txt - Complementary to OB
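# Merge the four strand-specific files into one file per context
bismark_methylation_extractor --comprehensive --gzip sample.bam
# --merge_non_CpG is a different option: it pools CHG and CHH into a single non-CpG context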
Avoid Double-Counting Overlapping Reads
# For paired-end data with overlapping reads:
# --no_overlap ignores the overlapping portion of read 2
bismark_methylation_extractor \
  --paired-end \
  --no_overlap \
  --gzip \
  sample_pe.bam
Generate Coverage File
# bismark2bedGraph creates coverage file
bismark_methylation_extractor --bedGraph --gzip sample.bam
# Or run separately
bismark2bedGraph -o sample CpG_context_sample.txt.gz
# Coverage format: chr start end methylation_percentage count_meth count_unmeth
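The coverage file and the bedGraph use different coordinate conventions: according to the Bismark documentation, coverage positions are 1-based, while the bedGraph has a 0-based start and a 1-based end. A minimal sketch of the conversion for one line follows; cov_to_bedgraph is a hypothetical helper name and the input line is made up for illustration.

```python
def cov_to_bedgraph(line):
    """Convert one tab-separated coverage line to a bedGraph line.

    Assumes the six-column coverage layout described above; only the
    start coordinate changes (1-based to 0-based).
    """
    chrom, start, end, pct = line.rstrip("\n").split("\t")[:4]
    return f"{chrom}\t{int(start) - 1}\t{end}\t{pct}"


# Made-up example line: one cytosine at position 100 with 5 methylated, 5 unmethylated reads
example = "chr1\t100\t100\t50.0\t5\t5\n"
print(cov_to_bedgraph(example))
```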
Convert to BigWig for Visualization
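# bedGraph to BigWig (requires UCSC tools)
# bedGraphToBigWig needs an uncompressed, sorted bedGraph without a track line
zcat sample.bedGraph.gz | grep -v '^track' | LC_ALL=C sort -k1,1 -k2,2n > sample.sorted.bedGraph
bedGraphToBigWig sample.sorted.bedGraph chrom.sizes sample.bw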
M-Bias Plot
# Check for methylation bias across read positions
# --mbias_only: only generate the M-bias plot
bismark_methylation_extractor --paired-end \
  --mbias_only \
  sample.bam
# Generates sample.M-bias.txt and sample.M-bias_R1.png, sample.M-bias_R2.png
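The M-bias text report can also be screened programmatically to pick candidate --ignore values. The sketch below is an illustration under stated assumptions: it assumes each section starts with a title such as "CpG context (R1)", followed by a row of "=" characters, a header line beginning with "position", and tab-separated rows of position, methylated count, and unmethylated count. The function names, the 5-point tolerance, and the example numbers are hypothetical; check the layout of your own M-bias.txt before relying on it.

```python
from statistics import median


def parse_mbias(text):
    """Return {section title: [(position, percent methylation), ...]}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"="}:
            continue  # skip blank lines and the ===== underline
        if "context" in line and line.endswith(")"):
            current = line  # e.g. "CpG context (R1)"
            sections[current] = []
            continue
        if line.startswith("position") or current is None:
            continue  # skip the column header
        fields = line.split("\t")
        pos, meth, unmeth = int(fields[0]), int(fields[1]), int(fields[2])
        total = meth + unmeth
        if total:
            sections[current].append((pos, 100.0 * meth / total))
    return sections


def biased_positions(rows, tolerance=5.0):
    """Positions whose methylation differs from the section median by more than tolerance points."""
    if not rows:
        return []
    mid = median(pct for _, pct in rows)
    return [pos for pos, pct in rows if abs(pct - mid) > tolerance]


# Made-up report fragment for illustration only
example = (
    "CpG context (R1)\n"
    "================\n"
    "position\tcount methylated\tcount unmethylated\t% methylation\tcoverage\n"
    "1\t20\t80\t20.00\t100\n"
    "2\t70\t30\t70.00\t100\n"
    "3\t72\t28\t72.00\t100\n"
    "4\t71\t29\t71.00\t100\n"
)
flagged = biased_positions(parse_mbias(example)["CpG context (R1)"])
print(flagged)
```

Flagged positions at the start or end of a read are the ones the --ignore family of options is meant to address; the plots remain the primary reference.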
Ignore End Bias
# Ignore positions with systematic bias (found from M-bias plot)
# --ignore N / --ignore_r2 N: skip the first N bp of read 1 / read 2
# --ignore_3prime N / --ignore_3prime_r2 N: skip the last N bp of read 1 / read 2
bismark_methylation_extractor \
  --paired-end \
  --ignore 2 \
  --ignore_r2 2 \
  --ignore_3prime 2 \
  --ignore_3prime_r2 2 \
  sample.bam
Output Files
# Main output files:
# CpG_context_sample.txt.gz - Per-read CpG methylation (with --comprehensive; strand-specific CpG_OT_/CpG_OB_ files otherwise)
# sample.bismark.cov.gz - Coverage file
# sample.bedGraph.gz - bedGraph for visualization
# sample.CpG_report.txt.gz - Genome-wide CpG report (with --cytosine_report)
# Coverage file format:
# chr start end methylation% count_methylated count_unmethylated
Parse Output in Python
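import pandas as pd

# Bismark coverage files have no header row, so column names are supplied here
cov = pd.read_csv('sample.bismark.cov.gz', sep='\t', header=None,
                  names=['chr', 'start', 'end', 'meth_pct', 'count_meth', 'count_unmeth'])
# Total read depth per cytosine
cov['coverage'] = cov['count_meth'] + cov['count_unmeth']
# Keep only sites covered by at least 10 reads
cov_filtered = cov[cov['coverage'] >= 10]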
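When summarising methylation over many sites, averaging meth_pct directly gives low-coverage sites the same weight as deep ones. A common alternative is a count-weighted level, sketched below with pandas. The weighted_methylation function and the three-row toy table are hypothetical and only illustrate the calculation.

```python
import io

import pandas as pd

cols = ['chr', 'start', 'end', 'meth_pct', 'count_meth', 'count_unmeth']
# Made-up coverage rows standing in for sample.bismark.cov.gz
toy = io.StringIO(
    "chr1\t100\t100\t50.0\t5\t5\n"
    "chr1\t200\t200\t100.0\t1\t0\n"
    "chr2\t50\t50\t25.0\t5\t15\n"
)
cov = pd.read_csv(toy, sep='\t', header=None, names=cols)
cov['coverage'] = cov['count_meth'] + cov['count_unmeth']


def weighted_methylation(df, min_coverage=1):
    """Percent methylation per chromosome, weighted by read counts."""
    kept = df[df['coverage'] >= min_coverage]
    totals = kept.groupby('chr')[['count_meth', 'coverage']].sum()
    return 100.0 * totals['count_meth'] / totals['coverage']


per_chr = weighted_methylation(cov)
print(per_chr)
```

Raising min_coverage (for example to 10, as in the filter above) drops shallow sites before the counts are summed.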
Key Parameters
| Parameter | Description |
|---|---|
| --paired-end | Paired-end mode |
| --gzip | Compress output |
| --bedGraph | Generate bedGraph |
| --cytosine_report | Full genome cytosine report |
| --genome_folder | Path to genome (for cytosine_report) |
| --CX | Report CHG/CHH contexts |
| --no_overlap | Avoid counting overlapping reads twice |
| --parallel | Parallel extraction threads |
| --mbias_only | Only M-bias analysis |
| --ignore N | Ignore first N bp of read 1 |
| --ignore_r2 N | Ignore first N bp of read 2 |
Output Formats
| Format | Description | Use Case |
|---|---|---|
| CpG_context | Per-read methylation calls | Detailed analysis |
| .bismark.cov | Per-CpG coverage summary | methylKit input |
| .bedGraph | Methylation track | Genome browser |
| .CpG_report | All genome CpGs | Comprehensive analysis |
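The .CpG_report file lists each strand's cytosine on its own row, with tab-separated columns: chromosome, position, strand, count methylated, count unmethylated, C-context, trinucleotide context. Because CpG methylation is usually symmetric, the two strands of one CpG are often pooled. The sketch below shows the idea in plain Python; merge_cpg_strands is a hypothetical helper and the rows are made up. Bismark's coverage2cytosine also offers a --merge_CpG option for this purpose, which is likely the better choice for real data.

```python
import csv
import io


def merge_cpg_strands(handle):
    """Pool + and - strand counts for each CpG dinucleotide.

    Returns {(chromosome, position of the + strand C): (methylated, unmethylated)}.
    """
    merged = {}
    for row in csv.reader(handle, delimiter="\t"):
        chrom, pos, strand = row[0], int(row[1]), row[2]
        meth, unmeth, context = int(row[3]), int(row[4]), row[5]
        if context != "CG":
            continue  # only CpG rows are symmetric across strands
        # The minus-strand C sits one base after the plus-strand C (1-based positions)
        key = (chrom, pos if strand == "+" else pos - 1)
        m, u = merged.get(key, (0, 0))
        merged[key] = (m + meth, u + unmeth)
    return merged


# Made-up report rows for illustration only
toy_report = io.StringIO(
    "chr1\t10\t+\t3\t1\tCG\tCGA\n"
    "chr1\t11\t-\t2\t2\tCG\tCGT\n"
    "chr1\t30\t+\t0\t4\tCG\tCGG\n"
)
merged = merge_cpg_strands(toy_report)
print(merged)
```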
Related Skills
- bismark-alignment - Generate input BAM files
- methylkit-analysis - Import coverage files to R
- dmr-detection - Find differentially methylated regions