bio-phylo-modern-tree-inference

Install this agent skill to your Project

npx add-skill https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-phylo-modern-tree-inference

SKILL.md

name: bio-phylo-modern-tree-inference
description: Build maximum likelihood phylogenetic trees using IQ-TREE2 and RAxML-ng. Use when inferring publication-quality trees with model selection, ultrafast bootstrap, or partitioned analyses from sequence alignments.
tool_type: cli
primary_tool: IQ-TREE2
measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes.
allowed-tools:
- read_file
- run_shell_command

Modern ML Tree Inference

Build maximum likelihood phylogenetic trees with automatic model selection and ultrafast bootstrap.

IQ-TREE2 Basic Usage

bash

# Simple ML tree with automatic model selection
iqtree2 -s alignment.fasta -m MFP -B 1000 -T AUTO

# -s: input alignment
# -m MFP: ModelFinder Plus (automatic model selection + tree inference)
# -B 1000: 1000 ultrafast bootstrap replicates (the minimum IQ-TREE accepts for -B)
# -T AUTO: automatic thread detection

IQ-TREE2 Output Files

File	Description
`.treefile`	Best ML tree in Newick format
`.iqtree`	Full analysis report with model parameters
`.log`	Run log
`.contree`	Consensus tree with bootstrap support
`.splits.nex`	Bootstrap splits in Nexus format
`.model.gz`	ModelFinder checkpoint (results for the tested models)
`.bionj`	Initial BIONJ tree
`.mldist`	ML distance matrix
`.ckp.gz`	Checkpoint file for resuming
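
After a run it can help to confirm that the core files from the table above were actually written. The following is a minimal sketch, assuming the default output prefix (the alignment file name); `missing_outputs` is a hypothetical helper name, and only the three files produced by every run are checked, since `.contree` and `.splits.nex` appear only when `-B` is used.

```shell
# Print the path of each core IQ-TREE output file that is missing or empty.
# Assumption: outputs are named <prefix>.<suffix>, as in the table above.
missing_outputs() {
    prefix=$1
    for suffix in treefile iqtree log; do
        [ -s "$prefix.$suffix" ] || echo "$prefix.$suffix"
    done
}

# Usage: prints nothing when all three files exist and are non-empty.
missing_outputs alignment.fasta
```

An empty result means the run wrote its tree, report, and log; any printed path points at a file worth investigating in the log.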

Model Selection

bash

# ModelFinder only (no tree inference)
iqtree2 -s alignment.fasta -m MF

# Use specific model
iqtree2 -s alignment.fasta -m GTR+G4 -B 1000

# Test only specific models
iqtree2 -s alignment.fasta -m MF -mset GTR,HKY,K2P

# Protein models
iqtree2 -s protein.fasta -m MFP -B 1000 -st AA

Common DNA Substitution Models

Model	Parameters	Use Case
JC	Equal rates	Very simple, rarely appropriate
K2P/K80	Ti/Tv ratio	Simple; assumes equal base frequencies
HKY	Ti/Tv + base freq	Moderate complexity
GTR	6 rates + base freq	Most general, recommended default
+G4	Gamma rate variation	4 discrete rate categories
+I	Invariant sites	Allows a proportion of sites to be invariable
+R4	FreeRate model	More flexible than Gamma

Ultrafast Bootstrap

bash

# Standard ultrafast bootstrap (UFBoot2)
# -B must be at least 1000 (IQ-TREE rejects smaller values); a larger number such as 10000 can be used for final analyses.
iqtree2 -s alignment.fasta -m GTR+G4 -B 1000

# Standard nonparametric bootstrap (much slower; its values are more conservative than UFBoot and are not read on the same scale)
iqtree2 -s alignment.fasta -m GTR+G4 -b 100

# SH-aLRT test (fast approximate likelihood ratio test)
iqtree2 -s alignment.fasta -m GTR+G4 -alrt 1000

# Both UFBoot and SH-aLRT
iqtree2 -s alignment.fasta -m GTR+G4 -B 1000 -alrt 1000

Interpreting Bootstrap Values

UFBoot	SH-aLRT	Interpretation
>= 95	>= 80	Strong support
80-94	70-79	Moderate support
< 80	< 70	Weak support
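
The thresholds in the table above can be applied mechanically to a tree file. The following is a minimal sketch, assuming the tree was produced with both `-B` and `-alrt`, in which case IQ-TREE labels internal nodes as `SH-aLRT/UFBoot`. The names `support_labels` and `classify_support` are hypothetical helpers, and treating a node as weak when either value falls below its weak cutoff is an assumption, since the table does not say how to handle mixed cases.

```shell
# Print each internal-node label of the form SH-aLRT/UFBoot from a Newick string.
support_labels() {
    printf '%s\n' "$1" | grep -o ')[0-9.][0-9.]*/[0-9.][0-9.]*:' | tr -d '):'
}

# Classify one node. $1 = UFBoot (%), $2 = SH-aLRT (%).
# Assumption: "strong" needs both cutoffs; "weak" if either value is below
# its weak cutoff; everything else is "moderate".
classify_support() {
    awk -v u="$1" -v a="$2" 'BEGIN {
        if (u >= 95 && a >= 80)    print "strong"
        else if (u < 80 || a < 70) print "weak"
        else                       print "moderate"
    }'
}

# Toy tree for illustration only (not real results).
tree='((A:0.1,B:0.2)95.2/100:0.05,(C:0.1,D:0.1)60/70:0.02,E:0.3);'
support_labels "$tree" | while IFS=/ read -r alrt ufboot; do
    echo "$alrt/$ufboot $(classify_support "$ufboot" "$alrt")"
done
```

To use it on a real result, pass the contents of the tree file instead of the toy string, for example `support_labels "$(cat alignment.fasta.treefile)"`.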

Partitioned Analysis

For multi-gene datasets with different evolutionary rates:

bash

# Create partition file (partitions.nex)
cat > partitions.nex << 'EOF'
#nexus
begin sets;
    charset gene1 = 1-500;
    charset gene2 = 501-1200;
    charset gene3 = 1201-1800;
    charpartition mine = HKY:gene1, GTR:gene2, GTR+G:gene3;
end;
EOF

# Edge-proportional partition model (shared branch lengths, each partition with its own rate scaling)
iqtree2 -s concat.fasta -p partitions.nex -m MFP -B 1000

# Edge-equal partition model (all partitions share identical branch lengths)
iqtree2 -s concat.fasta -q partitions.nex -m MFP -B 1000

# Edge-unlinked (independent branch lengths per partition)
iqtree2 -s concat.fasta -Q partitions.nex -m MFP -B 1000
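
Charset coordinates in a concatenated alignment are easy to get wrong by hand. The following is a minimal sketch that derives them from per-gene alignment lengths, assuming the genes were concatenated in the order given. `make_charsets` is a hypothetical helper name; it writes only the `charset` lines, so a `charpartition` line with per-gene models would still need to be added by hand if wanted.

```shell
# Emit a Nexus sets block from name:length pairs, in concatenation order.
make_charsets() {
    start=1
    echo "#nexus"
    echo "begin sets;"
    for spec in "$@"; do
        name=${spec%%:*}            # text before the colon
        len=${spec##*:}             # text after the colon
        end=$((start + len - 1))
        echo "    charset $name = $start-$end;"
        start=$((end + 1))
    done
    echo "end;"
}

# Reproduces the coordinates used in the partition file above.
make_charsets gene1:500 gene2:700 gene3:600
```

Redirecting the output to a file (for example `> partitions.nex`) gives a partition file with the same three charsets as the hand-written one.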

RAxML-ng Basic Usage

bash

# Simple ML tree with GTR+G
raxml-ng --all --msa alignment.fasta --model GTR+G --bs-trees 100

# --all: ML search + bootstrapping
# --msa: input alignment
# --model: substitution model
# --bs-trees: number of bootstrap replicates

RAxML-ng Model Specification

bash

# DNA models
raxml-ng --msa alignment.fasta --model GTR+G4+I

# Protein models (the data type is inferred from the model name)
raxml-ng --msa protein.fasta --model LG+G8+F

# Check the alignment for problems (sanity checks only; this does not select a model)
raxml-ng --check --msa alignment.fasta --model GTR+G

RAxML-ng Output Files

File	Description
`.raxml.bestTree`	Best ML tree
`.raxml.support`	Tree with bootstrap support values
`.raxml.bootstraps`	All bootstrap trees
`.raxml.mlTrees`	All ML trees from search
`.raxml.log`	Analysis log
`.raxml.rba`	Binary alignment (for restart)

RAxML-ng Advanced Options

bash

# Multiple ML searches from different starting trees (improves the chance of finding the best-scoring tree)
# --tree pars{10}: 10 starting parsimony trees recommended for thorough search
raxml-ng --msa alignment.fasta --model GTR+G --tree pars{10} --prefix ml_search

# Constrained tree search
raxml-ng --msa alignment.fasta --model GTR+G --tree-constraint constraint.tre

# Site likelihoods for topology tests
raxml-ng --sitelh --msa alignment.fasta --model GTR+G --tree candidate.tre

Comparing IQ-TREE2 vs RAxML-ng

Feature	IQ-TREE2	RAxML-ng
Model selection	Built-in ModelFinder	External (ModelTest-NG)
Ultrafast bootstrap	Yes (UFBoot2)	No
Standard bootstrap	Yes	Yes
Partition models	Extensive	Good
Speed	Faster for UFBoot	Faster for standard BS
Memory	Lower	Higher
Checkpointing	Yes	Yes

Large Dataset Strategies

bash

# IQ-TREE2 with a cap on RAM usage (-mem)
iqtree2 -s large.fasta -m GTR+G -B 1000 -T 4 -mem 8G

# Fast search mode (resembles FastTree); -B is omitted because ultrafast bootstrap is not expected to work together with -fast
iqtree2 -s large.fasta -m GTR+G -fast

# RAxML-ng with parsimony starting trees
raxml-ng --msa large.fasta --model GTR+G --tree pars{5} --threads 8

Tree Topology Tests

bash

# IQ-TREE2: AU test comparing trees
iqtree2 -s alignment.fasta -m GTR+G -z trees.nwk -n 0 -zb 10000 -au

# Output interpretation:
# p-AU < 0.05: Reject tree
# p-AU >= 0.05: Cannot reject tree

Constrained Analysis

bash

# IQ-TREE2: Enforce monophyly constraint
iqtree2 -s alignment.fasta -m GTR+G -g constraint.tre -B 1000

# Constraint file format (Newick with taxa to constrain):
# ((Human,Chimp),Gorilla);

Complete Workflow Example

bash

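# Each step writes to its own --prefix so a finished checkpoint from one step does not block the next.

# 1. Check that the alignment parses (-n 0 skips the tree search)
iqtree2 -s alignment.fasta -m GTR+G -n 0 --prefix check

# 2. Find best model
iqtree2 -s alignment.fasta -m MF -T AUTO --prefix modelfinder

# 3. Full analysis with the model chosen in step 2 (GTR+I+G4 is shown as an example)
iqtree2 -s alignment.fasta -m GTR+I+G4 -B 1000 -alrt 1000 -T AUTO --prefix final

# 4. Inspect the resulting Newick tree
cat final.treefile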
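
The model chosen by ModelFinder can be read from its report instead of being copied by hand. The following is a minimal sketch, assuming the `.iqtree` report contains a line of the form `Best-fit model according to BIC: <model>`; `best_model` is a hypothetical helper name, and the stand-in report written below is illustrative, not real output.

```shell
# Print the model named on the first "Best-fit model according to ..." line.
best_model() {
    sed -n 's/^Best-fit model according to [A-Za-z]*: *//p' "$1" | head -n 1
}

# Demonstration on a stand-in report file (assumed line format).
report=$(mktemp)
printf 'Best-fit model according to BIC: GTR+F+I+G4\n' > "$report"
model=$(best_model "$report")

# Show the command that the full analysis step would run with that model.
echo "iqtree2 -s alignment.fasta -m $model -B 1000 -alrt 1000 -T AUTO"
rm -f "$report"
```

In a real run the argument would be the `.iqtree` file written by the ModelFinder step, and it is worth checking that `$model` is non-empty before launching the full analysis.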

Resuming Interrupted Runs

bash

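# IQ-TREE2: rerun the same command; it resumes from the .ckp.gz checkpoint automatically
iqtree2 -s alignment.fasta -m GTR+G -B 1000
# (-redo discards the checkpoint and starts over; --redo-tree keeps ModelFinder results but repeats the tree search)

# RAxML-ng: rerun the same command; it resumes from its checkpoint automatically
raxml-ng --msa alignment.fasta --model GTR+G
# (--redo overwrites existing results and starts from scratch)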

Reproducibility

bash

# Set random seed for reproducible results
# seed=12345: a fixed seed makes runs repeatable, provided the software version and thread count also stay the same
iqtree2 -s alignment.fasta -m GTR+G -B 1000 --seed 12345

raxml-ng --msa alignment.fasta --model GTR+G --seed 12345 --bs-trees 100

Related Skills

tree-io - Read and convert output tree files
tree-visualization - Visualize trees with bootstrap support
distance-calculations - Compare with distance-based methods
alignment/alignment-io - Prepare alignments for tree inference

Maintainer

FreedomIntelligence Core maintainer

Source details

Full Name: FreedomIntelligence/OpenClaw-Medical-Skills
Branch: main
Path in repo: skills/bio-phylo-modern-tree-inference
Topics: claude-code skills openclaw awesome clawhub openclaw-skills medical nanoclaw
