Agent skill
bio-phylo-tree-manipulation
Install this agent skill to your Project
npx add-skill https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-phylo-tree-manipulation
SKILL.md
name: bio-phylo-tree-manipulation
description: Modify phylogenetic tree structure using Biopython Bio.Phylo. Use when rooting trees with outgroups or midpoint, pruning taxa, collapsing clades, ladderizing branches, or extracting subtrees.
tool_type: python
primary_tool: Bio.Phylo
measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes.
allowed-tools:
- read_file
- run_shell_command
Tree Manipulation
Modify phylogenetic tree structure: rooting, pruning, ladderizing, and subtree extraction.
Required Import
from Bio import Phylo
from io import StringIO
Rooting Trees
Root with Outgroup
tree = Phylo.read('tree.nwk', 'newick')
# Root with single taxon
tree.root_with_outgroup({'name': 'Outgroup'})
# Root with multiple taxa (must be monophyletic)
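# is_monophyletic() needs clade objects rather than dict queries, so resolve the names first
outgroup = [tree.find_any(name=n) for n in ('TaxonA', 'TaxonB')]
if all(t is not None for t in outgroup) and tree.is_monophyletic(outgroup):
    tree.root_with_outgroup(*outgroup)
else:
    print('Outgroup is missing or not monophyletic')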
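To try outgroup rooting without a file on disk, a minimal sketch along the following lines may help. The Newick string and the `resolve` helper are made up for illustration and are not part of Bio.Phylo; the tree is parsed from a string through `StringIO`.

```python
from io import StringIO
from Bio import Phylo

# Toy unrooted tree (hypothetical data): the root has three children
newick = '((A:0.1,B:0.2):0.3,(C:0.4,D:0.5):0.6,Out:1.0);'
tree = Phylo.read(StringIO(newick), 'newick')

def resolve(tree, names):
    """Hypothetical helper: map taxon names to terminal clades, failing loudly."""
    by_name = {t.name: t for t in tree.get_terminals()}
    missing = [n for n in names if n not in by_name]
    if missing:
        raise KeyError(f'taxa not in tree: {missing}')
    return [by_name[n] for n in names]

children_before = len(tree.root.clades)
outgroup = resolve(tree, ['Out'])
if tree.is_monophyletic(outgroup):
    tree.root_with_outgroup(*outgroup)
children_after = len(tree.root.clades)
print(f'Root children before: {children_before}, after: {children_after}')
```

Resolving names up front means a misspelled taxon raises an error instead of silently skipping the rooting step.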
Root at Midpoint
tree = Phylo.read('tree.nwk', 'newick')
tree.root_at_midpoint()
Check Rooting Status
# is_bifurcating() tests the whole tree and tolerates a trifurcating root,
# so on its own it does not show whether the tree is rooted
print(f'Is bifurcating: {tree.is_bifurcating()}')
# Count children of root
root = tree.root
print(f'Root has {len(root.clades)} children')
# By convention: 2 children = rooted, 3+ children = unrooted
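The root-children check can be combined with midpoint rooting in a small self-contained sketch. The three-taxon tree below is hypothetical; after midpoint rooting, the two most distant tips should sit at equal depth on either side of the root.

```python
from io import StringIO
from Bio import Phylo

# Toy unrooted tree (hypothetical data) with a trifurcating root
tree = Phylo.read(StringIO('(A:1.0,B:1.0,C:4.0);'), 'newick')

children_before = len(tree.root.clades)
tree.root_at_midpoint()
children_after = len(tree.root.clades)

# Depth of every tip measured from the new root
tip_depths = {c.name: d for c, d in tree.depths().items() if c.is_terminal()}
height = max(tip_depths.values())
print(f'Root children: {children_before} -> {children_after}')
print(f'Tip depths: {tip_depths}')
```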
Ladderizing
Sort clades for consistent visual presentation.
tree = Phylo.read('tree.nwk', 'newick')
# Larger clades at bottom
tree.ladderize()
# Larger clades at top
tree.ladderize(reverse=True)
Phylo.write(tree, 'ladderized.nwk', 'newick')
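The effect of ladderizing is easiest to see in the order of the tips. The sketch below uses a hypothetical four-taxon tree and a small `tip_order` helper that is not part of Bio.Phylo.

```python
from io import StringIO
from Bio import Phylo

# Toy tree (hypothetical data), parsed from a string
tree = Phylo.read(StringIO('(((A,B),C),D);'), 'newick')

def tip_order(t):
    """Hypothetical helper: tip names in drawing (preorder) order."""
    return [term.name for term in t.get_terminals()]

original = tip_order(tree)
tree.ladderize()                # smaller clades first, larger clades last
small_first = tip_order(tree)
tree.ladderize(reverse=True)    # larger clades first
large_first = tip_order(tree)
print(original, small_first, large_first)
```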
Pruning Trees
Remove Specific Taxa
tree = Phylo.read('tree.nwk', 'newick')
# Find and remove a taxon
target = tree.find_any(name='TaxonToRemove')
if target:
    tree.prune(target)
# Remove multiple taxa
for name in ['TaxonA', 'TaxonB', 'TaxonC']:
    target = tree.find_any(name=name)
    if target:
        tree.prune(target)
Keep Only Specified Taxa
tree = Phylo.read('tree.nwk', 'newick')
keep_taxa = {'Human', 'Chimp', 'Gorilla'}
terminals = tree.get_terminals()
for term in terminals:
    if term.name not in keep_taxa:
        tree.prune(term)
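The keep-list approach can be wrapped in a reusable function. The sketch below is one possible shape; `prune_to` is a hypothetical helper name, and the tree is toy data. It checks that every requested taxon exists before removing anything.

```python
from io import StringIO
from Bio import Phylo

def prune_to(tree, keep):
    """Hypothetical helper: prune the tree in place down to the named taxa."""
    keep = set(keep)
    present = {t.name for t in tree.get_terminals()}
    unknown = keep - present
    if unknown:
        raise ValueError(f'taxa not in tree: {sorted(unknown)}')
    # get_terminals() returns a list, so pruning inside the loop is safe
    for term in tree.get_terminals():
        if term.name not in keep:
            tree.prune(term)
    return tree

# Toy tree (hypothetical data)
newick = '(((Human:0.1,Chimp:0.1):0.05,Gorilla:0.2):0.1,(Mouse:0.3,Rat:0.3):0.4);'
tree = Phylo.read(StringIO(newick), 'newick')
prune_to(tree, {'Human', 'Chimp', 'Gorilla'})
remaining = sorted(t.name for t in tree.get_terminals())
print(remaining)
```

Note that pruning changes the tree in place; copy the tree first (for example with `copy.deepcopy`) if the original is still needed.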
Collapsing Clades
Collapse internal nodes into polytomies, either one clade at a time or for every clade below a branch-length or support threshold.
tree = Phylo.read('tree.nwk', 'newick')
# Collapse single clade
target = tree.find_any(name='SomeInternalNode')
if target:
    tree.collapse(target)
# Collapse all internal clades with very short branches, including zero-length ones
tree.collapse_all(lambda c: c.branch_length is not None and c.branch_length < 0.01)
# Collapse all poorly-supported nodes (assumes support values on a 0-100 scale)
tree.collapse_all(lambda c: c.confidence is not None and c.confidence < 70)
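A small worked case may make the threshold collapse concrete. In the sketch below the tree, the `is_short` predicate, and the 0.01 threshold are all illustrative assumptions; the near-zero internal branch is removed and its children are reattached to the parent, so the number of tips stays the same.

```python
from io import StringIO
from Bio import Phylo

# Toy tree (hypothetical data): the (A,B) clade sits on a near-zero branch
newick = '((A:0.1,B:0.1):0.001,(C:0.1,D:0.1):0.5,E:0.3);'
tree = Phylo.read(StringIO(newick), 'newick')

def is_short(clade, threshold=0.01):
    """Hypothetical predicate: True for clades with a branch below the threshold."""
    return clade.branch_length is not None and clade.branch_length < threshold

tips_before = tree.count_terminals()
tree.collapse_all(is_short)
tips_after = tree.count_terminals()
root_children = len(tree.root.clades)

# The collapsed branch length is added to the children that were reattached
a_length = tree.find_any(name='A').branch_length
print(tips_before, tips_after, root_children, a_length)
```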
Extracting Subtrees
Get Clade as Subtree
tree = Phylo.read('tree.nwk', 'newick')
# Find common ancestor of taxa
clade = tree.common_ancestor({'name': 'Human'}, {'name': 'Chimp'})
# The clade itself can be treated as a subtree
Phylo.draw_ascii(clade)
# Get all terminals in this clade
subtree_taxa = [t.name for t in clade.get_terminals()]
print(f'Subtree contains: {subtree_taxa}')
Extract Subtree by Common Ancestor
tree = Phylo.read('tree.nwk', 'newick')
# Find MRCA (Most Recent Common Ancestor)
taxa = [{'name': 'Human'}, {'name': 'Chimp'}, {'name': 'Gorilla'}]
mrca = tree.common_ancestor(*taxa)
print(f'MRCA branch length: {mrca.branch_length}')
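To turn the MRCA clade into a standalone tree that can be written to a file, one option is the `from_clade` class method, which to my knowledge copies the clade rather than sharing it with the original tree. The sketch below uses toy data and writes to an in-memory buffer instead of a file.

```python
from io import StringIO
from Bio import Phylo

# Toy tree (hypothetical data)
newick = '(((Human:0.1,Chimp:0.1):0.05,Gorilla:0.2):0.1,Mouse:0.7);'
tree = Phylo.read(StringIO(newick), 'newick')

mrca = tree.common_ancestor({'name': 'Human'}, {'name': 'Chimp'})

# Build an independent tree from the clade; the original tree is left untouched
subtree = tree.from_clade(mrca)
subtree_taxa = sorted(t.name for t in subtree.get_terminals())

# Serialize to Newick in memory (use a filename instead to write to disk)
buf = StringIO()
Phylo.write(subtree, buf, 'newick')
subtree_newick = buf.getvalue().strip()
print(subtree_taxa, subtree_newick)
```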
Tree Traversal
tree = Phylo.read('tree.nwk', 'newick')
# Iterate all clades (preorder by default)
for clade in tree.find_clades():
    print(clade.name, clade.branch_length)
# Level-order traversal (breadth-first)
for clade in tree.find_clades(order='level'):
    print(clade.name)
# Postorder traversal
for clade in tree.find_clades(order='postorder'):
    print(clade.name)
# Only terminal nodes
for term in tree.get_terminals():
    print(term.name)
# Only internal nodes
for internal in tree.get_nonterminals():
    print(internal)
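The three traversal orders are easier to compare side by side on a tiny tree. In this sketch the tree and its internal node labels (X and Y) are made up; unnamed clades such as the root are skipped when collecting names.

```python
from io import StringIO
from Bio import Phylo

# Toy tree (hypothetical data) with labelled internal nodes X and Y
tree = Phylo.read(StringIO('((A,B)X,(C,D)Y);'), 'newick')

def names(order):
    """Hypothetical helper: clade names in the given traversal order."""
    return [c.name for c in tree.find_clades(order=order) if c.name]

preorder = names('preorder')    # parent before children
level = names('level')          # breadth-first, one depth at a time
postorder = names('postorder')  # children before parent
print(preorder, level, postorder, sep='\n')
```

Postorder is the natural choice when a value for a parent depends on values already computed for its children.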
Finding Clades
tree = Phylo.read('tree.nwk', 'newick')
# Find by name
clade = tree.find_any(name='Human')
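# Find all clades matching a condition (pass a function as the target;
# keyword filters do not take callables)
matches = tree.find_clades(lambda c: c.branch_length is not None and c.branch_length > 0.5)
for m in matches:
    print(f'{m.name}: {m.branch_length}')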
# Find by terminal status
terminals = list(tree.find_clades(terminal=True))
internals = list(tree.find_clades(terminal=False))
Getting Path Between Nodes
tree = Phylo.read('tree.nwk', 'newick')
# Path from root to a node
target = tree.find_any(name='Human')
path = tree.get_path(target)
print(f'Path from root to Human: {len(path)} nodes')
for clade in path:
    print(f'  {clade.name}: {clade.branch_length}')
# Trace path between any two nodes
human = tree.find_any(name='Human')
mouse = tree.find_any(name='Mouse')
trace = tree.trace(human, mouse)
print(f'Path Human to Mouse: {len(trace)} nodes')
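Paths list the clades along a route; when the quantity of interest is the summed branch length between two tips, `distance()` can be used alongside them. The sketch below uses a hypothetical three-taxon tree.

```python
from io import StringIO
from Bio import Phylo

# Toy tree (hypothetical data)
tree = Phylo.read(StringIO('((Human:0.1,Chimp:0.2):0.3,Mouse:0.9);'), 'newick')

human = tree.find_any(name='Human')
mouse = tree.find_any(name='Mouse')

# Clades from the root down to Human (the root itself is not included)
path = tree.get_path(human)

# Clades visited going from Human to Mouse (start excluded, finish included)
trace = tree.trace(human, mouse)

# Sum of branch lengths separating the two tips
patristic = tree.distance(human, mouse)
print(len(path), len(trace), round(patristic, 3))
```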
Checking Tree Properties
tree = Phylo.read('tree.nwk', 'newick')
# Check if monophyletic
taxa = [tree.find_any(name='Human'), tree.find_any(name='Chimp')]
taxa = [t for t in taxa if t is not None]
print(f'Is monophyletic: {tree.is_monophyletic(taxa)}')
# Check if bifurcating
print(f'Is bifurcating: {tree.is_bifurcating()}')
# Check if preterminal (parent of only terminals)
for clade in tree.get_nonterminals():
    print(f'{clade}: is_preterminal={clade.is_preterminal()}')
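Because `is_monophyletic()` returns either the matching clade or `False`, wrapping it in a name-based check that yields a plain boolean can be convenient. The `forms_clade` helper and the tree below are illustrative assumptions.

```python
from io import StringIO
from Bio import Phylo

# Toy tree (hypothetical data)
tree = Phylo.read(StringIO('((Human,Chimp),(Mouse,Rat));'), 'newick')

def forms_clade(tree, names):
    """Hypothetical helper: True if exactly these taxa make up one clade."""
    by_name = {t.name: t for t in tree.get_terminals()}
    missing = [n for n in names if n not in by_name]
    if missing:
        raise KeyError(f'taxa not in tree: {missing}')
    return bool(tree.is_monophyletic([by_name[n] for n in names]))

apes = forms_clade(tree, ['Human', 'Chimp'])
mixed = forms_clade(tree, ['Human', 'Mouse'])
print(f'Human+Chimp: {apes}, Human+Mouse: {mixed}')
```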
Modifying Branch Lengths
tree = Phylo.read('tree.nwk', 'newick')
# Set missing branch lengths
for clade in tree.find_clades():
    if clade.branch_length is None:
        clade.branch_length = 0.0
# Scale all branch lengths
scale_factor = 100  # Convert to percent divergence
for clade in tree.find_clades():
    if clade.branch_length:
        clade.branch_length *= scale_factor
# Remove branch lengths (convert to cladogram)
for clade in tree.find_clades():
    clade.branch_length = None
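Since these loops modify the tree in place, a scaled copy is sometimes preferable. The sketch below shows one way to do that with `copy.deepcopy`, checking the result with `total_branch_length()`; the `scaled_copy` helper and the tree are hypothetical.

```python
import copy
from io import StringIO
from Bio import Phylo

def scaled_copy(tree, factor):
    """Hypothetical helper: return a copy of the tree with branch lengths scaled."""
    new_tree = copy.deepcopy(tree)
    for clade in new_tree.find_clades():
        if clade.branch_length is not None:
            clade.branch_length *= factor
    return new_tree

# Toy tree (hypothetical data)
tree = Phylo.read(StringIO('((A:0.1,B:0.2):0.3,C:0.4);'), 'newick')
percent_tree = scaled_copy(tree, 100)

original_total = tree.total_branch_length()
scaled_total = percent_tree.total_branch_length()
print(f'{original_total:.3f} -> {scaled_total:.3f}')
```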
Renaming Taxa
tree = Phylo.read('tree.nwk', 'newick')
# Rename individual taxon
target = tree.find_any(name='OldName')
if target:
    target.name = 'NewName'
# Batch rename from mapping
name_map = {'Hsap': 'Human', 'Ptro': 'Chimp', 'Mmus': 'Mouse'}
for term in tree.get_terminals():
    if term.name in name_map:
        term.name = name_map[term.name]
Phylo.write(tree, 'renamed.nwk', 'newick')
Counting Nodes
tree = Phylo.read('tree.nwk', 'newick')
n_terminals = len(tree.get_terminals())
n_internals = len(tree.get_nonterminals())
n_total = tree.count_terminals() + len(tree.get_nonterminals())
print(f'Terminals: {n_terminals}')
print(f'Internal nodes: {n_internals}')
print(f'Total nodes: {n_total}')
Tree Depths
tree = Phylo.read('tree.nwk', 'newick')
# Get depths from root
depths = tree.depths()
for clade, depth in depths.items():
    if clade.is_terminal():
        print(f'{clade.name}: depth={depth:.3f}')
# Get maximum depth (tree height)
max_depth = max(depths.values())
print(f'Tree height: {max_depth:.3f}')
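Depths can also be counted in edges rather than branch-length units by passing `unit_branch_lengths=True`, which is useful for trees without branch lengths. The sketch below compares both on a hypothetical tree.

```python
from io import StringIO
from Bio import Phylo

# Toy tree (hypothetical data)
tree = Phylo.read(StringIO('((A:0.1,B:0.2):0.3,C:0.5);'), 'newick')

# Depth as summed branch length from the root
by_length = {c.name: d for c, d in tree.depths().items() if c.is_terminal()}

# Depth as number of edges from the root
by_edges = {c.name: d
            for c, d in tree.depths(unit_branch_lengths=True).items()
            if c.is_terminal()}

height = max(by_length.values())
print(by_length, by_edges, height)
```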
Splitting Clades
tree = Phylo.read('tree.nwk', 'newick')
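# Split a terminal into multiple children
target = tree.find_any(name='TaxonA')
if target and target.is_terminal():
    target.split(n=2, branch_length=0.05)  # Creates 2 children
# split() gives every child the same branch_length; assign distinct lengths afterwards
other = tree.find_any(name='TaxonB')
if other and other.is_terminal():
    other.split(n=3)  # Creates 3 children
    for child, length in zip(other.clades, [0.1, 0.2, 0.3]):
        child.branch_length = length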
Generating Random Trees
from Bio.Phylo.BaseTree import Tree
# Generate random bifurcating tree
taxa = ['Human', 'Chimp', 'Gorilla', 'Mouse', 'Rat']
random_tree = Tree.randomized(taxa)
Phylo.draw_ascii(random_tree)
# With an explicit branch length applied to each new branch
random_tree = Tree.randomized(taxa, branch_length=1.0)
Quick Reference: Tree Methods
| Method | Description |
|---|---|
| root_with_outgroup() | Reroot using outgroup |
| root_at_midpoint() | Reroot at midpoint |
| ladderize() | Sort branches by size |
| prune() | Remove a clade |
| collapse() | Collapse a clade into polytomy |
| collapse_all() | Collapse all matching clades |
| split() | Split clade into children |
| trace() | Get path between two clades |
| Tree.randomized() | Generate random tree |
| common_ancestor() | Find MRCA of taxa |
| find_any() | Find first matching clade |
| find_clades() | Find all matching clades |
| get_path() | Get path from root to clade |
| depths() | Get depth of all clades |
| is_monophyletic() | Check if taxa form clade |
| is_bifurcating() | Check if tree is binary |
Related Skills
- tree-io - Read and write tree files
- tree-visualization - Draw modified trees
- distance-calculations - Build trees from alignments