Agent skill

bio-metagenomics-abundance

Species abundance estimation using Bracken with Kraken2 output. Redistributes reads from higher taxonomic levels to species for more accurate estimates. Use when accurate species-level abundances are needed from Kraken2 classification output.

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/abundance-estimation

SKILL.md

Abundance Estimation with Bracken

Basic Abundance Estimation

bash

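# Run Bracken on a Kraken2 report.
# -r: read length the Bracken database was built for (e.g. 100, 150, 200, 250, 300)
# -l: taxonomic level to estimate (S = species)
# Comments are kept off the continued lines: text after a trailing "\" breaks the command.
bracken -d /path/to/kraken2_db \
    -i kraken_report.txt \
    -o bracken_output.txt \
    -r 150 \
    -l S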

Full Workflow with Kraken2

bash

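# Step 1: Classify with Kraken2
# --output keeps per-read classifications off stdout; only the report feeds Bracken
kraken2 --db /path/to/kraken2_db \
    --threads 8 \
    --paired \
    --output sample_kraken_output.txt \
    --report sample_kraken_report.txt \
    reads_R1.fastq.gz reads_R2.fastq.gz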

# Step 2: Estimate abundances with Bracken
bracken -d /path/to/kraken2_db \
    -i sample_kraken_report.txt \
    -o sample_bracken_species.txt \
    -w sample_bracken_report.txt \
    -r 150 \
    -l S
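When the two steps are driven from a script, it can help to build both command lines in one place so the Kraken2 report name and the Bracken input name cannot drift apart. The sketch below assumes the flags shown in the workflow above; the function name `build_commands`, the `sample` prefix convention, and the `--output` file name are illustrative choices, not part of either tool.

```python
import shlex


def build_commands(db, r1, r2, sample, read_len=150, level="S", threads=8):
    """Return the Kraken2 and Bracken argument lists for one paired-end sample."""
    report = f"{sample}_kraken_report.txt"
    kraken2_cmd = [
        "kraken2", "--db", db,
        "--threads", str(threads),
        "--paired",
        # Assumed file name: keeps per-read classifications off stdout
        "--output", f"{sample}_kraken_output.txt",
        "--report", report,
        r1, r2,
    ]
    bracken_cmd = [
        "bracken", "-d", db,
        "-i", report,  # same report file that Kraken2 writes
        "-o", f"{sample}_bracken_species.txt",
        "-w", f"{sample}_bracken_report.txt",
        "-r", str(read_len),
        "-l", level,
    ]
    return kraken2_cmd, bracken_cmd


kraken2_cmd, bracken_cmd = build_commands(
    "/path/to/kraken2_db", "reads_R1.fastq.gz", "reads_R2.fastq.gz", "sample"
)
# Print shell-quoted commands for inspection before running anything
print(shlex.join(kraken2_cmd))
print(shlex.join(bracken_cmd))
```

To execute the steps, each list could be passed in order to `subprocess.run(cmd, check=True)`, so a Kraken2 failure stops the script before Bracken starts.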

Different Taxonomic Levels

bash

# Species level (default)
bracken -d db -i report.txt -o species.txt -r 150 -l S

# Genus level
bracken -d db -i report.txt -o genus.txt -r 150 -l G

# Family level
bracken -d db -i report.txt -o family.txt -r 150 -l F

# Phylum level
bracken -d db -i report.txt -o phylum.txt -r 150 -l P

Build Bracken Database

bash

# Build Bracken database for specific read lengths
# Run AFTER building Kraken2 database
bracken-build -d /path/to/kraken2_db -t 8 -l 150

# Build for multiple read lengths
bracken-build -d /path/to/kraken2_db -t 8 -l 100
bracken-build -d /path/to/kraken2_db -t 8 -l 250
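Before running `bracken -r N`, it may be useful to check which read lengths a database directory has already been built for. The sketch below assumes `bracken-build` leaves one `database<N>mers.kmer_distrib` file per read length in the database directory; verify that naming against your Bracken version. The helper name `available_read_lengths` is hypothetical, and the demo uses empty placeholder files in a temporary directory rather than a real database.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming pattern for the per-read-length files written by bracken-build
KMER_DISTRIB = re.compile(r"^database(\d+)mers\.kmer_distrib$")


def available_read_lengths(db_dir):
    """Return the sorted read lengths that have a k-mer distribution file in db_dir."""
    lengths = []
    for path in Path(db_dir).iterdir():
        match = KMER_DISTRIB.match(path.name)
        if match:
            lengths.append(int(match.group(1)))
    return sorted(lengths)


# Demo on a temporary directory populated with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    for n in (150, 100):
        (Path(tmp) / f"database{n}mers.kmer_distrib").touch()
    (Path(tmp) / "hash.k2d").touch()  # unrelated file, ignored by the pattern
    print(available_read_lengths(tmp))  # [100, 150]
```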

Output Format

name                    taxonomy_id    taxonomy_lvl    kraken_assigned_reads    added_reads    new_est_reads    fraction_total_reads
Escherichia coli        562           S               5234                     1245           6479             0.52
Staphylococcus aureus   1280          S               2156                     456            2612             0.21
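The read columns are related: in each row, `new_est_reads` is `kraken_assigned_reads` plus `added_reads`. A minimal check on the two example rows above, using only the standard library (the real file is tab-separated, which the sketch assumes):

```python
import csv
import io

# The two example rows from the table above, written as tab-separated text
example = (
    "name\ttaxonomy_id\ttaxonomy_lvl\tkraken_assigned_reads\t"
    "added_reads\tnew_est_reads\tfraction_total_reads\n"
    "Escherichia coli\t562\tS\t5234\t1245\t6479\t0.52\n"
    "Staphylococcus aureus\t1280\tS\t2156\t456\t2612\t0.21\n"
)

rows = list(csv.DictReader(io.StringIO(example), delimiter="\t"))
for row in rows:
    assigned = int(row["kraken_assigned_reads"])
    added = int(row["added_reads"])
    estimated = int(row["new_est_reads"])
    # Reads assigned by Kraken2 plus reads redistributed by Bracken
    assert assigned + added == estimated
    print(f'{row["name"]}: {assigned} + {added} = {estimated}')
```

The same loop can serve as a quick sanity check on a real output file by replacing the `io.StringIO` wrapper with an opened file handle.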

Filter Low-Abundance Taxa

bash

# Require a minimum number of reads for a taxon before abundance estimation
bracken -d db \
    -i report.txt \
    -o bracken.txt \
    -r 150 \
    -l S \
    -t 10                          # Minimum reads threshold
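The `-t` threshold is applied to Kraken2 read counts before re-estimation. A separate, post-hoc filter on the Bracken output table is sometimes also wanted, for example to drop taxa below a minimum estimated read count or read fraction. The sketch below is one way to do that and is not a Bracken feature; the function name `filter_taxa`, the cutoff values, and the three demo rows are illustrative assumptions.

```python
def filter_taxa(rows, min_reads=10, min_fraction=0.0):
    """Keep rows whose estimated reads and read fraction both meet the cutoffs."""
    return [
        row for row in rows
        if int(row["new_est_reads"]) >= min_reads
        and float(row["fraction_total_reads"]) >= min_fraction
    ]


# Made-up rows in the shape csv.DictReader yields for a Bracken output file
rows = [
    {"name": "Taxon A", "new_est_reads": "900", "fraction_total_reads": "0.90"},
    {"name": "Taxon B", "new_est_reads": "95", "fraction_total_reads": "0.095"},
    {"name": "Taxon C", "new_est_reads": "5", "fraction_total_reads": "0.005"},
]

kept = filter_taxa(rows, min_reads=10, min_fraction=0.01)
print([row["name"] for row in kept])  # ['Taxon A', 'Taxon B']
```

Filtering after estimation changes the set of taxa but not their `fraction_total_reads` values, so fractions of the kept rows will no longer sum to 1 unless they are renormalized.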

Combine Multiple Samples

bash

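# Run Bracken on each sample
mkdir -p bracken
for report in kraken_reports/*_kraken_report.txt; do
    # Strip the directory and the report suffix to get the sample name
    sample=$(basename "$report" _kraken_report.txt)
    bracken -d db -i "$report" -o "bracken/${sample}_species.txt" -r 150 -l S
done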

# Combine into abundance matrix
combine_bracken_outputs.py --files bracken/*_species.txt -o combined_abundance.txt

Parse Bracken Output in Python

python

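import pandas as pd

# Bracken output is a tab-separated table with a header row
bracken = pd.read_csv('bracken_output.txt', sep='\t')

# Show the 20 taxa with the most estimated reads
bracken_sorted = bracken.sort_values('new_est_reads', ascending=False)
print(bracken_sorted[['name', 'fraction_total_reads']].head(20))

# Express each taxon as a percentage of all estimated reads in the file
total_reads = bracken['new_est_reads'].sum()
bracken['relative_abundance'] = bracken['new_est_reads'] / total_reads * 100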

Convert to Relative Abundance

python

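import pandas as pd

df = pd.read_csv('bracken_output.txt', sep='\t')

# Percentage of estimated reads per taxon (sums to 100 across the file)
total = df['new_est_reads'].sum()
df['relative_abundance'] = df['new_est_reads'] / total * 100

# Write the table back out with the extra column
df.to_csv('bracken_relative_abundance.txt', sep='\t', index=False)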

Create Abundance Matrix

python

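import pandas as pd
import os

# Collect per-sample Bracken outputs; sort so the column order is reproducible
files = sorted(f for f in os.listdir('bracken') if f.endswith('_species.txt'))

dfs = []
for f in files:
    sample = f.replace('_species.txt', '')
    df = pd.read_csv(os.path.join('bracken', f), sep='\t')
    # Keep the taxon name and its estimated reads, labelled by sample
    df = df[['name', 'new_est_reads']].rename(columns={'new_est_reads': sample})
    dfs.append(df)

# Outer-join on taxon name so taxa missing from a sample are retained
merged = dfs[0]
for df in dfs[1:]:
    merged = merged.merge(df, on='name', how='outer')

# A taxon absent from a sample has zero estimated reads there
merged = merged.fillna(0)
merged.to_csv('abundance_matrix.txt', sep='\t', index=False)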
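Raw read counts in a taxa-by-sample matrix are not comparable across samples with different sequencing depth, so a common next step is to rescale each sample column to percentages. The sketch below assumes a tab-separated matrix whose first column is `name` and whose remaining columns are samples; it uses only the standard library, and the function name `to_relative_abundance` and the demo counts are illustrative.

```python
import csv
import io


def to_relative_abundance(matrix_text):
    """Convert a tab-separated count matrix (first column 'name') to per-sample percentages."""
    rows = list(csv.DictReader(io.StringIO(matrix_text), delimiter="\t"))
    if not rows:
        return []
    samples = [col for col in rows[0] if col != "name"]
    # Column totals: all estimated reads in each sample
    totals = {s: sum(float(r[s]) for r in rows) for s in samples}
    result = []
    for r in rows:
        out = {"name": r["name"]}
        for s in samples:
            # Guard against samples with no reads at all
            out[s] = 100 * float(r[s]) / totals[s] if totals[s] else 0.0
        result.append(out)
    return result


# Made-up counts for two taxa in two samples
demo = "name\tsampleA\tsampleB\nTaxon A\t60\t0\nTaxon B\t40\t25\n"
for row in to_relative_abundance(demo):
    print(row)
```

In practice the `matrix_text` argument could be the contents of a matrix file read with `open(path).read()`.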

Key Parameters

Parameter	Description
-d	Kraken2 database path
-i	Input Kraken2 report
-o	Output abundance file
-w	Output updated report (optional)
-r	Read length used
-l	Taxonomic level
-t	Minimum read threshold

Taxonomic Levels

Level	Code	Description
Domain	D	Bacteria, Archaea (Kraken2 reports label these as rank D)
Kingdom	K	Kingdoms within a domain (e.g. Fungi)
Phylum	P	Major divisions
Class	C	Class level
Order	O	Order level
Family	F	Family level
Genus	G	Genus level
Species	S	Species level

Read Length Options

Pre-built databases typically include: 50, 75, 100, 150, 200, 250, 300 bp

Choose the length closest to your actual read length.
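Picking the closest length can be automated. The sketch below uses the list of lengths given above as its default; the function name `closest_read_length` is hypothetical, and breaking ties toward the shorter length is an arbitrary assumption rather than a Bracken recommendation.

```python
def closest_read_length(actual, available=(50, 75, 100, 150, 200, 250, 300)):
    """Return the available read length nearest to the actual read length."""
    # Sort key: distance first, then the length itself so ties pick the shorter one
    return min(available, key=lambda n: (abs(n - actual), n))


print(closest_read_length(151))  # 150
print(closest_read_length(125))  # 100 (tie between 100 and 150)
```

The returned value is what would be passed to `bracken -r`, provided the database was actually built for that length.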

Related Skills

kraken-classification - Generate Kraken2 report
metaphlan-profiling - Alternative profiling method
metagenome-visualization - Visualize abundances

Maintainer

majiayu000 Core maintainer

Source details

Full Name: majiayu000/claude-skill-registry
Branch: main
Path in repo: skills/data/abundance-estimation
License: MIT License
