Agent skill
bio-microbiome-amplicon-processing
Amplicon sequence variant (ASV) inference from 16S rRNA or ITS amplicon sequencing using DADA2. Covers quality filtering, error learning, denoising, and chimera removal. Use when processing demultiplexed amplicon FASTQ files to generate an ASV table for downstream analysis.
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/amplicon-processing
SKILL.md
Amplicon Processing with DADA2
Complete DADA2 Workflow
library(dada2)
path <- 'raw_reads'
fnFs <- sort(list.files(path, pattern = '_R1_001.fastq.gz', full.names = TRUE))
fnRs <- sort(list.files(path, pattern = '_R2_001.fastq.gz', full.names = TRUE))
sample_names <- sapply(strsplit(basename(fnFs), '_'), `[`, 1)
# Quality profiles
plotQualityProfile(fnFs[1:2])
plotQualityProfile(fnRs[1:2])
Quality Filtering and Trimming
filtFs <- file.path('filtered', paste0(sample_names, '_F_filt.fastq.gz'))
filtRs <- file.path('filtered', paste0(sample_names, '_R_filt.fastq.gz'))
names(filtFs) <- sample_names
names(filtRs) <- sample_names
# Filter parameters depend on amplicon region and read length
out <- filterAndTrim(fnFs, filtFs, fnRs, filtRs,
truncLen = c(240, 160), # Trim to quality scores
maxN = 0, # No ambiguous bases
maxEE = c(2, 2), # Max expected errors
truncQ = 2, # Truncate at first Q <= 2
rm.phix = TRUE, # Remove PhiX
compress = TRUE,
multithread = TRUE)
Error Rate Learning
errF <- learnErrors(filtFs, multithread = TRUE)
errR <- learnErrors(filtRs, multithread = TRUE)
# Visualize error rates
plotErrors(errF, nominalQ = TRUE)
Sample Inference (Denoising)
dadaFs <- dada(filtFs, err = errF, multithread = TRUE)
dadaRs <- dada(filtRs, err = errR, multithread = TRUE)
# Check results
dadaFs[[1]]
Merge Paired Reads
mergers <- mergePairs(dadaFs, filtFs, dadaRs, filtRs, verbose = TRUE)
# Check merge success
head(mergers[[1]])
Construct Sequence Table
seqtab <- makeSequenceTable(mergers)
dim(seqtab)
# Check length distribution
table(nchar(getSequences(seqtab)))
Remove Chimeras
seqtab_nochim <- removeBimeraDenovo(seqtab, method = 'consensus',
multithread = TRUE, verbose = TRUE)
# Percentage retained
sum(seqtab_nochim) / sum(seqtab)
Track Reads Through Pipeline
getN <- function(x) sum(getUniques(x))
track <- cbind(out, sapply(dadaFs, getN), sapply(dadaRs, getN),
sapply(mergers, getN), rowSums(seqtab_nochim))
colnames(track) <- c('input', 'filtered', 'denoisedF', 'denoisedR', 'merged', 'nonchim')
rownames(track) <- sample_names
track
ITS-Specific Processing
# For ITS, use cutadapt to remove primers first (variable length amplicons)
# Then skip truncLen (don't truncate ITS to fixed length)
out_its <- filterAndTrim(fnFs, filtFs, fnRs, filtRs,
maxN = 0, maxEE = c(2, 2), truncQ = 2,
minLen = 50, # Minimum length
rm.phix = TRUE, compress = TRUE, multithread = TRUE)
Related Skills
- taxonomy-assignment - Assign taxonomy to ASVs
- read-qc/quality-reports - Pre-DADA2 quality assessment
- diversity-analysis - Analyze ASV table
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
agent-ops-spec
Manage specification documents in .agent/specs/. Use when user provides requirements, acceptance criteria, or feature descriptions that need to be tracked and validated against implementation.
agent-ops-state
Maintain .agent state files. Use at session start, after meaningful steps, and before concluding: read/update constitution/memory/focus/issues/baseline consistently.
agent-ops-spec
Manage specification documents in .agent/specs/. Use when user provides requirements, acceptance criteria, or feature descriptions that need to be tracked and validated against implementation.
agent-ops-testing
Test strategy, execution, and coverage analysis. Use when designing tests, running test suites, or analyzing test results beyond baseline checks.
agent-ops-testing
Test strategy, execution, and coverage analysis. Use when designing tests, running test suites, or analyzing test results beyond baseline checks.
agent-ops-state
Maintain .agent state files. Use at session start, after meaningful steps, and before concluding: read/update constitution/memory/focus/issues/baseline consistently.
Didn't find tool you were looking for?