Agent skill

bio-read-qc-adapter-trimming

Remove sequencing adapters from FASTQ files using Cutadapt and Trimmomatic. Supports single-end and paired-end reads, Illumina TruSeq, Nextera, and custom adapter sequences. Use when FastQC shows adapter contamination or before alignment of short reads.

Stars 163
Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/adapter-trimming

SKILL.md

Adapter Trimming

Remove sequencing adapters from reads using Cutadapt (precise, flexible) or Trimmomatic (paired-end optimized).

Common Adapter Sequences

Platform/Kit Adapter Sequence
Illumina TruSeq Read 1 3' AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
Illumina TruSeq Read 2 3' AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
Nextera Transposase CTGTCTCTTATACACATCT
Small RNA 3' adapter TGGAATTCTCGGGTGCCAAGG
Poly-A Poly-A tail AAAAAAAAAAAAAAAA

Cutadapt

Single-End Reads

bash
# 3' adapter (most common)
cutadapt -a AGATCGGAAGAGC -o trimmed.fastq.gz sample.fastq.gz

# 5' adapter
cutadapt -g ACGTACGT -o trimmed.fastq.gz sample.fastq.gz

# Both ends
cutadapt -a ADAPTER1 -g ADAPTER2 -o trimmed.fastq.gz sample.fastq.gz

# Multiple adapters (tries each)
cutadapt -a ADAPTER1 -a ADAPTER2 -a ADAPTER3 -o trimmed.fastq.gz sample.fastq.gz

Paired-End Reads

bash
# Basic paired-end
cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCA \
         -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT \
         -o trimmed_R1.fastq.gz -p trimmed_R2.fastq.gz \
         sample_R1.fastq.gz sample_R2.fastq.gz

# Short form for Illumina TruSeq (auto-detect)
cutadapt -a AGATCGGAAGAGC -A AGATCGGAAGAGC \
         -o trimmed_R1.fastq.gz -p trimmed_R2.fastq.gz \
         sample_R1.fastq.gz sample_R2.fastq.gz

Adapter Options

bash
# Error rate (default 0.1 = 10% mismatches allowed)
cutadapt -a ADAPTER -e 0.15 -o out.fq in.fq

# Minimum overlap (default 3)
cutadapt -a ADAPTER -O 5 -o out.fq in.fq

# No indels in adapter alignment
cutadapt -a ADAPTER --no-indels -o out.fq in.fq

# Trim Ns from ends
cutadapt --trim-n -o out.fq in.fq

# Anchored adapters (must be at end)
cutadapt -a ADAPTER$ -o out.fq in.fq

Linked Adapters

bash
# 5' adapter followed by 3' adapter (same read)
cutadapt -a ADAPTER1...ADAPTER2 -o out.fq in.fq

# Anchored 5' linked to 3'
cutadapt -a ^ADAPTER1...ADAPTER2 -o out.fq in.fq

Filtering After Trimming

bash
# Minimum length (discard shorter)
cutadapt -a ADAPTER -m 20 -o out.fq in.fq

# Maximum length
cutadapt -a ADAPTER -M 150 -o out.fq in.fq

# Maximum N content
cutadapt -a ADAPTER --max-n 0.1 -o out.fq in.fq

# Discard trimmed reads
cutadapt -a ADAPTER --discard-trimmed -o out.fq in.fq

# Discard untrimmed reads
cutadapt -a ADAPTER --discard-untrimmed -o out.fq in.fq

Paired-End Filtering

bash
# Both reads must pass minimum length
cutadapt -a ADAPT1 -A ADAPT2 -m 20 \
         -o R1.fq -p R2.fq in_R1.fq in_R2.fq

# Output too-short reads separately
cutadapt -a ADAPT1 -A ADAPT2 -m 20 \
         --too-short-output short_R1.fq --too-short-paired-output short_R2.fq \
         -o R1.fq -p R2.fq in_R1.fq in_R2.fq

Action Options

bash
# Mask adapter instead of trim (replace with N)
cutadapt -a ADAPTER --action=mask -o out.fq in.fq

# Retain adapter but lowercase
cutadapt -a ADAPTER --action=lowercase -o out.fq in.fq

# Just find adapters, don't modify
cutadapt -a ADAPTER --action=none -o out.fq in.fq

Trimmomatic

Single-End Mode

bash
trimmomatic SE -phred33 \
    input.fastq.gz output.fastq.gz \
    ILLUMINACLIP:adapters.fa:2:30:10

Paired-End Mode

bash
trimmomatic PE -phred33 -threads 4 \
    input_R1.fastq.gz input_R2.fastq.gz \
    output_R1_paired.fastq.gz output_R1_unpaired.fastq.gz \
    output_R2_paired.fastq.gz output_R2_unpaired.fastq.gz \
    ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10

ILLUMINACLIP Parameters

bash
ILLUMINACLIP:<fastaWithAdapters>:<seed>:<palindrome>:<simple>

# Parameters:
# seed - max mismatches in 16bp seed (usually 2)
# palindrome - threshold for palindrome match (usually 30)
# simple - threshold for simple match (usually 10)

# Example with all options
ILLUMINACLIP:adapters.fa:2:30:10:2:keepBothReads

Built-in Adapter Files

Trimmomatic includes adapter files:

  • TruSeq2-SE.fa - TruSeq v2 single-end
  • TruSeq2-PE.fa - TruSeq v2 paired-end
  • TruSeq3-SE.fa - TruSeq v3 single-end
  • TruSeq3-PE.fa - TruSeq v3 paired-end
  • TruSeq3-PE-2.fa - TruSeq v3 PE (palindrome mode)
  • NexteraPE-PE.fa - Nextera paired-end

Find Trimmomatic Adapters

bash
# Find adapter directory
TRIMMOMATIC_JAR=$(which trimmomatic | xargs dirname)/../share/trimmomatic-*/adapters/

# Or with conda
ls $CONDA_PREFIX/share/trimmomatic-*/adapters/

Performance

bash
# Cutadapt with multiple cores
cutadapt -j 8 -a ADAPTER -o out.fq in.fq

# Trimmomatic threads
trimmomatic PE -threads 8 ...

Verify Trimming

bash
# Check adapter removal with FastQC
fastqc trimmed.fastq.gz

# Count reads before/after
zcat input.fastq.gz | wc -l
zcat trimmed.fastq.gz | wc -l

Related Skills

  • quality-reports - Check adapter content with FastQC
  • quality-filtering - Quality trimming after adapter removal
  • fastp-workflow - Combined adapter and quality trimming

Expand your agent's capabilities with these related and highly-rated skills.

Didn't find tool you were looking for?

Be as detailed as possible for better results