Replicates Incorporation Skill

Overview

This skill provides two modes for replicates incorporation:

Refer to the Inputs & Outputs section to check the inputs and build the output directory structure. All output files should be located in the ${proj_dir} returned by Step 0.
Always use the filtered BAM file (*.filtered.bam) if available.
Always ask the user whether to generate pseudo-replicates when more than 2 replicates are provided.
Pre-Peak Calling (BAM Mode): If provided with more than 2 biological replicates, it merges all BAMs into a single merged BAM file for track generation and, only if the user requests it, splits the merged BAM into 2 balanced "pseudo-replicates" for peak calling.
Post-Peak Calling (Peak Mode): If provided with peak files (only two replicates are supported, derived from either 2 true replicates or 2 pseudo-replicates), it performs IDR (Irreproducible Discovery Rate) analysis, filters out non-reproducible peaks, and generates a final "conservative" or "optimal" consensus peak set.
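
The BAM Mode rules above can be sketched in Python as follows. This is only an illustration of the decision logic, not part of the tool: the function names prefer_filtered and plan_bam_mode are hypothetical, and the sketch assumes that a filtered file sits next to its unfiltered counterpart under the name X.filtered.bam.

```python
def prefer_filtered(bam_paths):
    """Keep X.filtered.bam instead of X.bam whenever both are listed.

    Assumption: the filtered file shares the prefix of the raw BAM.
    """
    listed = set(bam_paths)
    chosen = []
    for path in bam_paths:
        if path.endswith(".filtered.bam") or not path.endswith(".bam"):
            chosen.append(path)
            continue
        filtered = path[: -len(".bam")] + ".filtered.bam"
        if filtered not in listed:
            # No filtered version available, fall back to the raw BAM.
            chosen.append(path)
    return chosen


def plan_bam_mode(bam_paths, user_wants_pseudo):
    """Return the BAMs to use and the ordered steps for BAM Mode."""
    bams = prefer_filtered(bam_paths)
    steps = []
    if len(bams) > 2:
        steps.append("merge")
        # Pseudo-replicates are generated only if the user asked for them.
        if user_wants_pseudo:
            steps.append("split_pseudo_replicates")
    return bams, steps
```

With two or fewer replicates the sketch returns no steps, which mirrors the rule that merging and splitting apply only when more than 2 replicates are provided.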

Call:

with:

The tool will:

Create the all_rep_merge directory.
Return the full path of the all_rep_merge directory, which will be used as ${proj_dir}.

Call:

Call: (only when more than two replicates are provided and the user asks for pseudo-replicates)

mcp__bw_tools__split_pseudo_replicates with:
bam_file: ${proj_dir}/temp/${sample}.pooled.bam
output_rep1: ${proj_dir}/temp/${sample}.pseudo1.bam
output_rep2: ${proj_dir}/temp/${sample}.pseudo2.bam
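
To illustrate what a "balanced" split means, here is a minimal Python sketch of one common approach: shuffle the pooled read names with a fixed seed and cut the list in half. How the tool actually splits the pooled BAM is not documented here, so treat this as an assumption. The function name split_read_names is hypothetical, and the sketch works on read names rather than on a real BAM file.

```python
import random


def split_read_names(read_names, seed=0):
    """Randomly assign pooled read names to two equally sized groups.

    Splitting by read name (an assumption of this sketch) keeps both
    mates of a paired-end read in the same pseudo-replicate.
    """
    unique = sorted(set(read_names))  # sort first so the seed is reproducible
    rng = random.Random(seed)
    rng.shuffle(unique)
    half = len(unique) // 2
    return set(unique[:half]), set(unique[half:])
```

A fixed seed makes the split reproducible across runs, which helps when the pseudo-replicate peak calls need to be regenerated.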

A. Narrow Peaks / ATAC (IDR) Use this to combine reproducible peaks. You should ideally run IDR on:

Call:

B. Broad Peaks (Consensus) Call:

mcp__bw_tools__merge_consensus_peaks with:
peak_file_a: Path to Replicate 1 broadPeak file.
peak_file_b: Path to Replicate 2 broadPeak file.
output_peak: ${proj_dir}/peaks/${sample}.consensus.broadPeaks
overlap_fraction: 0.5
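
The exact meaning of overlap_fraction is not spelled out here. The Python sketch below shows one plausible reading, assuming a reciprocal overlap (the shared region must cover at least that fraction of both peaks, similar to bedtools intersect -f 0.5 -r) and assuming the consensus peak spans the union of the two matched peaks. The function names and the (chrom, start, end) tuple layout are hypothetical.

```python
def reciprocal_overlap(a, b, fraction=0.5):
    """True if the overlap covers at least `fraction` of both peaks."""
    chrom_a, start_a, end_a = a
    chrom_b, start_b, end_b = b
    if chrom_a != chrom_b:
        return False
    overlap = min(end_a, end_b) - max(start_a, start_b)
    if overlap <= 0:
        return False
    return (overlap >= fraction * (end_a - start_a)
            and overlap >= fraction * (end_b - start_b))


def consensus_peaks(peaks_a, peaks_b, fraction=0.5):
    """Merge every pair of peaks that passes the reciprocal overlap test."""
    merged = set()
    for a in peaks_a:
        for b in peaks_b:
            if reciprocal_overlap(a, b, fraction):
                # Assumption: report the union span of the matched pair.
                merged.add((a[0], min(a[1], b[1]), max(a[2], b[2])))
    return sorted(merged)
```

The nested loop is quadratic and only meant to make the overlap rule explicit; real broadPeak files would normally be processed with sorted intervals or an interval tree.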