Agent skill

bio-variant-calling-clinical-interpretation

Clinical variant interpretation using ClinVar, ACMG guidelines, and pathogenicity predictors. Prioritize variants for diagnostic and research applications. Use when interpreting clinical significance of variants.

Stars 163
Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/clinical-interpretation

SKILL.md

Clinical Variant Interpretation

Prioritize and interpret variants for clinical significance using databases and ACMG/AMP guidelines.

Interpretation Framework

Annotated VCF
    │
    ├── Database Lookup
    │   ├── ClinVar (clinical assertions)
    │   ├── OMIM (disease associations)
    │   └── gnomAD (population frequency)
    │
    ├── Computational Predictions
    │   ├── SIFT, PolyPhen-2
    │   ├── CADD, REVEL
    │   └── SpliceAI
    │
    ├── ACMG Classification
    │   └── Pathogenic → Likely Pathogenic → VUS → Likely Benign → Benign
    │
    └── Prioritized Variant List

ClinVar Annotation

Download ClinVar

bash
wget https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz
wget https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz.tbi

Annotate with bcftools

bash
bcftools annotate \
    -a clinvar.vcf.gz \
    -c INFO/CLNSIG,INFO/CLNDN,INFO/CLNREVSTAT \
    input.vcf.gz -Oz -o with_clinvar.vcf.gz

Filter Pathogenic Variants

bash
# Pathogenic or Likely pathogenic
bcftools view -i 'INFO/CLNSIG~"Pathogenic" || INFO/CLNSIG~"Likely_pathogenic"' \
    with_clinvar.vcf.gz -Oz -o pathogenic.vcf.gz

# Exclude benign
bcftools view -e 'INFO/CLNSIG~"Benign" || INFO/CLNSIG~"Likely_benign"' \
    with_clinvar.vcf.gz -Oz -o not_benign.vcf.gz

ClinVar Significance Levels

CLNSIG Meaning Action
Pathogenic Disease-causing Report
Likely_pathogenic Probably disease-causing Report with caveat
Uncertain_significance VUS May report, needs follow-up
Likely_benign Probably not disease-causing Usually exclude
Benign Not disease-causing Exclude
Conflicting Multiple interpretations Manual review

ClinVar Review Status

CLNREVSTAT Stars Meaning
practice_guideline 4 Expert panel reviewed
reviewed_by_expert_panel 3 ClinGen expert reviewed
criteria_provided,_multiple_submitters 2 Consistent assertions
criteria_provided,_single_submitter 1 One submitter with criteria
no_assertion_criteria 0 No criteria provided
bash
# Filter for high-confidence assertions (2+ stars)
bcftools view -i 'INFO/CLNREVSTAT~"multiple_submitters" || \
    INFO/CLNREVSTAT~"expert_panel" || \
    INFO/CLNREVSTAT~"practice_guideline"' \
    with_clinvar.vcf.gz -Oz -o high_confidence.vcf.gz

InterVar (ACMG Classification)

Automated ACMG/AMP variant classification.

Installation

bash
git clone https://github.com/WGLab/InterVar.git
cd InterVar
# Download databases per documentation

Run InterVar

bash
python Intervar.py \
    -i input.avinput \
    -o output \
    -b hg38 \
    -d humandb/ \
    --input_type=AVinput

From VCF

bash
# Convert VCF to ANNOVAR format
convert2annovar.pl -format vcf4 input.vcf > input.avinput

# Run InterVar
python Intervar.py -i input.avinput -o intervar_results -b hg38

ACMG/AMP Criteria

Pathogenic Criteria

Code Type Description
PVS1 Very Strong Null variant in gene where LOF is disease mechanism
PS1-4 Strong Same AA change, functional studies, etc.
PM1-6 Moderate Hot spot, absent from controls, etc.
PP1-5 Supporting Co-segregation, computational evidence

Benign Criteria

Code Type Description
BA1 Stand-alone AF >5% in gnomAD
BS1-4 Strong AF greater than expected, functional studies
BP1-7 Supporting Missense in gene with truncating mechanism

Population Frequency Filtering

bash
# Rare variants only (gnomAD AF < 0.01)
bcftools view -i 'INFO/gnomAD_AF<0.01 || INFO/gnomAD_AF="."' \
    input.vcf.gz -Oz -o rare.vcf.gz

# Ultra-rare for dominant diseases (AF < 0.0001)
bcftools view -i 'INFO/gnomAD_AF<0.0001 || INFO/gnomAD_AF="."' \
    input.vcf.gz -Oz -o ultrarare.vcf.gz

Pathogenicity Score Filtering

CADD Scores

bash
# CADD > 20 (top 1% deleterious)
bcftools view -i 'INFO/CADD_PHRED>20' input.vcf.gz -Oz -o cadd_filtered.vcf.gz

# CADD > 30 (top 0.1%)
bcftools view -i 'INFO/CADD_PHRED>30' input.vcf.gz -Oz -o highly_deleterious.vcf.gz

REVEL Scores

bash
# REVEL > 0.5 (likely pathogenic)
bcftools view -i 'INFO/REVEL>0.5' input.vcf.gz -Oz -o revel_filtered.vcf.gz

Combined Filtering

bash
bcftools view -i '(INFO/CADD_PHRED>20 || INFO/REVEL>0.5) && \
    (INFO/CLNSIG~"Pathogenic" || INFO/CLNSIG~"Likely" || INFO/CLNSIG=".")' \
    input.vcf.gz -Oz -o prioritized.vcf.gz

Python: Clinical Prioritization

python
from cyvcf2 import VCF, Writer

def classify_variant(variant):
    clnsig = variant.INFO.get('CLNSIG', '')
    af = variant.INFO.get('gnomAD_AF', 0) or 0
    cadd = variant.INFO.get('CADD_PHRED', 0) or 0
    revel = variant.INFO.get('REVEL', 0) or 0

    # Known pathogenic
    if 'Pathogenic' in str(clnsig):
        return 'PATHOGENIC'
    if 'Likely_pathogenic' in str(clnsig):
        return 'LIKELY_PATHOGENIC'

    # Known benign
    if 'Benign' in str(clnsig) or af > 0.05:
        return 'BENIGN'

    # Computational prediction
    if cadd > 25 or revel > 0.7:
        if af < 0.0001:
            return 'LIKELY_PATHOGENIC'
        elif af < 0.01:
            return 'VUS_FAVOR_PATH'

    if cadd < 10 and revel < 0.3:
        return 'LIKELY_BENIGN'

    return 'VUS'

vcf = VCF('annotated.vcf.gz')
results = []

for variant in vcf:
    classification = classify_variant(variant)
    if classification in ('PATHOGENIC', 'LIKELY_PATHOGENIC', 'VUS_FAVOR_PATH'):
        gene = variant.INFO.get('SYMBOL', 'Unknown')
        consequence = variant.INFO.get('Consequence', 'Unknown')
        results.append({
            'chrom': variant.CHROM,
            'pos': variant.POS,
            'ref': variant.REF,
            'alt': variant.ALT[0],
            'gene': gene,
            'consequence': consequence,
            'classification': classification,
            'clnsig': variant.INFO.get('CLNSIG', '.'),
            'cadd': variant.INFO.get('CADD_PHRED', '.'),
            'af': variant.INFO.get('gnomAD_AF', '.')
        })

# Output prioritized variants
for r in results:
    print(f"{r['gene']}\t{r['chrom']}:{r['pos']}\t{r['consequence']}\t{r['classification']}")

Gene Panel Filtering

bash
# Filter to gene panel
bcftools view -R gene_panel.bed input.vcf.gz -Oz -o panel_variants.vcf.gz

# Or by gene symbol (requires VEP annotation)
bcftools view -i 'INFO/CSQ~"BRCA1" || INFO/CSQ~"BRCA2"' \
    input.vcf.gz -Oz -o brca_variants.vcf.gz

Disease-Specific Resources

Resource Content Use
ClinVar Clinical assertions Primary lookup
OMIM Gene-disease relationships Gene prioritization
HGMD Published mutations Literature evidence
gnomAD Population frequencies Rarity filtering
ClinGen Gene validity/dosage LOF interpretation

Reporting Template

bash
bcftools query -f '%CHROM\t%POS\t%REF\t%ALT\t%INFO/SYMBOL\t%INFO/Consequence\t\
%INFO/CLNSIG\t%INFO/CLNDN\t%INFO/gnomAD_AF\t%INFO/CADD_PHRED\n' \
    prioritized.vcf.gz > clinical_report.tsv

Complete Workflow

bash
#!/bin/bash
set -euo pipefail

INPUT=$1
CLINVAR=$2
OUTPUT_PREFIX=$3

echo "=== Add ClinVar annotations ==="
bcftools annotate -a $CLINVAR \
    -c INFO/CLNSIG,INFO/CLNDN,INFO/CLNREVSTAT,INFO/CLNVC \
    $INPUT -Oz -o ${OUTPUT_PREFIX}_clinvar.vcf.gz

echo "=== Filter rare variants ==="
bcftools view -i 'INFO/gnomAD_AF<0.01 || INFO/gnomAD_AF="."' \
    ${OUTPUT_PREFIX}_clinvar.vcf.gz -Oz -o ${OUTPUT_PREFIX}_rare.vcf.gz

echo "=== Extract pathogenic/likely pathogenic ==="
bcftools view -i 'INFO/CLNSIG~"athogenic"' \
    ${OUTPUT_PREFIX}_rare.vcf.gz -Oz -o ${OUTPUT_PREFIX}_pathogenic.vcf.gz

echo "=== Extract high-impact VUS ==="
bcftools view -i 'INFO/CLNSIG~"Uncertain" && INFO/CADD_PHRED>20' \
    ${OUTPUT_PREFIX}_rare.vcf.gz -Oz -o ${OUTPUT_PREFIX}_vus_review.vcf.gz

echo "=== Generate report ==="
bcftools query -H -f '%CHROM\t%POS\t%REF\t%ALT\t%INFO/SYMBOL\t%INFO/Consequence\t\
%INFO/CLNSIG\t%INFO/CLNDN\t%INFO/gnomAD_AF\t%INFO/CADD_PHRED\n' \
    ${OUTPUT_PREFIX}_pathogenic.vcf.gz > ${OUTPUT_PREFIX}_report.tsv

echo "=== Complete ==="
echo "Pathogenic: ${OUTPUT_PREFIX}_pathogenic.vcf.gz"
echo "VUS for review: ${OUTPUT_PREFIX}_vus_review.vcf.gz"
echo "Report: ${OUTPUT_PREFIX}_report.tsv"

Related Skills

  • variant-calling/variant-annotation - VEP/SnpEff annotation
  • variant-calling/filtering-best-practices - Quality filtering
  • database-access/entrez-fetch - Download ClinVar/OMIM data
  • pathway-analysis/go-enrichment - Gene set analysis

Didn't find tool you were looking for?

Be as detailed as possible for better results