Agent skill

popeve-variant-predictor-agent

Stars 163
Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/popeve-variant-predictor-agent

SKILL.md

---name: popeve-variant-predictor-agent description: AI-powered genetic variant pathogenicity prediction using PopEVE deep learning model for population-aware disease variant identification and rare disease diagnosis. license: MIT metadata: author: AI Group version: "1.0.0" created: "2026-01-20" compatibility:

  • system: Python 3.10+ allowed-tools:
  • run_shell_command
  • read_file
  • write_file

keywords:

  • popeve-variant-predictor-agent
  • automation
  • biomedical measurable_outcome: execute task with >95% success rate. ---"

PopEVE Variant Predictor Agent

The PopEVE Variant Predictor Agent leverages the PopEVE deep learning model from Harvard Medical School to predict pathogenicity of genetic variants. PopEVE analyzes evolutionary conservation, protein structure, and population frequency to identify disease-causing variants, having identified over 100 previously unrecognized variants responsible for undiagnosed rare genetic diseases.

When to Use This Skill

  • When predicting pathogenicity of missense variants genome-wide.
  • For rare disease diagnosis with variants of uncertain significance (VUS).
  • To prioritize candidate variants in exome/genome sequencing.
  • When interpreting novel variants not in ClinVar or literature.
  • For population-stratified variant interpretation.

Core Capabilities

  1. Pathogenicity Prediction: Score any missense variant for disease likelihood.

  2. VUS Resolution: Reclassify variants of uncertain significance.

  3. Rare Disease Diagnosis: Identify causal variants in undiagnosed patients.

  4. Population-Aware Scoring: Account for ancestry-specific variant frequencies.

  5. Protein Context Analysis: Integrate structural and functional domains.

  6. Batch Variant Scoring: Process thousands of variants efficiently.

Model Architecture

Component Description Data Source
Evolutionary Module Deep sequence alignment UniRef90, 250M seqs
Structural Module AlphaFold2 structures 200M+ structures
Population Module gnomAD frequencies 800K+ individuals
Clinical Module ClinVar training 100K+ classifications
Integration Multi-task neural network Combined features

Scoring Thresholds

PopEVE Score Interpretation Suggested Action
> 0.9 Likely Pathogenic High priority
0.7 - 0.9 Possibly Pathogenic Review carefully
0.3 - 0.7 Uncertain Additional evidence needed
0.1 - 0.3 Possibly Benign Lower priority
< 0.1 Likely Benign Deprioritize

Workflow

  1. Input: VCF file, gene list, or individual variants.

  2. Annotation: Map variants to transcripts and proteins.

  3. Feature Extraction: Compute evolutionary, structural, population features.

  4. Prediction: Run PopEVE model for pathogenicity scores.

  5. Population Adjustment: Apply ancestry-specific calibration.

  6. Ranking: Prioritize variants by score and gene relevance.

  7. Output: Scored variants with interpretations.

Example Usage

User: "Score all missense variants from this rare disease patient's exome to identify potential causal variants."

Agent Action:

bash
python3 Skills/Genomics/PopEVE_Variant_Predictor_Agent/popeve_predict.py \
    --vcf patient_exome.vcf \
    --genome GRCh38 \
    --ancestry EUR \
    --gene_panel rare_disease_genes.txt \
    --min_score 0.5 \
    --output pathogenicity_scores.tsv

Input Formats

Format Description Example
VCF Standard variant calls patient.vcf.gz
TSV Simple variant list chr, pos, ref, alt
HGVS Protein notation NP_000546.1:p.Arg248Gln
Gene + Position Gene-centric TP53:R248Q

Output Components

Column Description
Variant Genomic/protein notation
PopEVE_Score 0-1 pathogenicity score
Classification Benign/VUS/Pathogenic
Confidence Prediction confidence
EVE_Score Evolutionary component
Structure_Score Structural impact
Population_AF Population frequency
Gene Affected gene
Domain Protein domain affected
ClinVar Existing classification if any

Comparison with Other Tools

Tool PopEVE Advantage
SIFT/PolyPhen More accurate, deep learning
CADD Population-aware, less bias
REVEL Better rare variant handling
AlphaMissense Complimentary; can ensemble
ClinVar Scores novel variants

AI/ML Components

Deep Learning Architecture:

  • Transformer for sequence context
  • Graph neural network for structure
  • Variational autoencoder for evolution
  • Gradient boosting for integration

Training Strategy:

  • Semi-supervised with ClinVar labels
  • Evolutionary likelihood (unsupervised)
  • Population frequency calibration
  • Cross-validation across genes

Population Modeling:

  • Ancestry-specific allele frequencies
  • Selection coefficient estimation
  • Demographic history modeling

Performance Metrics

Metric PopEVE AlphaMissense REVEL
AUROC (ClinVar) 0.95 0.94 0.92
AUROC (DMS) 0.89 0.90 0.85
VUS Resolution 45% 40% 35%
Cross-ancestry 0.93 0.91 0.88

Prerequisites

  • Python 3.10+
  • PyTorch, transformers
  • PopEVE model weights
  • Reference genome (GRCh37/38)
  • VEP or similar annotator
  • gnomAD database access

Related Skills

  • AlphaMissense_Agent - For AlphaMissense predictions
  • DiagAI_Agent - For clinical diagnosis support
  • ACMG_Classifier_Agent - For ACMG classification
  • Pharmacogenomics_Agent - For drug-gene variants

Disease Categories

Category Example Genes PopEVE Performance
Cardiomyopathy MYH7, MYBPC3 Excellent
Neurological SCN1A, KCNQ2 Excellent
Cancer Predisposition BRCA1, TP53 Good-Excellent
Metabolic PAH, CFTR Good
Immunodeficiency BTK, WAS Good

Special Considerations

  1. Gene Coverage: Best for well-conserved genes with orthologs
  2. Protein-Coding Only: Missense variants in coding regions
  3. Novel Genes: Lower confidence for poorly characterized genes
  4. Ancestry Calibration: Use appropriate population reference
  5. Ensemble Approach: Combine with AlphaMissense for best results

Clinical Integration

Step Action
1 Run PopEVE on all coding variants
2 Filter by phenotype-relevant genes
3 Rank by PopEVE score
4 Review top candidates
5 Apply ACMG criteria with PopEVE as evidence
6 Validate with functional studies if available

Author

AI Group - Biomedical AI Platform

Didn't find tool you were looking for?

Be as detailed as possible for better results