scvi-tools

This skill should be used when working with single-cell omics data analysis using scvi-tools, including scRNA-seq, scATAC-seq, CITE-seq, spatial transcriptomics, and other single-cell modalities. Use this skill for probabilistic modeling, batch correction, dimensionality reduction, differential expression, cell type annotation, multimodal integration, and spatial analysis tasks.

Overview

scvi-tools is a comprehensive Python framework for probabilistic models in single-cell genomics. Built on PyTorch and PyTorch Lightning, it provides deep generative models using variational inference for analyzing diverse single-cell data modalities.

When to Use This Skill

Use this skill when:

Analyzing single-cell RNA-seq data (dimensionality reduction, batch correction, integration)
Working with single-cell ATAC-seq or chromatin accessibility data
Integrating multimodal data (CITE-seq, multiome, paired/unpaired datasets)
Analyzing spatial transcriptomics data (deconvolution, spatial mapping)
Performing differential expression analysis on single-cell data
Conducting cell type annotation or transfer learning tasks
Working with specialized single-cell modalities (methylation, cytometry, RNA velocity)
Building custom probabilistic models for single-cell analysis

Core Capabilities

scvi-tools provides models organized by data modality:

1. Single-Cell RNA-seq Analysis

Core models for expression analysis, batch correction, and integration. See references/models-scrna-seq.md for:

scVI: Unsupervised dimensionality reduction and batch correction
scANVI: Semi-supervised cell type annotation and integration
AUTOZI: Zero-inflation detection and modeling
VeloVI: RNA velocity analysis
contrastiveVI: Perturbation effect isolation

2. Chromatin Accessibility (ATAC-seq)

Models for analyzing single-cell chromatin data. See references/models-atac-seq.md for:

PeakVI: Peak-based ATAC-seq analysis and integration
PoissonVI: Quantitative fragment count modeling
scBasset: Deep learning approach with motif analysis

3. Multimodal & Multi-omics Integration

Joint analysis of multiple data types. See references/models-multimodal.md for:

totalVI: CITE-seq protein and RNA joint modeling
MultiVI: Paired and unpaired multi-omic integration
MrVI: Multi-resolution cross-sample analysis

4. Spatial Transcriptomics

Spatially-resolved transcriptomics analysis. See references/models-spatial.md for:

DestVI: Multi-resolution spatial deconvolution
Stereoscope: Cell type deconvolution
Tangram: Spatial mapping and integration
scVIVA: Cell-environment relationship analysis

5. Specialized Modalities

Additional specialized analysis tools. See references/models-specialized.md for:

MethylVI/MethylANVI: Single-cell methylation analysis
CytoVI: Flow/mass cytometry batch correction
Solo: Doublet detection
CellAssign: Marker-based cell type annotation

Typical Workflow

All scvi-tools models follow a consistent API pattern:

python

# 1. Load and preprocess data (AnnData format)
import scvi
import scanpy as sc

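adata = scvi.data.heart_cell_atlas_subsampled()
sc.pp.filter_genes(adata, min_counts=3)
adata.layers["counts"] = adata.X.copy()  # Keep raw counts in a dedicated layer

# The seurat_v3 flavor selects genes from raw counts (requires scikit-misc)
sc.pp.highly_variable_genes(
    adata, n_top_genes=1200, flavor="seurat_v3", layer="counts", subset=True
)

# 2. Register data with model (specify layers, covariates)
scvi.model.SCVI.setup_anndata(
    adata,
    layer="counts",  # Use raw counts, not log-normalized
    batch_key="cell_source",  # Batch column in this dataset; use your own obs column
    categorical_covariate_keys=["donor"],
    continuous_covariate_keys=["percent_mito"]
)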

# 3. Create and train model
model = scvi.model.SCVI(adata)
model.train()

# 4. Extract latent representations and normalized values
latent = model.get_latent_representation()
normalized = model.get_normalized_expression(library_size=1e4)

# 5. Store in AnnData for downstream analysis
adata.obsm["X_scVI"] = latent
adata.layers["scvi_normalized"] = normalized

# 6. Downstream analysis with scanpy
sc.pp.neighbors(adata, use_rep="X_scVI")
sc.tl.umap(adata)
sc.tl.leiden(adata)

Key Design Principles:

Raw counts required: Models expect unnormalized count data for optimal performance
Unified API: Consistent interface across all models (setup → train → extract)
AnnData-centric: Seamless integration with the scanpy ecosystem
GPU acceleration: Automatic utilization of available GPUs
Batch correction: Handle technical variation through covariate registration
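Because the models expect raw counts, a quick sanity check before calling setup_anndata can catch a log-normalized matrix early. The following is a minimal sketch, not part of scvi-tools: the helper name looks_like_raw_counts is hypothetical, and it only tests that the values are non-negative integers, which is a heuristic rather than proof.

```python
from typing import Iterable


def looks_like_raw_counts(values: Iterable[float]) -> bool:
    """Heuristic: True if every value is a non-negative whole number."""
    for v in values:
        v = float(v)
        if v < 0 or not v.is_integer():
            return False
    return True


# Toy values standing in for a flattened sample of an expression matrix
raw = [0, 3, 1, 0, 12]
log_normalized = [0.0, 1.386, 0.693, 0.0, 2.565]

print(looks_like_raw_counts(raw))             # True
print(looks_like_raw_counts(log_normalized))  # False
```

With real data, one option is to pass a flattened sample of the matrix rather than every entry, since scanning a large matrix element by element in pure Python is slow.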

Common Analysis Tasks

Differential Expression

Probabilistic DE analysis using the learned generative models:

python

de_results = model.differential_expression(
    groupby="cell_type",
    group1="TypeA",
    group2="TypeB",
    mode="change",  # Use composite hypothesis testing
    delta=0.25      # Minimum effect size threshold
)

See references/differential-expression.md for detailed methodology and interpretation.
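To give some intuition for mode="change" and delta: the test asks, roughly, how much posterior mass lies on log fold changes at least delta in magnitude. The sketch below illustrates that idea with simulated samples in plain Python. It is not the scvi-tools implementation; the function name prob_de and the sample values are hypothetical.

```python
import random


def prob_de(lfc_samples, delta=0.25):
    """Fraction of posterior log-fold-change samples with |LFC| >= delta."""
    hits = sum(1 for lfc in lfc_samples if abs(lfc) >= delta)
    return hits / len(lfc_samples)


rng = random.Random(0)

# Hypothetical posterior LFC samples for two genes (not real model output)
gene_shifted = [rng.gauss(1.0, 0.2) for _ in range(5000)]  # clear effect
gene_null = [rng.gauss(0.0, 0.05) for _ in range(5000)]    # no effect

print("shifted gene:", prob_de(gene_shifted))
print("null gene:", prob_de(gene_null))
```

A larger delta demands a bigger effect before a gene counts as differentially expressed, so it trades sensitivity for fewer marginal calls.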

Model Persistence

Save and load trained models:

python

# Save model
model.save("./model_directory", overwrite=True)

# Load model
model = scvi.model.SCVI.load("./model_directory", adata=adata)
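A common pattern built on save and load is to reuse a saved model when one exists and train only otherwise. The helper below is a sketch under stated assumptions: load_or_train is a hypothetical name, the callables stand in for the real scvi-tools calls so the example runs without the library, and it assumes the save directory contains a file named model.pt (the case when no file prefix is used).

```python
import os
import tempfile


def load_or_train(model_dir, load_fn, train_fn):
    """Return load_fn(model_dir) if a saved model exists, else train_fn(model_dir)."""
    if os.path.isfile(os.path.join(model_dir, "model.pt")):
        return load_fn(model_dir)
    return train_fn(model_dir)


with tempfile.TemporaryDirectory() as d:

    def fake_train(path):
        # Stand-in for model.train() followed by model.save(path)
        open(os.path.join(path, "model.pt"), "w").close()
        return "trained"

    def fake_load(path):
        # Stand-in for scvi.model.SCVI.load(path, adata=adata)
        return "loaded"

    first = load_or_train(d, fake_load, fake_train)
    second = load_or_train(d, fake_load, fake_train)
    print(first, second)  # trained loaded
```

In real use, load_fn would wrap the load call shown above and train_fn would create, train, and save the model.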

Batch Correction and Integration

Integrate datasets across batches or studies:

python

# Register batch information
scvi.model.SCVI.setup_anndata(adata, batch_key="study")

# Model automatically learns batch-corrected representations
model = scvi.model.SCVI(adata)
model.train()
latent = model.get_latent_representation()  # Batch-corrected

Theoretical Foundations

scvi-tools is built on:

Variational inference: Approximate posterior distributions for scalable Bayesian inference
Deep generative models: VAE architectures that learn complex data distributions
Amortized inference: Shared neural networks for efficient learning across cells
Probabilistic modeling: Principled uncertainty quantification and statistical testing

See references/theoretical-foundations.md for detailed background on the mathematical framework.
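As a compact reference for the first three points, the generic VAE training objective (the evidence lower bound, ELBO) can be written as below. This is the standard textbook form in conventional notation, not copied from the skill's reference files; model-specific objectives such as scVI's add further terms, for example conditioning on batch.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
\;-\; \mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)
```

Here $x$ is a cell's observed counts, $z$ its latent representation, $q_\phi(z \mid x)$ the encoder network shared across cells (the amortized approximate posterior), and $p_\theta(x \mid z)$ the decoder (the generative model).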

Additional Resources

Workflows: references/workflows.md contains common workflows, best practices, hyperparameter tuning, and GPU optimization
Model References: Detailed documentation for each model category in the references/ directory
Official Documentation: https://docs.scvi-tools.org/en/stable/
Tutorials: https://docs.scvi-tools.org/en/stable/tutorials/index.html
API Reference: https://docs.scvi-tools.org/en/stable/api/index.html

Installation

bash

uv pip install scvi-tools
# For GPU support
uv pip install "scvi-tools[cuda]"  # Quotes keep shells such as zsh from expanding the brackets
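After installing, it can help to confirm that the package is visible to the interpreter you intend to use. The sketch below uses only the Python standard library. The helper name package_status is hypothetical; the names it is called with assume the import name is scvi and the distribution name is scvi-tools.

```python
from importlib import metadata, util


def package_status(import_name, dist_name):
    """Return the installed version string, or None if the package cannot be imported."""
    if util.find_spec(import_name) is None:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata found under this name
        return "unknown"


print("scvi-tools:", package_status("scvi", "scvi-tools"))
print("torch:", package_status("torch", "torch"))
```

A result of None means the package is missing from the active environment, which often points to installing into a different virtual environment than the one running the code.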

Best Practices

Use raw counts: Always provide unnormalized count data to models
Filter genes: Remove low-count genes before analysis (e.g., min_counts=3)
Register covariates: Include known technical factors (batch, donor, etc.) in setup_anndata
Feature selection: Use highly variable genes for improved performance
Model saving: Always save trained models to avoid retraining
GPU usage: Training uses an available GPU by default; pass accelerator="gpu" to model.train() to require one explicitly for large datasets
Scanpy integration: Store outputs in AnnData objects for downstream analysis

Maintainer

FreedomIntelligence Core maintainer

Source details

Full Name: FreedomIntelligence/OpenClaw-Medical-Skills
Branch: main
Path in repo: skills/scvi-tools
Topics: claude-code skills openclaw awesome clawhub openclaw-skills medical nanoclaw