Agent skill
bio-differential-expression-timeseries-de
Analyze time-series RNA-seq data using limma voom with splines, maSigPro, and ImpulseDE2. Identify genes with dynamic expression patterns. Use when analyzing time-series or longitudinal expression data.
Install this agent skill to your Project
npx add-skill https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-differential-expression-timeseries-de
SKILL.md
Version Compatibility
Reference examples tested with: DESeq2 1.42+, edgeR 4.0+, ggplot2 3.5+, limma 3.58+, scanpy 1.10+
Before using code patterns, verify installed versions match. If versions differ:
- R:
packageVersion('<pkg>')then?function_nameto verify parameters
If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
Time-Series Differential Expression
Identify genes with significant temporal expression patterns in time-course experiments.
Approaches
| Method | Best For |
|---|---|
| limma with splines | Smooth temporal patterns |
| maSigPro | Multiple time points, regression |
| ImpulseDE2 | Impulse-like patterns |
| DESeq2 LRT | Discrete time comparisons |
limma with Splines
Goal: Identify genes with smooth temporal expression patterns using flexible spline models.
Approach: Fit voom-transformed counts with natural spline basis functions in limma, testing spline coefficients for significance.
"Find genes that change over time in my RNA-seq experiment" → Model temporal expression using spline regression and test whether spline terms are significantly non-zero.
Setup
library(limma)
library(edgeR)
library(splines)
# Load count data
counts <- read.table('counts.txt', header=TRUE, row.names=1)
metadata <- read.table('metadata.txt', header=TRUE)
# metadata should have: sample, time, condition, replicate
Basic Time-Series Model
# Create DGEList
dge <- DGEList(counts=counts)
dge <- calcNormFactors(dge)
# Filter low counts
keep <- filterByExpr(dge, group=metadata$condition)
dge <- dge[keep, , keep.lib.sizes=FALSE]
# Design with natural splines
time <- metadata$time
design <- model.matrix(~ ns(time, df=3))
# voom transformation
v <- voom(dge, design, plot=TRUE)
# Fit model
fit <- lmFit(v, design)
fit <- eBayes(fit)
# Test for any temporal effect (all spline terms)
results <- topTable(fit, coef=2:4, number=Inf)
Two Conditions Over Time
# Design for condition-specific time effects
condition <- factor(metadata$condition)
time <- metadata$time
# Interaction model
design <- model.matrix(~ condition * ns(time, df=3))
v <- voom(dge, design, plot=TRUE)
fit <- lmFit(v, design)
fit <- eBayes(fit)
# Genes with different temporal patterns between conditions
# Test interaction terms
results_interaction <- topTable(fit, coef=grep(':', colnames(design)), number=Inf)
Contrasts for Specific Comparisons
# Compare time points within condition
design <- model.matrix(~ 0 + condition:factor(time))
colnames(design) <- gsub(':', '_', colnames(design))
v <- voom(dge, design)
fit <- lmFit(v, design)
# Contrast: Treated_T2 vs Treated_T0
contrast <- makeContrasts(
early_response = ConditionTreated_time2 - ConditionTreated_time0,
late_response = ConditionTreated_time6 - ConditionTreated_time0,
levels = design
)
fit2 <- contrasts.fit(fit, contrast)
fit2 <- eBayes(fit2)
results <- topTable(fit2, coef='early_response', number=Inf)
maSigPro
Goal: Identify genes with significant temporal expression profiles using two-step polynomial regression.
Approach: Apply global regression to find time-variable genes, then stepwise regression to refine significant profiles and cluster them.
Installation
BiocManager::install('maSigPro')
Two-Step Regression
library(maSigPro)
# Create experimental design
# Time, Replicate, Group columns required
edesign <- data.frame(
Time = metadata$time,
Replicate = metadata$replicate,
Control = as.numeric(metadata$condition == 'Control'),
Treatment = as.numeric(metadata$condition == 'Treatment')
)
rownames(edesign) <- metadata$sample
# Normalize counts
dge <- DGEList(counts=counts)
dge <- calcNormFactors(dge)
norm_counts <- cpm(dge, log=TRUE)
# Create design matrix for polynomial regression
design <- make.design.matrix(edesign, degree=3)
# Step 1: Global regression (find time-variable genes)
fit <- p.vector(norm_counts, design, Q=0.05, MT.adjust='BH')
# Step 2: Stepwise regression (find significant profiles)
tstep <- T.fit(fit, step.method='backward', alfa=0.05)
# Get significant genes
sigs <- get.siggenes(tstep, rsq=0.6, vars='groups')
# Visualize clusters
see.genes(sigs$sig.genes, show.fit=TRUE, dis=design$dis,
cluster.method='hclust', k=9)
Cluster Visualization
# Plot specific clusters
pdf('timeseries_clusters.pdf', width=12, height=10)
see.genes(sigs$sig.genes, show.fit=TRUE, dis=design$dis,
cluster.method='hclust', k=9,
newX11=FALSE)
dev.off()
# Get genes per cluster
cluster_genes <- sigs$sig.genes$sig.profiles
ImpulseDE2
Goal: Detect genes with transient impulse-like expression patterns (rise then fall, or vice versa).
Approach: Fit sigmoid-based impulse models to each gene and test for significant temporal dynamics.
Installation
BiocManager::install('ImpulseDE2')
Run ImpulseDE2
library(ImpulseDE2)
library(DESeq2)
# Create annotation
dfAnnotation <- data.frame(
Sample = colnames(counts),
Time = metadata$time,
Condition = metadata$condition,
Batch = metadata$batch
)
# Run ImpulseDE2
impulse_results <- runImpulseDE2(
matCountData = as.matrix(counts),
dfAnnotation = dfAnnotation,
boolCaseCtrl = TRUE,
vecConfounders = c('Batch'),
scaNProc = 4
)
# Get significant genes
sig_genes <- impulse_results$dfImpulseDE2Results[
impulse_results$dfImpulseDE2Results$padj < 0.05, ]
DESeq2 Likelihood Ratio Test
Goal: Test for any temporal effect across discrete time points without assuming a smooth curve.
Approach: Compare a full model with time terms against a reduced model using a likelihood ratio test.
library(DESeq2)
# Design with time as factor
dds <- DESeqDataSetFromMatrix(
countData = counts,
colData = metadata,
design = ~ condition + time + condition:time
)
# LRT: test if time has any effect
dds_lrt <- DESeq(dds, test='LRT', reduced = ~ condition)
results_lrt <- results(dds_lrt)
# Genes with significant temporal pattern
sig_time <- results_lrt[results_lrt$padj < 0.05 & !is.na(results_lrt$padj), ]
Visualization
Goal: Display temporal expression trajectories for top significant genes across conditions.
Approach: Plot per-gene expression over time with loess smoothing, faceted or as a grid of individual gene plots.
Expression Profiles
library(ggplot2)
plot_gene_timeseries <- function(gene, counts, metadata) {
gene_data <- data.frame(
time = metadata$time,
condition = metadata$condition,
expression = as.numeric(counts[gene, ])
)
ggplot(gene_data, aes(time, expression, color = condition, group = condition)) +
geom_point(size = 2) +
geom_smooth(method = 'loess', se = TRUE, alpha = 0.2) +
labs(title = gene, x = 'Time', y = 'log2(CPM)') +
theme_bw()
}
# Plot top genes
top_genes <- head(rownames(results)[order(results$adj.P.Val)], 9)
plots <- lapply(top_genes, plot_gene_timeseries, counts = norm_counts, metadata = metadata)
library(patchwork)
wrap_plots(plots, ncol = 3)
Heatmap with Time Order
library(pheatmap)
# Get significant genes
sig_genes <- rownames(results)[results$adj.P.Val < 0.05]
# Order samples by time
sample_order <- order(metadata$time, metadata$condition)
mat <- norm_counts[sig_genes, sample_order]
# Scale rows
mat_scaled <- t(scale(t(mat)))
# Annotation
anno_col <- data.frame(
Time = metadata$time[sample_order],
Condition = metadata$condition[sample_order],
row.names = colnames(mat)
)
pheatmap(mat_scaled,
annotation_col = anno_col,
cluster_cols = FALSE,
show_rownames = FALSE,
color = colorRampPalette(c('blue', 'white', 'red'))(100))
Complete Workflow
Goal: Run an end-to-end time-series DE analysis combining limma splines and maSigPro.
Approach: Normalize and filter counts, fit both models in parallel, then union the significant gene sets.
library(limma)
library(edgeR)
library(splines)
library(maSigPro)
# Load data
counts <- read.table('counts.txt', header=TRUE, row.names=1)
metadata <- read.table('metadata.txt', header=TRUE, row.names=1)
# Normalize
dge <- DGEList(counts=counts)
dge <- calcNormFactors(dge)
keep <- filterByExpr(dge, group=metadata$condition)
dge <- dge[keep, , keep.lib.sizes=FALSE]
norm_counts <- cpm(dge, log=TRUE)
# Method 1: limma with splines
design <- model.matrix(~ metadata$condition * ns(metadata$time, df=3))
v <- voom(dge, design, plot=TRUE)
fit <- lmFit(v, design)
fit <- eBayes(fit)
# Genes with condition-specific temporal patterns
interaction_terms <- grep(':', colnames(design))
results_limma <- topTable(fit, coef=interaction_terms, number=Inf)
sig_limma <- rownames(results_limma)[results_limma$adj.P.Val < 0.05]
# Method 2: maSigPro
edesign <- data.frame(
Time = metadata$time,
Replicate = 1:nrow(metadata),
Control = as.numeric(metadata$condition == 'Control'),
Treatment = as.numeric(metadata$condition == 'Treatment')
)
rownames(edesign) <- rownames(metadata)
design_masig <- make.design.matrix(edesign, degree=3)
fit_masig <- p.vector(norm_counts, design_masig, Q=0.05)
tstep <- T.fit(fit_masig, step.method='backward')
sigs <- get.siggenes(tstep, rsq=0.6, vars='groups')
# Combine results
all_sig <- union(sig_limma, rownames(sigs$sig.genes$sig.profiles))
cat('Significant genes:', length(all_sig), '\n')
Related Skills
- differential-expression/deseq2-basics - Standard DE analysis
- differential-expression/de-visualization - Visualize results
- differential-expression/batch-correction - Handle batch effects
- pathway-analysis/go-enrichment - Functional analysis of clusters
- temporal-genomics/circadian-rhythms - Circadian rhythm detection for time-course data
- temporal-genomics/temporal-clustering - Cluster genes by temporal expression profile
- temporal-genomics/trajectory-modeling - GAM trajectory fitting for temporal expression data
- temporal-genomics/temporal-grn - Dynamic GRN inference from bulk time-series data
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
vcf-annotator
Annotate VCF variants with VEP, ClinVar, gnomAD frequencies, and ancestry-aware context. Generates prioritised variant reports.
chemist-analyst
Analyzes events through chemistry lens using molecular structure, reaction mechanisms, thermodynamics, kinetics, and analytical techniques (spectroscopy, chromatography, mass spectrometry). Provides insights on chemical processes, material properties, reaction pathways, synthesis, and analytical methods. Use when: Chemical reactions, material analysis, synthesis planning, process optimization, environmental chemistry. Evaluates: Molecular structure, reaction mechanisms, yield, selectivity, safety, environmental impact.
bio-alignment-io
Read, write, and convert multiple sequence alignment files using Biopython Bio.AlignIO. Supports Clustal, PHYLIP, Stockholm, FASTA, Nexus, and other alignment formats for phylogenetics and conservation analysis. Use when reading, writing, or converting alignment file formats.
sleep-analyzer
分析睡眠数据、识别睡眠模式、评估睡眠质量,并提供个性化睡眠改善建议。支持与其他健康数据的关联分析。
metabolomics-workbench-database
Access NIH Metabolomics Workbench via REST API (4,200+ studies). Query metabolites, RefMet nomenclature, MS/NMR data, m/z searches, study metadata, for metabolomics and biomarker discovery.
bio-hi-c-analysis-matrix-operations
Balance, normalize, and transform Hi-C contact matrices using cooler and cooltools. Apply iterative correction (ICE), compute expected values, and generate observed/expected matrices. Use when normalizing or transforming Hi-C matrices.
Didn't find tool you were looking for?