Agent skill
tooluniverse-proteomics-analysis
Analyze mass spectrometry proteomics data including protein quantification, differential expression, post-translational modifications (PTMs), and protein-protein interactions. Processes MaxQuant, Spectronaut, DIA-NN, and other MS platform outputs. Performs normalization, statistical analysis, pathway enrichment, and integration with transcriptomics. Use when analyzing proteomics data, comparing protein abundance between conditions, identifying PTM changes, studying protein complexes, integrating protein and RNA data, discovering protein biomarkers, or conducting quantitative proteomics experiments.
Install this agent skill to your Project
npx add-skill https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/tooluniverse-proteomics-analysis
SKILL.md
Proteomics Analysis
Comprehensive analysis of mass spectrometry-based proteomics data from protein identification through quantification, differential expression, post-translational modifications, and systems-level interpretation.
When to Use This Skill
Triggers:
- User has proteomics data (MS output files)
- Questions about protein abundance or expression
- Differential protein expression analysis requests
- PTM analysis (phosphorylation, acetylation, ubiquitination)
- Protein-RNA correlation analysis
- Multi-omics integration involving proteomics
- Protein complex or interaction analysis
- Proteomics biomarker discovery
Example Questions This Skill Solves:
- "Analyze this MaxQuant output for differential protein expression"
- "Which proteins are significantly upregulated in disease vs control?"
- "Correlate protein abundance with mRNA expression"
- "What post-translational modifications change between conditions?"
- "Identify protein complexes in my co-IP MS data"
- "Which pathways are enriched in differentially expressed proteins?"
- "Find protein biomarkers for disease classification"
- "Compare protein and RNA levels to identify translation-regulated genes"
Core Capabilities
| Capability | Description |
|---|---|
| Data Import | MaxQuant, Spectronaut, DIA-NN, Proteome Discoverer, FragPipe outputs |
| Quality Control | Missing value analysis, intensity distributions, sample clustering |
| Normalization | Median, quantile, TMM, VSN normalization methods |
| Imputation | MinProb, KNN, QRILC for missing values |
| Differential Expression | Limma, DEP, MSstats for statistical testing |
| PTM Analysis | Phospho-site localization, PTM enrichment, kinase prediction |
| Protein-RNA Integration | Correlation analysis, translation efficiency |
| Pathway Enrichment | Over-representation and GSEA for protein sets |
| PPI Analysis | Protein complex detection, interaction networks via STRING/IntAct |
| Reporting | Comprehensive reports with volcano plots, heatmaps, pathway diagrams |
Workflow Overview
Input: MS Proteomics Data
|
v
Phase 1: Data Import & QC
|-- Load MaxQuant/Spectronaut/DIA-NN output
|-- Parse protein groups, intensities, modifications
|-- Quality control plots (missing values, intensity distributions)
|-- Sample correlation and PCA
|
v
Phase 2: Preprocessing
|-- Filter low-confidence proteins
|-- Handle missing values (imputation or filtering)
|-- Log-transform intensities
|-- Normalize across samples
|
v
Phase 3: Differential Expression Analysis
|-- Statistical testing (limma, t-test, ANOVA)
|-- Multiple testing correction (BH, Bonferroni)
|-- Fold change calculation
|-- Significance thresholds (p < 0.05, |log2FC| > 1)
|
v
Phase 4: PTM Analysis (if applicable)
|-- Identify modified peptides
|-- Localization probability filtering
|-- PTM site quantification
|-- Kinase-substrate prediction
|-- PTM enrichment analysis
|
v
Phase 5: Functional Enrichment
|-- Gene Ontology enrichment
|-- KEGG/Reactome pathway enrichment
|-- Protein complex enrichment (CORUM)
|-- Tissue-specific enrichment
|
v
Phase 6: Protein-Protein Interactions
|-- Query STRING for interaction networks
|-- Identify protein complexes
|-- Network clustering and modules
|-- Hub protein identification
|
v
Phase 7: Multi-Omics Integration (optional)
|-- Correlate with RNA-seq data
|-- Identify translation-regulated proteins
|-- Compare with variant/CNV data
|-- Integrate with metabolomics
|
v
Phase 8: Generate Report
|-- Summary statistics
|-- Volcano plots and heatmaps
|-- Pathway diagrams
|-- Protein network visualizations
|-- Multi-omics integration plots
Phase Details
Phase 1: Data Import & Quality Control
Objective: Load proteomics data and assess data quality.
Supported input formats:
MaxQuant (most common):
proteinGroups.txt- Protein-level quantificationevidence.txt- Peptide-level dataPhospho (STY)Sites.txt- Phosphorylation sitesmodificationSpecificPeptides.txt- Other PTMs
Spectronaut:
*_Report.tsv- Protein/peptide quantification- DIA-based quantification
DIA-NN:
report.tsv- Protein groupsreport.pr_matrix.tsv- Protein matrix
Proteome Discoverer:
*_Proteins.txt*_PSMs.txt
Data loading:
def load_maxquant_proteins(protein_groups_file):
"""
Load MaxQuant proteinGroups.txt file.
Returns:
- DataFrame with proteins as rows, samples as columns
- Metadata (protein names, gene names, sequence coverage)
"""
import pandas as pd
# Read file
df = pd.read_csv(protein_groups_file, sep='\t')
# Extract intensity columns (LFQ or raw)
intensity_cols = [col for col in df.columns if 'LFQ intensity' in col or 'Intensity ' in col]
# Create intensity matrix
intensity_matrix = df[intensity_cols].copy()
intensity_matrix.columns = [col.replace('LFQ intensity ', '').replace('Intensity ', '')
for col in intensity_cols]
# Add protein metadata
metadata = df[['Protein IDs', 'Gene names', 'Fasta headers',
'Peptides', 'Sequence coverage [%]']].copy()
return intensity_matrix, metadata
Quality Control:
- Missing value assessment:
def assess_missing_values(intensity_matrix):
"""
Calculate percentage of missing values per protein and sample.
"""
# Per protein
missing_per_protein = (intensity_matrix == 0).sum(axis=1) / intensity_matrix.shape[1]
# Per sample
missing_per_sample = (intensity_matrix == 0).sum(axis=0) / intensity_matrix.shape[0]
# Visualize
plot_missing_value_heatmap(intensity_matrix)
return missing_per_protein, missing_per_sample
- Intensity distribution:
def plot_intensity_distributions(intensity_matrix):
"""
Plot log10 intensity distributions per sample.
Check for consistent distributions across samples.
"""
import matplotlib.pyplot as plt
import numpy as np
log_intensities = np.log10(intensity_matrix.replace(0, np.nan))
# Boxplot per sample
log_intensities.plot(kind='box')
plt.ylabel('log10 Intensity')
plt.title('Intensity Distribution per Sample')
# Should see similar median and spread across samples
- Sample correlation:
def plot_sample_correlation(intensity_matrix):
"""
Calculate and visualize sample-sample correlation.
Expect: High correlation within replicates, lower between conditions.
"""
# Log-transform and remove zeros
log_data = np.log2(intensity_matrix.replace(0, np.nan))
# Correlation matrix
corr_matrix = log_data.corr(method='pearson')
# Heatmap
import seaborn as sns
sns.heatmap(corr_matrix, annot=True, cmap='RdYlBu_r', vmin=0.8, vmax=1.0)
- PCA:
def perform_pca(intensity_matrix, sample_groups):
"""
Principal component analysis for sample clustering.
"""
from sklearn.decomposition import PCA
# Prepare data (log, impute, scale)
log_data = np.log2(intensity_matrix.replace(0, np.nan))
# Simple imputation with minimum value
imputed = log_data.fillna(log_data.min().min())
# PCA
pca = PCA(n_components=2)
pca_result = pca.fit_transform(imputed.T)
# Plot with group colors
plt.scatter(pca_result[:, 0], pca_result[:, 1], c=sample_groups)
plt.xlabel(f'PC1 ({pca.explained_variance_ratio_[0]:.1%})')
plt.ylabel(f'PC2 ({pca.explained_variance_ratio_[1]:.1%})')
Phase 2: Preprocessing & Normalization
Objective: Clean data and normalize across samples for fair comparison.
Filtering:
def filter_proteins(intensity_matrix, metadata, min_valid=3):
"""
Filter out low-confidence proteins.
Criteria:
- At least 2 unique peptides (from metadata)
- At least min_valid samples with detected intensity
- Remove contaminants and reverse sequences
"""
# Filter by peptide count
valid_proteins = metadata['Peptides'] >= 2
# Filter by detection in samples
n_detected = (intensity_matrix > 0).sum(axis=1)
valid_detection = n_detected >= min_valid
# Remove contaminants (from MaxQuant)
is_contaminant = metadata['Protein IDs'].str.contains('CON__', na=False)
is_reverse = metadata['Protein IDs'].str.contains('REV__', na=False)
# Combined filter
keep = valid_proteins & valid_detection & ~is_contaminant & ~is_reverse
return intensity_matrix[keep], metadata[keep]
Missing value imputation:
def impute_missing_values(intensity_matrix, method='MinProb'):
"""
Impute missing protein intensities.
Methods:
- MinProb: Random from minimum observed + normal noise (for MNAR)
- KNN: K-nearest neighbors imputation
- QRILC: Quantile regression-based imputation
"""
if method == 'MinProb':
# Assume missing = low abundance (MNAR assumption)
min_val = intensity_matrix[intensity_matrix > 0].min().min()
width = 0.3 # Standard deviation of noise
shift = 1.8 # Downshift from minimum
# Replace zeros with random low values
imputed = intensity_matrix.copy()
missing_mask = imputed == 0
n_missing = missing_mask.sum().sum()
random_vals = np.random.normal(
loc=min_val - shift,
scale=width,
size=n_missing
)
imputed.values[missing_mask.values] = random_vals
return imputed
elif method == 'KNN':
from sklearn.impute import KNNImputer
imputer = KNNImputer(n_neighbors=5)
imputed = pd.DataFrame(
imputer.fit_transform(intensity_matrix.replace(0, np.nan)),
index=intensity_matrix.index,
columns=intensity_matrix.columns
)
return imputed
Normalization:
def normalize_intensities(intensity_matrix, method='median'):
"""
Normalize protein intensities across samples.
Methods:
- median: Divide by median intensity per sample
- quantile: Quantile normalization (same distribution)
- TMM: Trimmed mean of M-values (from edgeR)
- VSN: Variance-stabilizing normalization
"""
if method == 'median':
# Median normalization
medians = intensity_matrix.median(axis=0)
global_median = medians.median()
norm_factors = global_median / medians
normalized = intensity_matrix * norm_factors
return normalized
elif method == 'quantile':
# Quantile normalization
from sklearn.preprocessing import quantile_transform
normalized = pd.DataFrame(
quantile_transform(intensity_matrix, axis=1),
index=intensity_matrix.index,
columns=intensity_matrix.columns
)
return normalized
Phase 3: Differential Expression Analysis
Objective: Identify proteins with significant abundance changes between conditions.
Statistical testing with limma:
def differential_expression_limma(log2_intensities, group1_samples, group2_samples):
"""
Perform differential expression using limma-like approach.
Returns:
- log2 fold changes
- p-values
- adjusted p-values (BH)
"""
from scipy import stats
results = []
for protein in log2_intensities.index:
# Extract intensities for each group
group1 = log2_intensities.loc[protein, group1_samples]
group2 = log2_intensities.loc[protein, group2_samples]
# Calculate statistics
mean1 = group1.mean()
mean2 = group2.mean()
log2fc = mean2 - mean1
# t-test
t_stat, p_value = stats.ttest_ind(group1, group2, equal_var=False)
results.append({
'protein': protein,
'log2FC': log2fc,
'mean_group1': mean1,
'mean_group2': mean2,
'p_value': p_value,
't_statistic': t_stat
})
results_df = pd.DataFrame(results)
# Multiple testing correction (Benjamini-Hochberg)
from statsmodels.stats.multitest import multipletests
results_df['adj_p_value'] = multipletests(results_df['p_value'], method='fdr_bh')[1]
# Classify significance
results_df['significant'] = (
(results_df['adj_p_value'] < 0.05) &
(np.abs(results_df['log2FC']) > 1.0)
)
return results_df
Volcano plot:
def plot_volcano(de_results, title='Volcano Plot'):
"""
Visualize differential expression results.
"""
import matplotlib.pyplot as plt
plt.figure(figsize=(8, 6))
# Non-significant
non_sig = de_results[~de_results['significant']]
plt.scatter(non_sig['log2FC'], -np.log10(non_sig['p_value']),
c='gray', alpha=0.5, s=10)
# Significant
sig = de_results[de_results['significant']]
plt.scatter(sig['log2FC'], -np.log10(sig['p_value']),
c='red', alpha=0.7, s=20)
# Thresholds
plt.axhline(-np.log10(0.05), color='blue', linestyle='--', label='p=0.05')
plt.axvline(-1, color='blue', linestyle='--')
plt.axvline(1, color='blue', linestyle='--', label='|log2FC|=1')
plt.xlabel('log2 Fold Change')
plt.ylabel('-log10(p-value)')
plt.title(title)
plt.legend()
Phase 4: PTM Analysis
Objective: Analyze post-translational modifications (phosphorylation, acetylation, etc.)
Phosphoproteomics workflow:
def analyze_phosphosites(phospho_sites_file, intensity_matrix):
"""
Analyze phosphorylation site changes.
Input: MaxQuant Phospho (STY)Sites.txt
Output: Differential phosphorylation per site
"""
# Load phospho data
phospho = pd.read_csv(phospho_sites_file, sep='\t')
# Filter by localization probability
phospho_confident = phospho[phospho['Localization prob'] > 0.75]
# Extract site information
phospho_confident['site'] = (
phospho_confident['Gene names'] + '_' +
phospho_confident['Amino acid'] +
phospho_confident['Position'].astype(str)
)
# Quantification (similar to protein-level analysis)
# ... perform differential analysis ...
return phospho_results
Kinase-substrate prediction:
def predict_kinases(phospho_sites):
"""
Predict upstream kinases for phosphorylation sites.
Uses ToolUniverse PhosphoSitePlus or KEA3 tools.
"""
from tooluniverse import ToolUniverse
tu = ToolUniverse()
# For each significant phosphosite
kinase_predictions = []
for site in phospho_sites:
# Query kinase-substrate databases
# (would use actual ToolUniverse tool here)
result = tu.run_one_function({
"name": "phosphosite_plus_query", # hypothetical
"arguments": {"site": site}
})
kinase_predictions.append(result)
return kinase_predictions
Phase 5: Functional Enrichment
Objective: Interpret biological meaning of protein changes via pathway analysis.
Gene Ontology enrichment:
def pathway_enrichment_proteins(de_proteins, organism='human'):
"""
Perform pathway enrichment for differentially expressed proteins.
Uses ToolUniverse gene-enrichment skill.
"""
from tooluniverse import ToolUniverse
tu = ToolUniverse()
# Extract gene names for significant proteins
sig_proteins = de_proteins[de_proteins['significant']]
gene_list = sig_proteins['gene_name'].tolist()
# Run enrichment via ToolUniverse
enrichment = tu.run_one_function({
"name": "enrichr_enrich",
"arguments": {
"gene_list": ",".join(gene_list),
"library": "KEGG_2021_Human"
}
})
return enrichment
Protein complex enrichment:
def protein_complex_enrichment(protein_list):
"""
Test for enrichment of known protein complexes (CORUM database).
"""
# Query CORUM or use ToolUniverse
# Identify if proteins are part of known complexes
pass
Phase 6: Protein-Protein Interactions
Objective: Identify interaction networks and protein complexes.
STRING network analysis:
def build_protein_network(protein_list, confidence=0.7):
"""
Build PPI network using STRING database.
Uses ToolUniverse STRING tools.
"""
from tooluniverse import ToolUniverse
tu = ToolUniverse()
# Get interactions
interactions = tu.run_one_function({
"name": "string_get_interactions",
"arguments": {
"proteins": ",".join(protein_list),
"species": 9606, # human
"score_threshold": int(confidence * 1000)
}
})
# Build network graph
import networkx as nx
G = nx.Graph()
for interaction in interactions['data']:
G.add_edge(
interaction['protein1'],
interaction['protein2'],
score=interaction['score']
)
return G
Module detection:
def detect_protein_modules(network_graph):
"""
Identify tightly connected protein modules (complexes).
"""
from networkx.algorithms import community
# Detect communities
communities = community.greedy_modularity_communities(network_graph)
# Annotate modules with enriched functions
modules = []
for i, comm in enumerate(communities):
module_proteins = list(comm)
# Run enrichment for this module
enrichment = pathway_enrichment_proteins(module_proteins)
modules.append({
'module_id': i,
'proteins': module_proteins,
'size': len(module_proteins),
'top_function': enrichment['top_terms'][0]
})
return modules
Phase 7: Multi-Omics Integration
Objective: Integrate proteomics with transcriptomics and other omics.
Protein-RNA correlation:
def correlate_protein_rna(protein_data, rna_data, common_samples):
"""
Correlate protein and mRNA levels for each gene.
Expected: r ~ 0.4-0.6 (moderate correlation)
Discordance indicates post-transcriptional regulation
"""
from scipy.stats import spearmanr
# Find common genes
common_genes = set(protein_data.index) & set(rna_data.index)
correlations = {}
for gene in common_genes:
protein = protein_data.loc[gene, common_samples]
rna = rna_data.loc[gene, common_samples]
r, p = spearmanr(protein, rna)
correlations[gene] = {
'r': r,
'p': p,
'regulation': classify_regulation(r, protein.mean(), rna.mean())
}
return correlations
def classify_regulation(r, protein_level, rna_level):
"""
Classify regulatory mechanism based on correlation and levels.
"""
if r > 0.6 and protein_level > 0 and rna_level > 0:
return 'transcriptional_upregulation'
elif r > 0.6 and protein_level < 0 and rna_level < 0:
return 'transcriptional_downregulation'
elif r < 0.2 and protein_level > 0 and rna_level < 0:
return 'translational_upregulation'
elif r < 0.2 and protein_level < 0 and rna_level > 0:
return 'protein_degradation'
else:
return 'mixed_regulation'
Integration with multi-omics skill:
def integrate_with_multiomics(protein_data, rna_data, methylation_data):
"""
Pass proteomics data to multi-omics integration skill.
Enables comprehensive analysis across all molecular layers.
"""
# Prepare for multi-omics skill
omics_data = {
'proteomics': protein_data,
'rnaseq': rna_data,
'methylation': methylation_data
}
# Invoke multi-omics integration skill
from tooluniverse import ToolUniverse
# (Would use Skill tool to invoke tooluniverse-multi-omics-integration)
return integrated_analysis
Phase 8: Report Generation
Generate comprehensive proteomics report:
# Proteomics Analysis Report
## Dataset Summary
- **Samples**: 20 (10 disease, 10 control)
- **Proteins Identified**: 5,432
- **Proteins Quantified**: 4,987 (at least 3 samples)
- **Platform**: Orbitrap Fusion Lumos, MaxQuant 2.0
## Quality Control
- **Missing Values**: 15% average per protein
- **Sample Correlation**: 0.92-0.98 within groups
- **PCA**: Clear separation between disease and control (PC1: 35% variance)
## Differential Expression
- **Significant Proteins**: 432 (adj. p < 0.05, |log2FC| > 1)
- Upregulated: 245 proteins
- Downregulated: 187 proteins
- **Top upregulated**: MYC (log2FC=3.2), EGFR (log2FC=2.8)
- **Top downregulated**: TP53 (log2FC=-2.5), BRCA1 (log2FC=-2.1)
## Phosphoproteomics
- **Phosphosites Quantified**: 8,543
- **Differentially Phosphorylated**: 234 sites (p < 0.05)
- **Top Predicted Kinases**: CDK1, MAPK1, AKT1
## Pathway Enrichment
### Top Pathways (Upregulated)
1. **Cell Cycle** (p=1e-15) - 45 proteins, including cyclins, CDKs
2. **DNA Replication** (p=1e-12) - 23 proteins
3. **Glycolysis** (p=1e-10) - 18 proteins
### Top Pathways (Downregulated)
1. **Apoptosis** (p=1e-14) - 32 proteins, including caspases
2. **DNA Repair** (p=1e-11) - 28 proteins
3. **Oxidative Phosphorylation** (p=1e-9) - 25 proteins
## Protein Network Analysis
- **Network**: 432 nodes, 1,245 edges (STRING confidence > 0.7)
- **Modules Detected**: 8 functional modules
- Module 1: Cell cycle (85 proteins)
- Module 2: Metabolism (62 proteins)
- Module 3: Translation (48 proteins)
## Protein-RNA Correlation
- **Overall Correlation**: r = 0.54 (moderate, expected)
- **High Correlation**: 2,134 genes (r > 0.6) - transcriptional regulation
- **Low Correlation**: 456 genes (r < 0.2) - post-transcriptional regulation
- **Translation-Regulated**: 89 proteins (high protein, low RNA)
## Biological Interpretation
Disease state shows increased proliferation (MYC, cyclins) with concurrent
suppression of apoptosis and DNA repair (TP53, BRCA1). Metabolic shift toward
glycolysis evident at protein level. Post-transcriptional upregulation of
translation machinery suggests adaptation to proliferative demands.
## Potential Biomarkers
Top 10 proteins for disease classification (Random Forest AUC=0.95):
1. MYC (protein)
2. EGFR (protein)
3. CDK1 (phospho-T161)
4. TP53 (protein)
5. BRCA1 (protein)
Integration with ToolUniverse
Skills Coordinated:
| Skill | Used For | Phase |
|---|---|---|
tooluniverse-gene-enrichment |
Pathway enrichment | Phase 5 |
tooluniverse-protein-interactions |
PPI networks | Phase 6 |
tooluniverse-rnaseq-deseq2 |
RNA-seq for integration | Phase 7 |
tooluniverse-multi-omics-integration |
Cross-omics analysis | Phase 7 |
tooluniverse-target-research |
Protein annotation | Phase 8 |
Example Use Cases
Use Case 1: Cancer Proteomics
Question: "Analyze proteomics data from breast cancer vs normal tissue"
Workflow:
- Load MaxQuant proteinGroups.txt
- QC and filter (keep proteins with 2+ peptides, detected in 3+ samples)
- Impute missing, normalize by median
- Differential expression (limma): 432 significant proteins
- Pathway enrichment: Cell cycle, metabolism upregulated
- STRING network: Identify hub proteins (MYC, EGFR)
- Integrate with TCGA RNA-seq: Find translation-regulated genes
- Report: Comprehensive analysis with biomarkers
Use Case 2: Phosphoproteomics Signaling
Question: "What kinase signaling is activated in response to drug treatment?"
Workflow:
- Load Phospho (STY)Sites.txt from MaxQuant
- Filter by localization probability > 0.75
- Differential phosphorylation analysis
- Kinase prediction for significant sites
- Identify MAPK1, CDK1, AKT1 as top kinases
- Pathway enrichment: MAPK, PI3K/AKT pathways
- Report: Drug activates growth signaling
Use Case 3: Protein-RNA Integration
Question: "Which proteins are regulated post-transcriptionally?"
Workflow:
- Load proteomics (MaxQuant) and RNA-seq (DESeq2) data
- Match samples, extract common genes
- Correlate protein and RNA for each gene
- Identify low-correlation genes (r < 0.2)
- Classify: translation upregulation, protein degradation
- Enrichment: Find pathways enriched in post-transcriptional regulation
- Report: 89 translation-regulated proteins, RNA-binding proteins enriched
Quantified Minimums
| Component | Requirement |
|---|---|
| Proteins quantified | At least 500 proteins |
| Replicates | At least 3 per condition |
| Filtering | 2+ unique peptides per protein |
| Statistical test | limma or t-test with multiple testing correction |
| Pathway enrichment | At least one method (GO, KEGG, or Reactome) |
| Report | Summary, QC, DE results, pathways, visualizations |
Limitations
- Platform-specific: Optimized for MS-based proteomics (not Western blot quantification)
- Missing values: High missing rate (>50% per protein) limits statistical power
- PTM analysis: Requires enrichment protocols for comprehensive PTM profiling
- Absolute quantification: Relative abundance only (unless TMT/SILAC used)
- Protein isoforms: Typically collapsed to gene level
- Dynamic range: MS has limited dynamic range vs mRNA sequencing
References
Methods:
- MaxQuant: https://doi.org/10.1038/nbt.1511
- Limma for proteomics: https://doi.org/10.1093/nar/gkv007
- DEP workflow: https://doi.org/10.1038/nprot.2018.107
Databases:
- STRING: https://string-db.org
- PhosphoSitePlus: https://www.phosphosite.org
- CORUM: https://mips.helmholtz-muenchen.de/corum
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
vcf-annotator
Annotate VCF variants with VEP, ClinVar, gnomAD frequencies, and ancestry-aware context. Generates prioritised variant reports.
chemist-analyst
Analyzes events through chemistry lens using molecular structure, reaction mechanisms, thermodynamics, kinetics, and analytical techniques (spectroscopy, chromatography, mass spectrometry). Provides insights on chemical processes, material properties, reaction pathways, synthesis, and analytical methods. Use when: Chemical reactions, material analysis, synthesis planning, process optimization, environmental chemistry. Evaluates: Molecular structure, reaction mechanisms, yield, selectivity, safety, environmental impact.
bio-alignment-io
Read, write, and convert multiple sequence alignment files using Biopython Bio.AlignIO. Supports Clustal, PHYLIP, Stockholm, FASTA, Nexus, and other alignment formats for phylogenetics and conservation analysis. Use when reading, writing, or converting alignment file formats.
sleep-analyzer
分析睡眠数据、识别睡眠模式、评估睡眠质量,并提供个性化睡眠改善建议。支持与其他健康数据的关联分析。
metabolomics-workbench-database
Access NIH Metabolomics Workbench via REST API (4,200+ studies). Query metabolites, RefMet nomenclature, MS/NMR data, m/z searches, study metadata, for metabolomics and biomarker discovery.
bio-hi-c-analysis-matrix-operations
Balance, normalize, and transform Hi-C contact matrices using cooler and cooltools. Apply iterative correction (ICE), compute expected values, and generate observed/expected matrices. Use when normalizing or transforming Hi-C matrices.
Didn't find tool you were looking for?