Agent skill
statistical-analysis-statistical-methods
Sub-skill of statistical-analysis: Statistical Methods (+2).
Install this agent skill to your Project
npx add-skill https://github.com/vamseeachanta/workspace-hub/tree/main/.claude/skills/_archive/data/analytics/statistical-analysis/statistical-methods
Statistical Methods (+2)
Statistical Methods
Z-score method (for normally distributed data):
z_scores = (df['value'] - df['value'].mean()) / df['value'].std()
outliers = df[z_scores.abs() > 3]  # More than 3 standard deviations from the mean
IQR method (robust to non-normal distributions):
Q1 = df['value'].quantile(0.25)
Q3 = df['value'].quantile(0.75)
IQR = Q3 - Q1
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR
outliers = df[(df['value'] < lower_bound) | (df['value'] > upper_bound)]
Percentile method (simplest, but flags roughly the most extreme 2% of rows whether or not they are anomalous):
outliers = df[(df['value'] < df['value'].quantile(0.01)) |
              (df['value'] > df['value'].quantile(0.99))]
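To compare the three methods side by side, here is a minimal sketch that wraps each one as a function returning a boolean mask. The function names (`zscore_mask`, `iqr_mask`, `percentile_mask`) and the toy data are illustrative assumptions, not part of the skill itself; the thresholds are the ones used above.

```python
import pandas as pd

def zscore_mask(s: pd.Series, threshold: float = 3.0) -> pd.Series:
    """True where a value is more than `threshold` sample std devs from the mean."""
    z = (s - s.mean()) / s.std()
    return z.abs() > threshold

def iqr_mask(s: pd.Series, k: float = 1.5) -> pd.Series:
    """True where a value falls outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return (s < q1 - k * iqr) | (s > q3 + k * iqr)

def percentile_mask(s: pd.Series, lower: float = 0.01, upper: float = 0.99) -> pd.Series:
    """True where a value lies below the `lower` or above the `upper` quantile."""
    return (s < s.quantile(lower)) | (s > s.quantile(upper))

# Illustrative data (an assumption): 20 ordinary values plus one extreme value.
df = pd.DataFrame({"value": [10, 11, 12, 13, 14] * 4 + [100]})

# One column of flags per method, so disagreements are easy to inspect.
flags = pd.DataFrame({
    "zscore": zscore_mask(df["value"]),
    "iqr": iqr_mask(df["value"]),
    "percentile": percentile_mask(df["value"]),
})
print(flags.sum())               # number of rows flagged by each method
print(df[flags.all(axis=1)])     # rows flagged by every method
```

Returning masks rather than filtered frames keeps the original rows intact, which fits the advice below about investigating outliers before removing anything.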
Handling Outliers
Do NOT automatically remove outliers. Instead:
- Investigate: Is this a data error, a genuine extreme value, or a different population?
- Data errors: Fix or remove (e.g., negative ages, timestamps in the year 1970)
- Genuine extremes: Keep them but consider using robust statistics (median instead of mean)
- Different population: Segment them out for separate analysis (e.g., enterprise vs. SMB customers)
Report what you did: "We excluded 47 records (0.3%) with transaction amounts >$50K, which represent bulk enterprise orders analyzed separately."
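The workflow above (flag, segment, report) could be sketched as follows. The column names, the sample amounts, and the $50K cutoff are hypothetical and chosen only to mirror the example sentence; real thresholds should come from investigating the data.

```python
import pandas as pd

# Hypothetical transaction data; values and column names are illustrative.
df = pd.DataFrame(
    {"amount": [120.0, 80.0, 95.0, 60_000.0, 110.0, 75_000.0, 90.0, 105.0]}
)

threshold = 50_000  # assumed cutoff for bulk enterprise orders

# Flag instead of deleting, so the decision stays visible and reversible.
df["is_bulk"] = df["amount"] > threshold

# Segment the different population out for separate analysis.
main = df[~df["is_bulk"]]
bulk = df[df["is_bulk"]]

# Robust vs. non-robust summaries on the full data show how much the
# extreme values pull the mean while leaving the median nearly unchanged.
mean_all = df["amount"].mean()
median_all = df["amount"].median()

# Report what was done, with counts and percentages.
n_excluded = len(bulk)
pct_excluded = 100 * n_excluded / len(df)
report = (
    f"We excluded {n_excluded} records ({pct_excluded:.1f}%) with "
    f"transaction amounts >${threshold / 1000:.0f}K, analyzed separately."
)
print(report)
print(f"mean={mean_all:.2f}, median={median_all:.2f}")
```

Generating the report sentence from the same variables used for filtering helps keep the stated counts consistent with what was actually excluded.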
Time Series Anomaly Detection
For detecting unusual values in a time series:
- Compute an expected value for each point (e.g., a moving average or the same period last year)
- Compute each observation's deviation from its expected value (the residual)
- Flag deviations beyond a threshold (typically 2-3 standard deviations of the residuals)
- Distinguish between point anomalies (single unusual value) and change points (sustained shift)
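The steps above can be sketched with a trailing moving average as the expected value. This is one possible implementation under stated assumptions: the function name `flag_point_anomalies`, the 7-point window, the 3-sigma threshold, and the synthetic series are all illustrative choices, not prescribed by the skill.

```python
import pandas as pd

def flag_point_anomalies(
    series: pd.Series, window: int = 7, n_sigmas: float = 3.0
) -> pd.DataFrame:
    """Flag values that deviate strongly from a trailing moving average."""
    # Expected value: mean of the previous `window` points. shift(1) keeps
    # the current point out of its own baseline.
    expected = series.rolling(window, min_periods=window).mean().shift(1)
    residual = series - expected
    # Standard deviation of all residuals. Note that large anomalies
    # inflate this estimate, which makes the threshold more conservative.
    sigma = residual.std()
    # NaN residuals (the first `window` points) compare as False.
    is_anomaly = residual.abs() > n_sigmas * sigma
    return pd.DataFrame(
        {"value": series, "expected": expected,
         "residual": residual, "is_anomaly": is_anomaly}
    )

# Synthetic daily series (an assumption): a small repeating pattern
# around 100 with one injected spike.
index = pd.date_range("2024-01-01", periods=60, freq="D")
ts = pd.Series([100.0 + (i % 3) for i in range(60)], index=index)
ts.iloc[40] = 150.0

result = flag_point_anomalies(ts)
print(result[result["is_anomaly"]])
```

A single flagged point with neighbors back near the baseline suggests a point anomaly. A run of consecutive residuals with the same sign suggests a change point instead, and is usually better handled by re-estimating the baseline than by flagging each point individually.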