Agent skill
numpy-statistics
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/numpy-statistics
SKILL.md
Overview
NumPy provides a suite of statistical functions for summarizing data. Key capabilities include calculating central tendencies, dispersion, and relationships between variables, with specific handling for missing values (NaNs).
When to Use
- Summarizing experimental data (mean, median, standard deviation).
- Visualizing data distributions via histogram counts and binning.
- Identifying relationships between multiple variables using correlation matrices.
- Analyzing datasets with missing values, where standard aggregations would return `NaN` instead of a usable result.
Decision Tree
- Does your data contain `NaN`?
  - Yes: Use `nan`-prefixed functions (e.g., `np.nanmean`).
  - No: Use standard functions (e.g., `np.mean`).
- Creating a histogram?
  - Need normalized area? Set `density=True`.
  - Equal-width bins? Provide an integer for `bins`. Custom edges? Provide an array of bin edges.
- Checking correlation?
  - Use `np.corrcoef`. Note: floating-point rounding can push coefficients slightly outside [-1, 1]; NumPy clips the returned values to that interval.
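The first two branches of the decision tree can be sketched as follows. This is a minimal illustration, not part of the skill's scripts: `robust_mean` is a hypothetical helper name, and the sample arrays are made up for the example.

```python
import numpy as np

def robust_mean(values):
    """Hypothetical helper: use np.nanmean only when NaNs are present."""
    arr = np.asarray(values, dtype=float)
    if np.isnan(arr).any():
        return np.nanmean(arr)  # ignores NaN entries
    return np.mean(arr)         # standard mean for clean data

clean = [1.0, 2.0, 3.0]
messy = [1.0, np.nan, 3.0]

mean_clean = robust_mean(clean)
mean_messy = robust_mean(messy)
plain_messy = np.mean(messy)    # propagates NaN

# An integer for bins requests that many equal-width bins over the data range.
data = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
counts, edges = np.histogram(data, bins=4)
bin_widths = np.diff(edges)
```

In practice, calling `np.nanmean` directly on clean data gives the same result as `np.mean`, so the explicit check mainly serves to make the choice visible.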
Workflows
- Robust Mean Calculation
  - Identify an array with potential missing values (NaNs).
  - Calculate the mean using `np.nanmean(arr)`.
  - Optionally use `np.nanstd(arr)` to find the standard deviation of the valid subset.
- Custom Histogram Creation
  - Define a set of non-uniform bin edges `[0, 5, 10, 50, 100]`.
  - Pass the data and edges to `np.histogram(data, bins=edges)`.
  - Retrieve the counts and the validated edges for plotting.
- Inter-Variable Correlation Analysis
  - Stack multiple data variables into a 2D array (rows as variables).
  - Execute `np.corrcoef(data)`.
  - Inspect the off-diagonal elements for Pearson correlation strengths.
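The three workflows above can be sketched end to end as below. The sample values and variable names are assumptions chosen for illustration; only the bin edges `[0, 5, 10, 50, 100]` come from the workflow text.

```python
import numpy as np

# Workflow 1: robust mean and standard deviation with a missing value.
arr = np.array([2.0, 4.0, np.nan, 6.0])
mean = np.nanmean(arr)
std = np.nanstd(arr)  # population std (ddof=0) of the non-NaN entries

# Workflow 2: histogram with non-uniform bin edges.
data = np.array([1, 3, 7, 20, 45, 60, 99])
edges = [0, 5, 10, 50, 100]
counts, returned_edges = np.histogram(data, bins=edges)

# Workflow 3: correlation matrix, one variable per row.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0   # perfectly positively correlated with x
z = -x              # perfectly negatively correlated with x
stacked = np.vstack([x, y, z])
corr = np.corrcoef(stacked)

print(mean, std)
print(counts, returned_edges)
print(corr)
```

Note that `np.nanstd` defaults to `ddof=0`; pass `ddof=1` if a sample standard deviation is wanted.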
Non-Obvious Insights
- NaN Sensitivity: Standard statistical functions return `NaN` if even one element is `NaN`; the `nan` versions are essential for real-world messy data.
- Histogram Density: The `density=True` flag ensures the integral over the histogram is 1, not that the sum of the returned values is 1 (unless bin widths are 1).
- Precision Clipping: Correlation coefficients can occasionally drift outside [-1, 1] due to floating-point rounding; NumPy mitigates this by clipping `corrcoef` results to that interval.
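The density and clipping points can be checked numerically. The sketch below uses made-up sample data and a seeded random generator as assumptions; it shows that with `density=True` the area (value times bin width) sums to 1 while the raw values do not, and that `np.corrcoef` output stays within [-1, 1].

```python
import numpy as np

data = np.array([1, 3, 7, 20, 45, 60, 99])
edges = np.array([0, 5, 10, 50, 100])

density, _ = np.histogram(data, bins=edges, density=True)
widths = np.diff(edges)

area = np.sum(density * widths)  # integral over the histogram
total = density.sum()            # plain sum of the density values

# Correlation of a few random variables (rows), seeded for repeatability.
rng = np.random.default_rng(0)
samples = rng.normal(size=(3, 50))
corr = np.corrcoef(samples)
within_bounds = bool(np.all(np.abs(corr) <= 1.0))

print(area, total, within_bounds)
```

Because the bin widths here differ from 1, `total` is not a probability mass; multiply by `widths` whenever per-bin probabilities are needed.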
Evidence
- "nanmean... Compute the arithmetic mean along the specified axis, ignoring NaNs." Source
- "Note that the sum of the histogram values will not be equal to 1 unless bins of unity width are chosen; it is not a probability mass function." Source
Scripts
- `scripts/numpy-statistics_tool.py`: Computes robust statistics and custom histograms.
- `scripts/numpy-statistics_tool.js`: Basic mean calculator.
Dependencies
- `numpy` (Python)
References
- references/README.md