Agent skill

data-validation-data-quality-checks

Sub-skill of data-validation: Data Quality Checks (+3).

Stars 4
Forks 4

Install this agent skill to your Project

npx add-skill https://github.com/vamseeachanta/workspace-hub/tree/main/.claude/skills/_archive/data/analytics/data-validation/data-quality-checks

SKILL.md

Data Quality Checks (+3)

Data Quality Checks

  • Source verification: Confirmed which tables/data sources were used. Are they the right ones for this question?
  • Freshness: Data is current enough for the analysis. Noted the "as of" date.
  • Completeness: No unexpected gaps in time series or missing segments.
  • Null handling: Checked null rates in key columns. Nulls are handled appropriately (excluded, imputed, or flagged).
  • Deduplication: Confirmed no double-counting from bad joins or duplicate source records.
  • Filter verification: All WHERE clauses and filters are correct. No unintended exclusions.

Calculation Checks

  • Aggregation logic: GROUP BY includes all non-aggregated columns. Aggregation level matches the analysis grain.
  • Denominator correctness: Rate and percentage calculations use the right denominator. Denominators are non-zero.
  • Date alignment: Comparisons use the same time period length. Partial periods are excluded or noted.
  • Join correctness: JOIN types are appropriate (INNER vs LEFT). Many-to-many joins haven't inflated counts.
  • Metric definitions: Metrics match how stakeholders define them. Any deviations are noted.
  • Subtotals sum: Parts add up to the whole where expected. If they don't, explain why (e.g., overlap).

Reasonableness Checks

  • Magnitude: Numbers are in a plausible range. Revenue isn't negative. Percentages are between 0-100%.
  • Trend continuity: No unexplained jumps or drops in time series.
  • Cross-reference: Key numbers match other known sources (dashboards, previous reports, finance data).
  • Order of magnitude: Total revenue is in the right ballpark. User counts match known figures.
  • Edge cases: What happens at the boundaries? Empty segments, zero-activity periods, new entities.

Presentation Checks

  • Chart accuracy: Bar charts start at zero. Axes are labeled. Scales are consistent across panels.
  • Number formatting: Appropriate precision. Consistent currency/percentage formatting. Thousands separators where needed.
  • Title clarity: Titles state the insight, not just the metric. Date ranges are specified.
  • Caveat transparency: Known limitations and assumptions are stated explicitly.
  • Reproducibility: Someone else could recreate this analysis from the documentation provided.

Expand your agent's capabilities with these related and highly-rated skills.

Didn't find tool you were looking for?

Be as detailed as possible for better results