Agent skill
data-analysis
Analyze datasets to extract insights, identify patterns, and generate reports. Use when exploring data, creating visualizations, or performing statistical analysis. Handles CSV, JSON, SQL queries, and Python pandas operations.
Install this agent skill in your project
npx add-skill https://github.com/aiskillstore/marketplace/tree/main/skills/supercent-io/data-analysis
Metadata
Additional technical details for this skill
- tags: data, analysis, pandas, statistics, visualization, csv, sql
- platforms: Claude, ChatGPT, Gemini
SKILL.md
Data Analysis
When to use this skill
- Data exploration: Understand a new dataset
- Report generation: Derive data-driven insights
- Quality validation: Check data consistency
- Decision support: Make data-driven recommendations
Instructions
Step 1: Load and explore data
Python (Pandas):
import pandas as pd
import numpy as np
# Load CSV
df = pd.read_csv('data.csv')
# Basic info (df.info() prints directly and returns None, so it is not wrapped in print)
df.info()
print(df.describe())
print(df.head(10))
# Check missing values
print(df.isnull().sum())
# Data types
print(df.dtypes)
SQL:
-- Inspect table schema (DESCRIBE is MySQL syntax; other engines use different commands)
DESCRIBE table_name;
-- Sample data
SELECT * FROM table_name LIMIT 10;
-- Basic stats
SELECT
    COUNT(*) AS total_rows,
    COUNT(DISTINCT column_name) AS unique_values,
    MIN(numeric_column) AS min_val,
    MAX(numeric_column) AS max_val,
    AVG(numeric_column) AS avg_val
FROM table_name;
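The pandas exploration checks above can be gathered into a single per-column summary. The following is a minimal sketch rather than part of the original skill; the helper name `profile_columns` and the sample data are hypothetical.

```python
import pandas as pd


def profile_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, missing count, missing share, distinct count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isnull().sum(),
        "missing_pct": (df.isnull().mean() * 100).round(1),
        "unique": df.nunique(),  # nunique ignores missing values by default
    })


# Hypothetical sample standing in for pd.read_csv('data.csv')
sample = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "value": [10.0, None, 30.0, 40.0],
    "category": ["a", "b", "a", None],
})
profile = profile_columns(sample)
print(profile)
```

Sorting the result by `missing_pct` is one way to decide which columns need attention first in the cleaning step.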
Step 2: Data cleaning
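# Handle missing values
# Assign the result back: calling fillna(..., inplace=True) on a selected column
# is chained assignment, which recent pandas versions warn about and may not apply to df
df['column'] = df['column'].fillna(df['column'].mean())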
df.dropna(subset=['required_column'], inplace=True)
# Remove duplicates
df.drop_duplicates(inplace=True)
# Type conversions
df['date'] = pd.to_datetime(df['date'])
df['category'] = df['category'].astype('category')
# Remove outliers (IQR method)
Q1 = df['value'].quantile(0.25)
Q3 = df['value'].quantile(0.75)
IQR = Q3 - Q1
df = df[(df['value'] >= Q1 - 1.5*IQR) & (df['value'] <= Q3 + 1.5*IQR)]
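The IQR filter above can also be written so that the raw data stays untouched and outliers are flagged before they are dropped. This is a sketch under assumptions: the function name `iqr_bounds`, the `is_outlier` column, and the sample values are hypothetical.

```python
import pandas as pd


def iqr_bounds(series: pd.Series, k: float = 1.5) -> tuple[float, float]:
    """Return the (lower, upper) fences used by the IQR outlier rule."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Hypothetical sample with one obvious outlier
raw = pd.DataFrame({"value": [10, 12, 11, 13, 12, 11, 100]})

clean = raw.copy()  # work on a copy so the raw data is preserved
low, high = iqr_bounds(clean["value"])

# Flag first, then filter; note that missing values would also be flagged here,
# so handle them before this step
clean["is_outlier"] = ~clean["value"].between(low, high)
filtered = clean[~clean["is_outlier"]]

print(low, high)
print(len(raw), len(filtered))
```

Keeping the flag column makes it possible to report how many rows were excluded and why, instead of silently dropping them.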
Step 3: Statistical analysis
# Descriptive statistics
print(df['numeric_column'].describe())
# Grouped analysis
grouped = df.groupby('category').agg({
    'value': ['mean', 'sum', 'count'],
    'other': 'nunique'
})
print(grouped)
# Correlation
correlation = df[['col1', 'col2', 'col3']].corr()
print(correlation)
# Pivot table
pivot = pd.pivot_table(
    df,
    values='sales',
    index='region',
    columns='month',
    aggfunc='sum'
)
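The dictionary form of `agg` shown in the grouped analysis returns two-level column labels when a column gets several functions. If flat column names are easier to report, pandas named aggregation is one alternative. A small sketch with hypothetical sample data and output column names:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "category": ["a", "a", "b", "b", "b"],
    "value": [10, 20, 5, 15, 10],
    "other": ["x", "y", "x", "x", "z"],
})

# Named aggregation: output_name=(source_column, function)
summary = df.groupby("category").agg(
    value_mean=("value", "mean"),
    value_sum=("value", "sum"),
    value_count=("value", "count"),
    other_unique=("other", "nunique"),
)
print(summary)
```

The result has one row per category and single-level columns, which can be written to CSV or a report table without flattening.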
Step 4: Visualization
import matplotlib.pyplot as plt
import seaborn as sns
# Histogram
plt.figure(figsize=(10, 6))
df['value'].hist(bins=30)
plt.title('Distribution of Values')
plt.savefig('histogram.png')
# Boxplot
plt.figure(figsize=(10, 6))
sns.boxplot(x='category', y='value', data=df)
plt.title('Value by Category')
plt.savefig('boxplot.png')
# Heatmap (correlation)
plt.figure(figsize=(10, 8))
sns.heatmap(correlation, annot=True, cmap='coolwarm')
plt.title('Correlation Matrix')
plt.savefig('heatmap.png')
# Time series
plt.figure(figsize=(12, 6))
df.groupby('date')['value'].sum().plot()
plt.title('Time Series of Values')
plt.savefig('timeseries.png')
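When charts are generated in a script or by an agent without a display, it can help to select a non-interactive backend and close each figure after saving. The sketch below assumes matplotlib's `Agg` backend is acceptable; the helper name `save_histogram`, the axis labels, and the temporary output path are hypothetical.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display required
import matplotlib.pyplot as plt
import pandas as pd


def save_histogram(series, path, bins=30, title="Distribution of Values"):
    """Draw a histogram of a numeric Series, save it to path, and close the figure."""
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.hist(series.dropna(), bins=bins)
    ax.set_title(title)
    ax.set_xlabel(series.name or "value")
    ax.set_ylabel("count")
    fig.savefig(path, dpi=150, bbox_inches="tight")
    plt.close(fig)  # release the figure so repeated calls do not accumulate memory
    return path


# Hypothetical sample data and output location
values = pd.Series([1, 2, 2, 3, 3, 3, None], name="value")
out_path = os.path.join(tempfile.mkdtemp(), "histogram.png")
save_histogram(values, out_path, bins=3)
print(out_path)
```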
Step 5: Derive insights
# Top/bottom analysis
top_10 = df.nlargest(10, 'value')
bottom_10 = df.nsmallest(10, 'value')
# Trend analysis
df['month'] = df['date'].dt.to_period('M')
monthly_trend = df.groupby('month')['value'].sum()
growth = monthly_trend.pct_change() * 100
# Segment analysis
segments = df.groupby('segment').agg({
    'revenue': 'sum',
    'customers': 'nunique',
    'orders': 'count'
})
segments['avg_order_value'] = segments['revenue'] / segments['orders']
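To make the trend calculation concrete, here is a small worked example with hypothetical dates and values. It follows the same steps as the trend analysis above and additionally rounds the growth rate to two decimals for readability.

```python
import pandas as pd

# Hypothetical sample: two January rows, one February row, one March row
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-02-10", "2024-03-15"]),
    "value": [100, 100, 250, 200],
})

df["month"] = df["date"].dt.to_period("M")
monthly_trend = df.groupby("month")["value"].sum()

# Month-over-month change in percent; the first month has no prior month, so it is NaN
growth = (monthly_trend.pct_change() * 100).round(2)

print(monthly_trend)
print(growth)
```

Here the monthly totals are 200, 250, and 200, so growth is 25% from January to February and -20% from February to March.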
Output format
Analysis report structure
# Data Analysis Report
## 1. Dataset overview
- Dataset: [name]
- Records: X,XXX
- Columns: XX
- Date range: YYYY-MM-DD ~ YYYY-MM-DD
## 2. Key findings
- Insight 1
- Insight 2
- Insight 3
## 3. Statistical summary
| Metric | Value |
|------|-----|
| Mean | X.XX |
| Median | X.XX |
| Std dev | X.XX |
## 4. Recommendations
1. [Recommendation 1]
2. [Recommendation 2]
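The template above can be filled in programmatically. The following is a minimal sketch, assuming a DataFrame with one numeric column and one datetime column; the function name `render_report`, its parameters, and the sample content are hypothetical.

```python
import pandas as pd


def render_report(df, name, value_col, date_col, findings, recommendations):
    """Fill the report template with values computed from df and return it as text."""
    stats = df[value_col].agg(["mean", "median", "std"])
    start, end = df[date_col].min(), df[date_col].max()
    lines = [
        "# Data Analysis Report",
        "## 1. Dataset overview",
        f"- Dataset: {name}",
        f"- Records: {len(df):,}",
        f"- Columns: {df.shape[1]}",
        f"- Date range: {start:%Y-%m-%d} ~ {end:%Y-%m-%d}",
        "## 2. Key findings",
        *[f"- {item}" for item in findings],
        "## 3. Statistical summary",
        "| Metric | Value |",
        "|------|-----|",
        f"| Mean | {stats['mean']:.2f} |",
        f"| Median | {stats['median']:.2f} |",
        f"| Std dev | {stats['std']:.2f} |",
        "## 4. Recommendations",
        *[f"{i}. {item}" for i, item in enumerate(recommendations, 1)],
    ]
    return "\n".join(lines)


# Hypothetical sample data and report content
sample = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
    "value": [10, 20, 30],
})
report = render_report(
    sample,
    name="sample",
    value_col="value",
    date_col="date",
    findings=["Values rise steadily over the period"],
    recommendations=["Collect more data before forecasting"],
)
print(report)
```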
Best practices
- Understand the data first: Learn structure and meaning before analysis
- Incremental analysis: Move from simple to complex analyses
- Use visualization: Use a variety of charts to spot patterns
- Validate assumptions: Always verify assumptions about the data
- Reproducibility: Document analysis code and results
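Validating assumptions can be made explicit by encoding each one as a named check. This is a sketch only: the function name `check_assumptions`, the column names `id` and `value`, and the specific rules are hypothetical and should be replaced with the assumptions that apply to the dataset at hand.

```python
import pandas as pd


def check_assumptions(df: pd.DataFrame) -> dict:
    """Return a mapping of assumption name to True (holds) or False (violated)."""
    return {
        "id_unique": bool(df["id"].is_unique),
        "id_not_null": bool(df["id"].notna().all()),
        "value_non_negative": bool((df["value"].dropna() >= 0).all()),
    }


# Hypothetical sample with a duplicated id and a negative value
sample = pd.DataFrame({"id": [1, 2, 2], "value": [5.0, -1.0, 3.0]})
results = check_assumptions(sample)
failed = [name for name, ok in results.items() if not ok]
print(failed)
```

Recording the failed checks alongside the analysis output also supports the reproducibility practice listed above.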
Constraints
Required rules (MUST)
- Preserve raw data (work on a copy)
- Document the analysis process
- Validate results
Prohibited (MUST NOT)
- Do not expose sensitive personal data
- Do not draw unsupported conclusions
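One possible way to respect the first required rule and the first prohibition together is to derive a separate reporting frame that omits sensitive columns while leaving the raw data unchanged. The column names in `SENSITIVE_COLUMNS` and the helper `prepare_for_report` are hypothetical; which fields count as sensitive depends on the dataset and applicable policy.

```python
import pandas as pd

# Hypothetical list of sensitive column names
SENSITIVE_COLUMNS = ["email", "phone"]


def prepare_for_report(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a new DataFrame without sensitive columns; raw is not modified."""
    present = [col for col in SENSITIVE_COLUMNS if col in raw.columns]
    return raw.drop(columns=present)  # drop returns a new object by default


# Hypothetical sample data
raw = pd.DataFrame({
    "customer_id": [1, 2],
    "email": ["a@example.com", "b@example.com"],
    "revenue": [120.0, 80.0],
})
shareable = prepare_for_report(raw)
print(list(shareable.columns))
```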
References
Examples
Example 1: Basic usage
Example 2: Advanced usage