# Sweetviz
Automated EDA comparison reports with target analysis, feature comparison, and HTML report generation for pandas DataFrames.
Install this agent skill to your project:
npx add-skill https://github.com/vamseeachanta/workspace-hub/tree/main/.claude/skills/data/analysis/sweetviz
## When to Use This Skill
USE Sweetviz when:
- Dataset comparison - Comparing train vs test, before vs after, or any two datasets
- Target variable analysis - Understanding how features relate to a target
- Quick EDA reports - Need comprehensive EDA in one line of code
- Feature comparison - Analyzing feature distributions across subsets
- HTML reports - Creating shareable, interactive analysis reports
- Intra-set analysis - Comparing subpopulations within a dataset
- Data validation - Checking for data drift between datasets
- Feature selection - Identifying important features for modeling
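Several of the use cases above map onto Sweetviz's three report builders: `analyze` for a single DataFrame, `compare` for two DataFrames, and `compare_intra` for two subpopulations of one DataFrame. The sketch below is a minimal illustration, not this skill's own pipeline. The helper names `build_report` and `build_intra_report`, the group labels, and the output paths are hypothetical. `sweetviz` is imported inside each function only so the helpers can be defined before the package is installed.

```python
def build_report(train_df, test_df=None, target=None, out_path="sweetviz_report.html"):
    """Write a Sweetviz HTML report and return the path it was written to.

    train_df / test_df are pandas DataFrames. With train_df alone the report
    profiles one dataset; with both it compares them side by side.
    `target` is an optional column name (Sweetviz expects it to be boolean
    or numerical).
    """
    import sweetviz as sv  # imported lazily; requires `pip install sweetviz`

    if test_df is None:
        report = sv.analyze(train_df, target_feat=target)
    else:
        # Each [DataFrame, label] pair sets the name shown in the report.
        report = sv.compare([train_df, "Train"], [test_df, "Test"], target_feat=target)

    # open_browser=False avoids launching a browser from scripts or CI jobs.
    report.show_html(filepath=out_path, open_browser=False)
    return out_path


def build_intra_report(df, mask, names=("Subset", "Rest"), target=None,
                       out_path="sweetviz_intra.html"):
    """Compare the rows where `mask` is True against the remaining rows of df."""
    import sweetviz as sv  # imported lazily, as above

    # `mask` is a boolean Series aligned with df; `names` labels the two groups.
    report = sv.compare_intra(df, mask, list(names), target_feat=target)
    report.show_html(filepath=out_path, open_browser=False)
    return out_path
```

For instance, `build_report(train, test, target="churned")` would write a Train vs Test comparison, where `churned` is a placeholder column name.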
DON'T USE Sweetviz when:
- Very large datasets - Over 1M rows; profile a random sample instead of the full dataset
- Streaming data - Need real-time analysis
- Deep statistical tests - Need p-values and hypothesis testing
- Custom visualizations - Specific chart requirements
- Interactive dashboards - Use Streamlit or Dash instead
- Text/NLP analysis - Use dedicated NLP tools
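For the large-dataset case, one common workaround is to profile a reproducible random sample rather than the full table. The sketch below assumes `df` is a pandas DataFrame and relies only on `len(df)` and `DataFrame.sample`. The helper name `sample_for_profiling`, the default cap (taken from the 1M-row threshold above), and the seed are illustrative choices, not Sweetviz settings.

```python
def sample_for_profiling(df, max_rows=1_000_000, seed=42):
    """Return df unchanged if it is small enough, else a random sample.

    df is expected to be a pandas DataFrame. A fixed seed keeps the sample,
    and therefore the resulting report, reproducible between runs.
    """
    if len(df) <= max_rows:
        return df
    # DataFrame.sample draws rows without replacement by default.
    return df.sample(n=max_rows, random_state=seed)
```

The sampled frame can then be passed to Sweetviz in place of the original. If the report includes a target with rare classes, check that the sample still contains enough rows of each class.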
## Prerequisites
```bash
# Basic installation
pip install sweetviz

# Using uv (recommended)
uv pip install sweetviz pandas numpy

# With Jupyter support
pip install sweetviz pandas numpy jupyter

# Verify installation
python -c "import sweetviz as sv; print(f'Sweetviz version: {sv.__version__}')"
```
### System Requirements
- Python 3.6 or higher
- pandas 0.25.3 or higher
- numpy
- matplotlib (for internal plotting)
- Modern web browser (for viewing HTML reports)
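The requirements above can also be checked from Python before generating a report. This is a minimal sketch, assuming Python 3.8+ for `importlib.metadata`. The helper names `version_tuple` and `check_requirements` are hypothetical, and the minimum versions are simply the ones listed above.

```python
import re
import sys
from importlib import metadata


def version_tuple(text):
    """Turn '0.25.3' or '2.1.0rc1' into a tuple of up to three leading integers."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def check_requirements(min_python=(3, 6), min_pandas=(0, 25, 3)):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d or higher is required" % min_python)

    # Only pandas has a minimum version in the list above; the rest must exist.
    wanted = (("pandas", min_pandas), ("numpy", None),
              ("matplotlib", None), ("sweetviz", None))
    for package, minimum in wanted:
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package} is not installed")
            continue
        if minimum is not None and version_tuple(installed) < minimum:
            problems.append(f"{package} {installed} is older than the required minimum")
    return problems
```

Calling `check_requirements()` and printing any returned messages gives a quick preflight check. The version parsing is deliberately simple and ignores pre-release ordering.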
## Complete Examples
### Example 1: ML Dataset Profiling Pipeline
```python
#!/usr/bin/env python3
"""ml_profiling_pipeline.py - Complete ML dataset profiling with Sweetviz"""
import sweetviz as sv
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from datetime import datetime
import os
```
*See sub-skills for full details.*
### Example 2: Data Quality Assessment
```python
#!/usr/bin/env python3
"""data_quality_assessment.py - Data quality assessment with Sweetviz"""
import sweetviz as sv
import pandas as pd
import numpy as np
from datetime import datetime
import os
import json
```
*See sub-skills for full details.*
### Example 3: Feature Selection Analysis
```python
#!/usr/bin/env python3
"""feature_selection_analysis.py - Feature analysis for ML with Sweetviz"""
import sweetviz as sv
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
import os
```
*See sub-skills for full details.*
## Version History
- **1.0.0** (2026-01-17): Initial release
  - Basic EDA report generation (analyze)
  - Target variable analysis
  - Dataset comparison (compare)
  - Intra-set comparison (compare_intra)
  - Feature configuration options
  - Pairwise analysis control
  - ML profiling pipeline example
  - Data quality assessment example
  - Feature selection analysis example
  - Streamlit integration
  - Data pipeline integration
  - Best practices and troubleshooting
## Resources
- **Official Documentation**: https://github.com/fbdesignpro/sweetviz
- **PyPI**: https://pypi.org/project/sweetviz/
- **Medium Article**: https://towardsdatascience.com/powerful-eda-exploratory-data-analysis-in-just-two-lines-of-code-using-sweetviz-6c943d32f34
---
**Generate powerful EDA comparison reports with Sweetviz - analyze, compare, and understand your data!**
## Sub-Skills
- [1. Basic EDA Report (Analyze)](1-basic-eda-report-analyze/SKILL.md)
- [2. Target Variable Analysis](2-target-variable-analysis/SKILL.md)
- [3. Dataset Comparison (Compare)](3-dataset-comparison-compare/SKILL.md)
- [4. Intra-set Comparison (Compare_Intra) (+1)](4-intra-set-comparison-compareintra/SKILL.md)
- [6. Pairwise Analysis Control](6-pairwise-analysis-control/SKILL.md)
- [Sweetviz with Streamlit (+1)](sweetviz-with-streamlit/SKILL.md)
- [Sweetviz in Data Pipeline](sweetviz-in-data-pipeline/SKILL.md)
- [1. Use Target Analysis for ML Projects (+4)](1-use-target-analysis-for-ml-projects/SKILL.md)
- [Common Issues](common-issues/SKILL.md)