Agent skill

sweetviz

Automated EDA comparison reports with target analysis, feature comparison, and HTML report generation for pandas DataFrames

Install this agent skill to your Project

npx add-skill https://github.com/vamseeachanta/workspace-hub/tree/main/.claude/skills/data/analysis/sweetviz

SKILL.md

# Sweetviz

## When to Use This Skill

### USE Sweetviz when:

- Dataset comparison - Comparing train vs test, before vs after, or any two datasets
- Target variable analysis - Understanding how features relate to a target
- Quick EDA reports - Need comprehensive EDA in one line of code
- Feature comparison - Analyzing feature distributions across subsets
- HTML reports - Creating shareable, interactive analysis reports
- Intra-set analysis - Comparing subpopulations within a dataset
- Data validation - Checking for data drift between datasets
- Feature selection - Identifying important features for modeling

### DON'T USE Sweetviz when:

- Very large datasets - Over 1M rows; profile a random sample rather than the full dataset
- Streaming data - Need real-time analysis
- Deep statistical tests - Need p-values and hypothesis testing
- Custom visualizations - Specific chart requirements
- Interactive dashboards - Use Streamlit or Dash instead
- Text/NLP analysis - Use dedicated NLP tools
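
For the large-dataset case, one possible approach is to cap the row count before building a report. The helper below is a sketch only: `sample_for_report`, the 100,000-row default, and the fixed seed are assumptions chosen for illustration, not Sweetviz settings.

```python
import pandas as pd


def sample_for_report(df: pd.DataFrame, max_rows: int = 100_000, seed: int = 42) -> pd.DataFrame:
    """Return df itself if it has at most max_rows rows, else a random sample of max_rows rows."""
    if len(df) <= max_rows:
        return df
    # A fixed seed keeps the sample (and therefore the report) reproducible
    return df.sample(n=max_rows, random_state=seed).reset_index(drop=True)


# Illustrative usage: a 1,000-row frame capped at 100 rows
big_df = pd.DataFrame({"value": range(1_000)})
small_df = sample_for_report(big_df, max_rows=100)
print(len(small_df))  # 100
```

The sampled frame can then be passed to `sv.analyze` in place of the full dataset. Simple random sampling can under-represent rare categories, so a stratified sample may be preferable when the target is imbalanced.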

## Prerequisites

```bash
# Basic installation
pip install sweetviz

# Using uv (recommended)
uv pip install sweetviz pandas numpy

# With Jupyter support
pip install sweetviz pandas numpy jupyter

# Verify installation
python -c "import sweetviz as sv; print(f'Sweetviz version: {sv.__version__}')"
```

### System Requirements

- Python 3.6 or higher
- pandas 0.25.3 or higher
- numpy
- matplotlib (for internal plotting)
- Modern web browser (for viewing HTML reports)
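
The requirements above can be checked programmatically. The following is a rough sketch that uses only the standard library; `version_tuple` and `check_environment` are hypothetical helper names, and the version parsing is deliberately simplistic (it ignores pre-release suffixes). Because `importlib.metadata` needs Python 3.8 or later, a successful run also implies the Python requirement is met.

```python
from importlib import metadata

# Minimum versions taken from the requirements list; packages without an entry
# only need to be installed
MINIMUMS = {"pandas": "0.25.3"}
REQUIRED = ("sweetviz", "pandas", "numpy", "matplotlib")


def version_tuple(version: str) -> tuple:
    """Keep the leading digits of each dot-separated part, e.g. '2.1.0rc1' -> (2, 1, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first part with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def check_environment() -> list:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for package in REQUIRED:
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package} is not installed")
            continue
        minimum = MINIMUMS.get(package)
        if minimum and version_tuple(installed) < version_tuple(minimum):
            problems.append(f"{package} {installed} is older than {minimum}")
    return problems


for problem in check_environment():
    print(problem)
```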

## Complete Examples

### Example 1: ML Dataset Profiling Pipeline

```python
#!/usr/bin/env python3
"""ml_profiling_pipeline.py - Complete ML dataset profiling with Sweetviz"""

import sweetviz as sv
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from datetime import datetime
import os
```

*See sub-skills for full details.*
### Example 2: Data Quality Assessment

```python
#!/usr/bin/env python3
"""data_quality_assessment.py - Data quality assessment with Sweetviz"""

import sweetviz as sv
import pandas as pd
import numpy as np
from datetime import datetime
import os
import json
```

*See sub-skills for full details.*

### Example 3: Feature Selection Analysis

```python
#!/usr/bin/env python3
"""feature_selection_analysis.py - Feature analysis for ML with Sweetviz"""

import sweetviz as sv
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
import os
```

*See sub-skills for full details.*

## Version History

- **1.0.0** (2026-01-17): Initial release
  - Basic EDA report generation (analyze)
  - Target variable analysis
  - Dataset comparison (compare)
  - Intra-set comparison (compare_intra)
  - Feature configuration options
  - Pairwise analysis control
  - ML profiling pipeline example
  - Data quality assessment example
  - Feature selection analysis example
  - Streamlit integration
  - Data pipeline integration
  - Best practices and troubleshooting

## Resources

- **Official Documentation**: https://github.com/fbdesignpro/sweetviz
- **PyPI**: https://pypi.org/project/sweetviz/
- **Medium Article**: https://towardsdatascience.com/powerful-eda-exploratory-data-analysis-in-just-two-lines-of-code-using-sweetviz-6c943d32f34

---

**Generate powerful EDA comparison reports with Sweetviz - analyze, compare, and understand your data!**

## Sub-Skills

- [1. Basic EDA Report (Analyze)](1-basic-eda-report-analyze/SKILL.md)
- [2. Target Variable Analysis](2-target-variable-analysis/SKILL.md)
- [3. Dataset Comparison (Compare)](3-dataset-comparison-compare/SKILL.md)
- [4. Intra-set Comparison (Compare_Intra) (+1)](4-intra-set-comparison-compareintra/SKILL.md)
- [6. Pairwise Analysis Control](6-pairwise-analysis-control/SKILL.md)
- [Sweetviz with Streamlit (+1)](sweetviz-with-streamlit/SKILL.md)
- [Sweetviz in Data Pipeline](sweetviz-in-data-pipeline/SKILL.md)
- [1. Use Target Analysis for ML Projects (+4)](1-use-target-analysis-for-ml-projects/SKILL.md)
- [Common Issues](common-issues/SKILL.md)

Maintainer

vamseeachanta Core maintainer

Source details

Full Name: vamseeachanta/workspace-hub
Branch: main
Path in repo: .claude/skills/data/analysis/sweetviz
