Agent skill

cudf-analytics

Use for GPU-accelerated data analysis on datasets, CSVs, or tabular data using NVIDIA cuDF. Triggers when tasks involve groupby aggregations, statistical summaries, anomaly detection, or large-scale data profiling.

Install this agent skill to your Project

npx add-skill https://github.com/langchain-ai/deepagents/tree/main/examples/nvidia_deep_agent/skills/cudf-analytics

SKILL.md

cuDF Analytics Skill

GPU-accelerated data analysis using NVIDIA RAPIDS cuDF. cuDF provides a pandas-like API that runs on NVIDIA GPUs, which can give large speedups over CPU pandas on large datasets.

When to Use This Skill

Use this skill when:

Analyzing CSV files, datasets, or tabular data
Computing statistical summaries (mean, median, std, quartiles)
Performing groupby aggregations
Detecting anomalies or outliers in data
Profiling datasets with millions of rows
Computing correlation matrices

Initialization (REQUIRED)

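Start every script with this boilerplate. It runs a real GPU computation and a device-to-host transfer rather than only checking that the import succeeds, and falls back to pandas if either step fails.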

python

import pandas as pd

try:
    import cudf
    # Smoke-test: verify GPU compute AND host transfer both work
    _test = cudf.Series([1, 2, 3])
    assert _test.sum() == 6
    assert _test.to_pandas().tolist() == [1, 2, 3]
    GPU = True
except Exception as e:
    print(f"[GPU] cudf unavailable, falling back to pandas: {e}")
    GPU = False

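def read_csv(path, **kwargs):
    """Read a CSV with cuDF when the GPU is usable, otherwise with pandas."""
    return cudf.read_csv(path, **kwargs) if GPU else pd.read_csv(path, **kwargs)

def to_pd(df):
    """Convert cuDF DataFrame/Series to pandas. Use this instead of .to_pandas() directly."""
    # Pass through anything that is already a pandas (or plain Python) object
    if not GPU or not hasattr(df, "to_pandas"):
        return df
    try:
        return df.to_pandas()
    except Exception as e:
        print(f"[GPU] .to_pandas() failed, using Arrow fallback: {e}")
        return df.to_arrow().to_pandas()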

Quick Reference

cuDF mirrors the pandas API. Common operations:

Read Data

python

df = read_csv("data.csv")

Statistical Summary

python

# Use to_pd() when you need pandas output
summary = to_pd(df[["value", "score"]].describe())

# Scalar values work directly with float()
mean_val = float(df["value"].mean())
q1 = float(df["value"].quantile(0.25))

# Correlation
corr = float(df["value"].corr(df["score"]))
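
The snippet above computes one pairwise correlation, while the skill also lists correlation matrices as a use case. A minimal sketch of a full matrix follows, assuming `DataFrame.corr()` behaves the same in cuDF and pandas (Pearson by default). The sample data and the `noise` column are hypothetical, and the fallback logic is a shortened stand-in for the initialization boilerplate.

```python
import pandas as pd

try:
    import cudf
    cudf.Series([1, 2, 3]).sum()  # smoke-test GPU compute
    GPU = True
except Exception:
    GPU = False

xdf = cudf if GPU else pd  # whichever DataFrame library is usable

# Hypothetical sample data; column names are illustrative only
df = xdf.DataFrame({
    "value": [1.0, 2.0, 3.0, 4.0, 5.0],
    "score": [2.0, 4.0, 6.0, 8.0, 10.0],
    "noise": [5.0, 1.0, 4.0, 2.0, 3.0],
})

# Pairwise Pearson correlation between every selected numeric column
corr_matrix = df[["value", "score", "noise"]].corr()

# Move the small result to pandas for printing or formatting
corr_pd = corr_matrix.to_pandas() if GPU else corr_matrix
print(corr_pd.round(3))
```

The matrix is small (one row and column per input column), so converting it to pandas for display is cheap even when the source data is large.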

Groupby Aggregation

python

result = df.groupby("category").agg({
    "revenue": ["sum", "mean", "count"],
    "quantity": ["sum", "mean"],
})
result_pd = to_pd(result)
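
Aggregating with a dict of lists yields two-level column labels such as `("revenue", "sum")`, which can be awkward in reports. The sketch below shows one way to flatten them on the pandas side, after the conversion step. It uses plain pandas and hypothetical sample data so that it runs without a GPU.

```python
import pandas as pd

# Hypothetical sample data standing in for the converted groupby input
df = pd.DataFrame({
    "category": ["a", "a", "b"],
    "revenue": [10.0, 20.0, 5.0],
    "quantity": [1, 2, 3],
})

result_pd = df.groupby("category").agg({
    "revenue": ["sum", "mean", "count"],
    "quantity": ["sum", "mean"],
})

# Join the two column levels: ("revenue", "sum") becomes "revenue_sum"
result_pd.columns = ["_".join(col) for col in result_pd.columns]

# Turn the group key back into an ordinary column
result_pd = result_pd.reset_index()
print(result_pd)
```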

Anomaly Detection (IQR Method)

python

col = "value"
Q1 = float(df[col].quantile(0.25))
Q3 = float(df[col].quantile(0.75))
IQR = Q3 - Q1
lower = Q1 - 1.5 * IQR
upper = Q3 + 1.5 * IQR
outliers = to_pd(df[(df[col] < lower) | (df[col] > upper)])
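
When profiling several columns, the same IQR rule can be wrapped in a small helper. The following is a sketch, not part of the skill itself: `iqr_outlier_counts` and its `k` parameter are hypothetical names, the sample data is invented, and the fallback logic is a shortened stand-in for the initialization boilerplate.

```python
import pandas as pd

try:
    import cudf
    cudf.Series([1, 2, 3]).sum()  # smoke-test GPU compute
    GPU = True
except Exception:
    GPU = False

xdf = cudf if GPU else pd  # whichever DataFrame library is usable


def iqr_outlier_counts(df, columns, k=1.5):
    """Count rows outside [Q1 - k*IQR, Q3 + k*IQR] for each column (hypothetical helper)."""
    counts = {}
    for col in columns:
        q1 = float(df[col].quantile(0.25))
        q3 = float(df[col].quantile(0.75))
        iqr = q3 - q1
        mask = (df[col] < q1 - k * iqr) | (df[col] > q3 + k * iqr)
        counts[col] = int(mask.sum())  # True values count as 1
    return counts


# Hypothetical sample data: "value" contains one obvious outlier
df = xdf.DataFrame({
    "value": [10.0, 11.0, 12.0, 13.0, 14.0, 100.0],
    "score": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

counts = iqr_outlier_counts(df, ["value", "score"])
print(counts)
```

Only scalar counts leave the device here, so no full DataFrame conversion is needed until specific outlier rows are requested.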

Anomaly Detection (Z-Score Method)

python

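col = "value"  # column to test
mean = float(df[col].mean())
std = float(df[col].std())  # 0 for a constant column, so check before dividing
# Flag rows more than 3 standard deviations from the mean
df["z_score"] = (df[col] - mean) / std
anomalies = to_pd(df[df["z_score"].abs() > 3])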

Filtering and Selection

python

# Filter rows
filtered = df[df["status"] == "active"]

# Select columns
subset = df[["name", "revenue", "date"]]

# Sort
sorted_df = df.sort_values("revenue", ascending=False)

# Convert to pandas for final output / iteration
result_pd = to_pd(sorted_df)

Data Type Requirements

cuDF infers column dtypes when reading data, as pandas does. Specifying dtypes explicitly avoids inference surprises and can reduce GPU memory use:

Use float32 or float64 for floating-point data
Use int32 or int64 for integer data
String columns use cuDF's string dtype automatically
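
As an illustration of the dtype guidance, the sketch below passes a `dtype` mapping to `read_csv` and casts a column afterwards with `astype`. It assumes both libraries accept a column-to-dtype dict and an in-memory text buffer. The CSV content and column names are hypothetical, and the fallback logic is a shortened stand-in for the initialization boilerplate.

```python
import io

import pandas as pd

try:
    import cudf
    cudf.Series([1, 2, 3]).sum()  # smoke-test GPU compute
    GPU = True
except Exception:
    GPU = False

reader = cudf.read_csv if GPU else pd.read_csv

# Hypothetical CSV content; a file path works the same way
csv_text = "id,value,label\n1,0.5,x\n2,1.5,y\n"

# Narrower dtypes than the 64-bit defaults halve the memory for these columns
df = reader(io.StringIO(csv_text), dtype={"id": "int32", "value": "float32"})
dtypes_on_read = {name: str(dtype) for name, dtype in df.dtypes.items()}

# Widen a column later if more precision is needed
df["value"] = df["value"].astype("float64")
print(dtypes_on_read, str(df["value"].dtype))
```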

Output Guidelines

When reporting analysis results:

Include dataset dimensions (rows x columns)
Show key statistics in formatted tables
Highlight notable patterns, trends, or anomalies
Provide both summary statistics and specific examples
Note any data quality issues (missing values, outliers)
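
The guidelines above could be turned into a small reporting helper along these lines. This is a sketch under stated assumptions: `build_report` is a hypothetical function, it expects a pandas DataFrame (for example the result of the conversion helper), and the sample data is invented.

```python
import pandas as pd


def build_report(df_pd: pd.DataFrame, name: str = "dataset") -> str:
    """Assemble a plain-text report: dimensions, statistics, missing values (hypothetical helper)."""
    rows, cols = df_pd.shape
    missing = df_pd.isna().sum()

    lines = [
        f"{name}: {rows} rows x {cols} columns",
        "",
        "Summary statistics:",
        df_pd.describe().round(2).to_string(),
        "",
        "Missing values:",
    ]
    # List only the columns that actually have gaps
    for col, n in missing.items():
        if n > 0:
            lines.append(f"  {col}: {int(n)} missing")
    if int(missing.sum()) == 0:
        lines.append("  none")
    return "\n".join(lines)


# Hypothetical sample data with one missing score
sample = pd.DataFrame({
    "value": [1.0, 2.0, 3.0, 4.0],
    "score": [10.0, None, 30.0, 40.0],
})

report = build_report(sample, name="sample")
print(report)
```

Outlier rows from the anomaly detection steps can be appended to the same report as specific examples.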

Maintainer

langchain-ai Core maintainer

Source details

Full Name: langchain-ai/deepagents
Branch: main
Path in repo: examples/nvidia_deep_agent/skills/cudf-analytics
License: MIT License
Topics: ai langgraph deepagents langchain
