Agent skill

eval-notebook

Execute .ipynb notebooks (Python, Kotlin, or any Jupyter kernel) without overwriting; return LLM-friendly JSON with outputs and errors. Use when you need to run or validate a Jupyter notebook.


Install this agent skill to your Project

npx add-skill https://github.com/geggo98/dotfiles/tree/main/modules/ai/_files/skills/notebook

SKILL.md

Notebook Evaluator

1. Purpose

Use this skill to execute Jupyter notebooks (.ipynb) safely without modifying the original file. It evaluates notebooks using their configured kernel and returns structured JSON output with execution results, captured outputs, and any errors—perfect for LLM consumption and automated testing.

Important: Run the script directly (./scripts/eval_notebook.sh). Do not prefix with bash — the script requires zsh and will fail under bash.

2. Usage Scenarios

Use this skill when:

- Validating notebook changes in a pull request
- Testing notebooks in CI/CD pipelines
- Debugging notebook execution errors
- Verifying notebook reproducibility

3. Helper Scripts

| Script | Purpose | Arguments |
| --- | --- | --- |
| `scripts/eval_notebook.sh` | Entry point that delegates to the Python evaluator | Forwards all arguments to `eval_notebook.py` |

The wrapper script enforces a global execution timeout via gtimeout (default: 15m). Pass --timeout DURATION to override it. The duration format follows GNU coreutils (e.g. 30s, 5m, 1h). This is separate from the per-notebook --timeout SECONDS option handled by the Python evaluator.
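The two timeouts use different formats: a coreutils-style duration for the wrapper, plain seconds for the evaluator. As a rough illustration of how such durations map to seconds, here is a small converter. The function name `duration_to_seconds` is hypothetical and not part of the skill; the suffixes follow GNU coreutils `timeout` (`s`, `m`, `h`, `d`, with seconds as the default).

```python
import re

# Suffix multipliers used by GNU coreutils `timeout`:
# s = seconds (default when omitted), m = minutes, h = hours, d = days.
_UNIT_SECONDS = {"": 1, "s": 1, "m": 60, "h": 3600, "d": 86400}

_DURATION_RE = re.compile(r"(\d+(?:\.\d+)?)([smhd]?)")


def duration_to_seconds(duration: str) -> float:
    """Convert a coreutils-style duration such as '15m' into seconds."""
    match = _DURATION_RE.fullmatch(duration.strip())
    if match is None:
        raise ValueError(f"not a valid duration: {duration!r}")
    number, unit = match.groups()
    return float(number) * _UNIT_SECONDS[unit]


print(duration_to_seconds("15m"))  # the wrapper default of 15m is 900.0 seconds
```

A helper like this makes it easier to check that the global limit is at least as large as the per-notebook limit before launching a long run.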

Arguments

- Required: One or more paths to .ipynb notebook files
- Optional: See CLI options below

4. CLI Options

| Option | Default | Description |
| --- | --- | --- |
| `--timeout SECONDS` | 600 | Maximum execution time per notebook |
| `--iopub-timeout SECONDS` | 30 | Timeout for IOPUB messages |
| `--fail-fast` | false | Stop on first error instead of continuing |
| `--max-output-chars N` | 4000 | Truncate outputs after N characters |
| `--max-outputs-per-cell N` | 6 | Limit outputs captured per cell |
| `--pretty` | false | Pretty-print JSON output |

Warning: Notebook cells can produce huge output, for example when rendering diagrams. Always choose sensible output limits (`--max-output-chars`, `--max-outputs-per-cell`) for individual cells.
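To make the effect of the two output limits concrete, here is a minimal sketch of capping one cell's outputs. It assumes outputs are dictionaries with an optional `text` field, as in the output format in section 6. The helper name `cap_outputs` and the truncation marker are assumptions for illustration; the real evaluator may truncate differently.

```python
def cap_outputs(outputs, max_output_chars=4000, max_outputs_per_cell=6):
    """Keep at most `max_outputs_per_cell` outputs and trim overly long text."""
    capped = []
    for output in outputs[:max_outputs_per_cell]:
        text = output.get("text")
        if isinstance(text, str) and len(text) > max_output_chars:
            # The marker below is illustrative, not the evaluator's real format.
            omitted = len(text) - max_output_chars
            text = text[:max_output_chars] + f"... [{omitted} chars truncated]"
            output = {**output, "text": text}  # copy, leave the input untouched
        capped.append(output)
    return capped


# Eight very long stdout outputs, as a chatty cell might produce.
noisy = [{"type": "stream", "name": "stdout", "text": "x" * 10_000}] * 8
result = cap_outputs(noisy)
print(len(result), len(result[0]["text"]))
```

With the defaults from the table, eight 10,000-character outputs shrink to six outputs of roughly 4,000 characters each, which keeps the JSON small enough to hand to an LLM.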

5. Examples

Basic Evaluation

```bash
./scripts/eval_notebook.sh analysis.ipynb --pretty
```

Executes the notebook and returns pretty-printed JSON with results.

Multiple Notebooks

```bash
./scripts/eval_notebook.sh notebook1.ipynb notebook2.ipynb
```

Returns an array of result objects, one per notebook.
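Code that consumes the output may want to treat both shapes uniformly. The sketch below assumes that a single notebook yields one JSON object and several notebooks yield an array, as the section titles suggest; `load_results` is a hypothetical helper name, and the sample string is made up for illustration.

```python
import json


def load_results(raw_json: str) -> list[dict]:
    """Parse evaluator output into a list, whether one notebook or many."""
    parsed = json.loads(raw_json)
    # Assumption: one notebook -> a single object, several -> an array.
    return parsed if isinstance(parsed, list) else [parsed]


# Illustrative output for two notebooks (not real evaluator output).
sample = (
    '[{"notebook": "notebook1.ipynb", "status": "ok"},'
    ' {"notebook": "notebook2.ipynb", "status": "error"}]'
)
for result in load_results(sample):
    print(result["notebook"], result["status"])
```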

Strict Evaluation

```bash
./scripts/eval_notebook.sh analysis.ipynb --fail-fast --timeout 120
```

Stops immediately on any error with a 2-minute timeout.
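When the skill is driven from another program, the invocations above can be assembled as an argument list. This is a sketch that uses only flags documented in section 4; `build_command` is a hypothetical helper, not part of the skill.

```python
def build_command(notebooks, *, timeout=None, fail_fast=False, pretty=False):
    """Assemble an argument list for the wrapper script from documented flags."""
    if not notebooks:
        raise ValueError("at least one notebook path is required")
    # Invoke the script directly; it requires zsh and must not be run via bash.
    command = ["./scripts/eval_notebook.sh", *notebooks]
    if fail_fast:
        command.append("--fail-fast")
    if timeout is not None:
        command += ["--timeout", str(timeout)]
    if pretty:
        command.append("--pretty")
    return command


print(build_command(["analysis.ipynb"], fail_fast=True, timeout=120))
# ['./scripts/eval_notebook.sh', 'analysis.ipynb', '--fail-fast', '--timeout', '120']
```

Passing the list to `subprocess.run` without `shell=True` avoids quoting problems with notebook paths that contain spaces.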

6. Output Format

Single Notebook Result

```json
{
  "notebook": "/path/to/notebook.ipynb",
  "cwd": "/path/to",
  "kernelspec": "python3",
  "status": "ok",
  "duration_ms": 1234,
  "exec_exception": null,
  "error_count": 0,
  "errors": [],
  "cells": [
    {
      "index": 0,
      "execution_count": 1,
      "source_preview": "print('hello')",
      "outputs": [
        {
          "type": "stream",
          "name": "stdout",
          "text": "hello\n"
        }
      ],
      "output_count": 1
    }
  ]
}
```
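As an illustration of reading this structure, the following sketch parses an abridged copy of the result above and builds a one-line summary. The `summarize` helper and its wording are hypothetical; only the field names come from the example.

```python
import json

# Abridged copy of the single-notebook result shown above.
result = json.loads("""
{
  "notebook": "/path/to/notebook.ipynb",
  "status": "ok",
  "duration_ms": 1234,
  "error_count": 0,
  "errors": [],
  "cells": [
    {"index": 0, "execution_count": 1, "source_preview": "print('hello')",
     "outputs": [{"type": "stream", "name": "stdout", "text": "hello\\n"}],
     "output_count": 1}
  ]
}
""")


def summarize(result: dict) -> str:
    """Build a one-line summary from the top-level result fields."""
    seconds = result["duration_ms"] / 1000
    return (
        f"{result['notebook']}: {result['status']} in {seconds:.2f}s, "
        f"{len(result['cells'])} cell(s), {result['error_count']} error(s)"
    )


print(summarize(result))
# /path/to/notebook.ipynb: ok in 1.23s, 1 cell(s), 0 error(s)
```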

Cell Output Types

| Type | Fields | Description |
| --- | --- | --- |
| `stream` | `name`, `text` | Standard output/error streams |
| `execute_result` | `mime`, `text` | Last expression result |
| `display_data` | `mime`, `text` or `image_base64_len` | Rich display (images, HTML) |
| `error` | `ename`, `evalue`, `traceback` | Exception raised by the kernel |
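A consumer of the JSON will typically branch on `type`. The sketch below turns each output into a short line using only the fields listed in the table. `describe_output` is a hypothetical helper, and reading `image_base64_len` as the length of an omitted base64 payload is an assumption based on the field name.

```python
def describe_output(output: dict) -> str:
    """Render one cell output as a short, human-readable line."""
    kind = output.get("type")
    if kind == "stream":
        return f"[{output['name']}] {output['text'].rstrip()}"
    if kind in ("execute_result", "display_data"):
        if "image_base64_len" in output:
            # Assumption: only the size of the image payload is reported.
            return f"[{output['mime']}] image, {output['image_base64_len']} base64 chars"
        return f"[{output['mime']}] {output.get('text', '').rstrip()}"
    if kind == "error":
        return f"[error] {output['ename']}: {output['evalue']}"
    return f"[unknown output type: {kind}]"


print(describe_output({"type": "stream", "name": "stdout", "text": "hello\n"}))
# [stdout] hello
```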

7. Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Success (notebook executed, may contain errors in results) |
| 1 | Script error (invalid arguments, file not found) |

Note: Cell execution errors are reported in the JSON output; the script itself succeeds if it can evaluate the notebook.
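Because exit code 0 does not imply an error-free notebook, callers need to check both the exit code and the JSON. The sketch below shows one way to separate the two cases. The names `interpret` and `run_eval` are hypothetical, and treating every nonzero exit code as a script error is an assumption that goes slightly beyond the table.

```python
import json
import subprocess


def interpret(returncode: int, stdout: str):
    """Separate script failures (nonzero exit) from cell errors inside the JSON."""
    if returncode != 0:
        return "script_error", None
    results = json.loads(stdout)
    results = results if isinstance(results, list) else [results]
    has_cell_errors = any(r.get("error_count", 0) > 0 for r in results)
    return ("cell_errors" if has_cell_errors else "ok"), results


def run_eval(notebook: str):
    """Run the wrapper directly (not via bash) and classify the outcome."""
    proc = subprocess.run(
        ["./scripts/eval_notebook.sh", notebook],
        capture_output=True,
        text=True,
    )
    return interpret(proc.returncode, proc.stdout)


# Exit code 0, yet the notebook itself reported two failing cells.
print(interpret(0, '{"status": "error", "error_count": 2}')[0])
# cell_errors
```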

8. Your Task

When processing evaluation results:

- If status=ok: Provide a concise summary of key outputs and execution time.
- If status=error:
  - List each error by cell_index with ename, evalue, and relevant traceback lines
  - Identify the most likely root cause
  - Propose the fastest verification step
  - If code changes are needed, describe them precisely
- Never overwrite the original notebook file; this skill is read-only by design.
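The error listing described above could be produced along these lines. The sketch assumes each entry in `errors` carries `cell_index`, `ename`, `evalue`, and a `traceback` list of strings; the sample output format only shows an empty `errors` array, so this shape is inferred from the wording of the task. The helper `format_errors` and the sample data are made up for illustration.

```python
def format_errors(result: dict, traceback_lines: int = 3) -> str:
    """List each error by cell index, with the tail of its traceback."""
    lines = []
    for error in result.get("errors", []):
        lines.append(f"cell {error['cell_index']}: {error['ename']}: {error['evalue']}")
        # Keep only the last lines, where the failing statement usually appears.
        for tb_line in error.get("traceback", [])[-traceback_lines:]:
            lines.append(f"    {tb_line}")
    return "\n".join(lines)


# Hypothetical failing result, assuming the error shape described above.
failing = {
    "status": "error",
    "errors": [
        {
            "cell_index": 3,
            "ename": "NameError",
            "evalue": "name 'df' is not defined",
            "traceback": ["Traceback (most recent call last)", "----> 1 df.head()"],
        }
    ],
}
print(format_errors(failing))
```

Trimming the traceback keeps the report short while preserving the lines most likely to point at the root cause.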

Maintainer

geggo98 Core maintainer

Source details

Full Name: geggo98/dotfiles
Branch: main
Path in repo: modules/ai/_files/skills/notebook
License: Creative Commons Zero v1.0 Universal
