Agent skill
jupytext
This skill should be used when the user asks to "convert notebook to text", "use jupytext", "version control notebooks", "share data between kernels", "set up multi-kernel project", "pair notebooks with Python files", "sync ipynb and py files", or needs multi-kernel projects (Python/R/Stata/SAS) with version-control-friendly notebooks.
Install this agent skill to your Project
npx add-skill https://github.com/edwinhu/workflows/tree/main/skills/jupytext
Contents
- Execution Enforcement
- Core Concepts
- Multi-Kernel Data Sharing
- Workflow Integration
- Project Structure
- Kernel Specification
- Quick Troubleshooting
- Additional Resources
- Best Practices
Jupytext Skill
Jupytext converts Jupyter notebooks to/from text formats (.py, .R, .md), enabling version control and multi-kernel workflows.
Execution Enforcement
IRON LAW: NO EXECUTION CLAIM WITHOUT OUTPUT VERIFICATION
Before claiming ANY jupytext script executed successfully, follow this sequence:
- EXECUTE using the papermill pipeline:
jupytext --to notebook --output - script.py | papermill - output.ipynb
- CHECK for execution errors (papermill exit code and stderr)
- VERIFY output.ipynb exists and is non-empty
- INSPECT outputs using notebook-debug skill verification
- CLAIM success only after verification passes
This is non-negotiable. Skipping papermill execution is NOT HELPFUL — the user gets a notebook that fails on first run.
Rationalization Table - STOP If You Think:
| Excuse | Reality | Do Instead |
|---|---|---|
| "I converted to ipynb, so it works" | Conversion ≠ execution | EXECUTE with papermill, not just convert |
| "The .py file looks correct" | Syntax correctness ≠ runtime correctness | RUN and CHECK outputs |
| "I'll let the user execute it" | You're passing broken code | VERIFY before claiming completion |
| "Just a conversion task, no execution needed" | User expects working notebook | EXECUTE to confirm it works |
| "I can use jupyter nbconvert --execute" | Papermill has better error handling | USE the recommended papermill pipeline |
| "I'll save the intermediate ipynb first" | Creates clutter | USE the recommended pipeline (no intermediate files) |
| "Exit code 0 means success" | Exit code 0 only shows that no cell raised an uncaught exception; caught errors, warnings, and wrong results still pass | CHECK output.ipynb for tracebacks and expected outputs |
Red Flags - STOP Immediately If You Think:
- "Let me just convert and return the ipynb" → NO. EXECUTE with papermill first.
- "The .py file is simple, can't have errors" → NO. Simple code fails too.
- "I'll execute without papermill" → NO. Use the recommended pipeline.
- "Conversion completed, so job done" → NO. Execution verification required.
Execution Verification Checklist
Before EVERY "notebook works" claim:
Conversion:
- Correct format specified (py:percent recommended)
- Conversion command succeeded
- No syntax errors in conversion
Execution (MANDATORY):
- Used recommended papermill pipeline:
jupytext --to notebook --output - script.py | papermill - output.ipynb
- Papermill exit code is 0
- No errors in stderr
- output.ipynb file created
- output.ipynb is non-empty (>100 bytes)
Output Verification:
- Used notebook-debug skill's verification checklist
- No tracebacks in any cell
- All cells have execution_count (not null)
- Expected outputs present (plots, dataframes, metrics)
- No unexpected warnings or errors
Multi-Kernel Projects (if applicable):
- Correct kernel specified in header
- Interchange files created (parquet/DTA)
- Downstream notebooks can read interchange files
Only after ALL checks pass:
- Claim "notebook executed successfully"
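Part of the Execution and Output Verification checks can be automated. The sketch below uses only the standard library and assumes the nbformat v4 JSON layout (a top-level `cells` list whose code cells carry `execution_count` and `outputs`). The helper name `find_notebook_problems` is hypothetical, and the check complements the notebook-debug skill rather than replacing it.

```python
import json
from pathlib import Path


def find_notebook_problems(path, min_bytes=100):
    """Return a list of human-readable problems found in an executed notebook.

    Hypothetical helper: assumes the nbformat v4 JSON layout.
    """
    nb_path = Path(path)
    if not nb_path.exists():
        return [f"{nb_path} does not exist"]
    size = nb_path.stat().st_size
    if size <= min_bytes:
        return [f"{nb_path} is {size} bytes (expected > {min_bytes})"]

    notebook = json.loads(nb_path.read_text(encoding="utf-8"))
    problems = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat allows a list of lines
            source = "".join(source)
        # Skip empty code cells, which are typically not executed
        if not source.strip():
            continue
        if cell.get("execution_count") is None:
            problems.append(f"cell {index}: execution_count is null")
        for output in cell.get("outputs", []):
            if output.get("output_type") == "error":
                problems.append(
                    f"cell {index}: {output.get('ename')}: {output.get('evalue')}"
                )
    return problems
```

Call `find_notebook_problems("output.ipynb")` after the pipeline finishes; an empty list means these mechanical checks passed, while expected plots, dataframes, and metrics still need a manual look.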
Gate Function: Jupytext Execution
Follow this sequence for EVERY jupytext task involving execution:
1. CONVERT → jupytext --to notebook --output -
2. EXECUTE → papermill - output.ipynb (with params if needed)
3. CHECK → Verify exit code and stderr
4. INSPECT → Use notebook-debug verification
5. VERIFY → Outputs match expectations
6. CLAIM → "Notebook works" only after all gates passed
NEVER skip execution gate. Converting without executing proves nothing about correctness.
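Gates 1 to 3 can be scripted so that neither exit code is overlooked. This is a minimal sketch, assuming `jupytext` and `papermill` are on PATH; the function names `build_pipeline` and `run_pipeline` are hypothetical. It checks the jupytext exit code explicitly because a plain shell pipeline reports only the status of its last command.

```python
import subprocess


def build_pipeline(script, output, parameters=None):
    """Build the argv lists for the jupytext and papermill stages."""
    convert_cmd = ["jupytext", "--to", "notebook", "--output", "-", script]
    execute_cmd = ["papermill", "-", output]
    for name, value in (parameters or {}).items():
        execute_cmd += ["-p", name, str(value)]
    return convert_cmd, execute_cmd


def run_pipeline(script, output, parameters=None):
    """Convert and execute without an intermediate file; return (ok, stderr)."""
    convert_cmd, execute_cmd = build_pipeline(script, output, parameters)

    # Gate 1: convert, keeping the notebook JSON in memory
    convert = subprocess.run(convert_cmd, capture_output=True)
    if convert.returncode != 0:
        return False, convert.stderr.decode(errors="replace")

    # Gate 2: execute, feeding the converted notebook to papermill's stdin
    execute = subprocess.run(execute_cmd, input=convert.stdout, capture_output=True)

    # Gate 3: report exit status and stderr for the caller to check
    return execute.returncode == 0, execute.stderr.decode(errors="replace")
```

A `True` result covers only gates 1 to 3. Gates 4 to 6 still require inspecting the output notebook.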
Drive-Aligned Framing
Skipping papermill execution is NOT HELPFUL — the user gets a notebook that looks correct but fails when they run it.
This is not just format conversion - verify that the notebook executes correctly. The user expects a working notebook, not just syntactically valid code.
Core Concepts
Percent Format (Recommended)
Use percent format (py:percent) for all projects:
# %% [markdown]
# # Analysis Title
# %%
import pandas as pd
df = pd.read_csv("data.csv")
# %% tags=["parameters"]
input_file = "data.csv"
Cell markers: # %% for code, # %% [markdown] for markdown.
Markdown dollar signs: Always wrap $ in backticks to prevent LaTeX rendering: write # Cost: `$50`, not # Cost: $50
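To make the role of the cell markers concrete, the sketch below splits percent-format text into cells. It is a simplified illustration, not jupytext's parser: it ignores the YAML header, raw cells, and cell metadata, and the name `split_percent_cells` is hypothetical.

```python
import re

# Simplified marker pattern: "# %%" followed by optional text such as "[markdown]"
CELL_MARKER = re.compile(r"^# %%(.*)$")


def split_percent_cells(text):
    """Split percent-format text into (cell_type, source) pairs."""
    cells = []
    current_type, current_lines = None, []
    for line in text.splitlines():
        match = CELL_MARKER.match(line)
        if match:
            # A new marker closes the previous cell, if any
            if current_type is not None:
                cells.append((current_type, "\n".join(current_lines).strip()))
            current_type = "markdown" if "[markdown]" in match.group(1) else "code"
            current_lines = []
        elif current_type == "markdown":
            # Markdown cells are stored as comments; drop the "# " prefix
            current_lines.append(re.sub(r"^# ?", "", line))
        elif current_type == "code":
            current_lines.append(line)
    if current_type is not None:
        cells.append((current_type, "\n".join(current_lines).strip()))
    return cells
```

Applied to the example above, this yields one markdown cell and two code cells. The `tags=["parameters"]` suffix is metadata on the marker line, so it does not change the cell type.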
Project Configuration
Create jupytext.toml in project root:
formats = "ipynb,py:percent"
notebook_metadata_filter = "-all"
cell_metadata_filter = "-all"
Essential Commands
# Convert notebook to percent-format Python file
jupytext --to py:percent notebook.ipynb
# Convert Python script to Jupyter notebook format
jupytext --to notebook script.py
# Enable bidirectional pairing to keep formats synchronized
jupytext --set-formats ipynb,py:percent notebook.ipynb
# Synchronize paired notebook and text file
jupytext --sync notebook.ipynb
Execution (Recommended Pattern)
Always pipe to papermill for execution - no intermediate files:
# Convert script to notebook and execute in atomic operation
jupytext --to notebook --output - script.py | papermill - output.ipynb
# Convert and execute with parameter injection
jupytext --to notebook --output - script.py | papermill - output.ipynb -p start_date "2024-01-01" -p n_samples 1000
# Convert and execute with detailed logging output
jupytext --to notebook --output - script.py | papermill - output.ipynb --log-output
# Convert and execute, writing the executed notebook to stdout instead of a file
jupytext --to notebook --output - script.py | papermill - -
Key flags:
- --output - tells jupytext to write to stdout
- papermill - output.ipynb reads from stdin, writes to file
- papermill - - reads from stdin, writes to stdout (for inspection)
Why this pattern:
- No intermediate .ipynb files cluttering the workspace
- Single atomic operation - convert and execute together
- Papermill handles parameters, logging, and error reporting
- Works in CI/CD pipelines without temp file cleanup
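For the `-p` flags in the commands above to take effect, the script needs a cell tagged `parameters`; papermill injects a new cell with the overriding values right after it. A minimal sketch of such a script, reusing the names `start_date` and `n_samples` from the command above, with default values chosen here purely for illustration:

```python
# %% tags=["parameters"]
# Defaults used when no -p flag is passed; papermill injects overrides after this cell
start_date = "2023-01-01"  # illustrative default
n_samples = 100  # illustrative default

# %%
# Later cells read the (possibly overridden) values as ordinary variables
summary = f"Sampling {n_samples} rows starting {start_date}"
print(summary)
```

Because the defaults live in the script, it also runs as plain Python without papermill, which keeps quick local checks simple.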
Debugging Runtime Errors
After execution, use notebook-debug skill to inspect tracebacks in the output ipynb.
Multi-Kernel Data Sharing
Share data between Python/R/Stata/SAS via files:
| Route | Format | Write | Read |
|---|---|---|---|
| Python -> R | Parquet | df.to_parquet() | arrow::read_parquet() |
| Python -> Stata | DTA | df.to_stata() | use "file.dta" |
| Any -> Any | CSV | Native | Native |
| SQL queries | DuckDB | Query parquet directly | Query parquet directly |
Cross-Kernel Pipeline Pattern
Python (prep) -> Parquet -> R (stats) -> Parquet -> Python (report)
|
v
Stata (.dta) -> Econometrics
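Each arrow in the pipeline above is a file handoff, so a missing or truncated interchange file breaks the next kernel. The sketch below is a cheap pre-flight check, assuming standard unencrypted Parquet files, which begin and end with the four magic bytes `PAR1`. The helper name `looks_like_parquet` is hypothetical, and the check does not validate schema or data.

```python
from pathlib import Path

PARQUET_MAGIC = b"PAR1"  # Parquet files start and end with these four bytes


def looks_like_parquet(path):
    """Return True if the file exists and carries the Parquet magic bytes.

    Catches missing, empty, or truncated interchange files before a
    downstream kernel tries to read them.
    """
    file_path = Path(path)
    # Smallest plausible file: header magic + footer length + footer magic
    if not file_path.is_file() or file_path.stat().st_size < 12:
        return False
    with file_path.open("rb") as handle:
        head = handle.read(4)
        handle.seek(-4, 2)  # 2 = seek relative to the end of the file
        tail = handle.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

Running this at the end of the producing notebook, for example on a file under data/processed/, turns a confusing read error in R or Python into an early, local failure.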
Workflow Integration
Git Pre-commit Hook
Add the following to .pre-commit-config.yaml:
repos:
  - repo: https://github.com/mwouts/jupytext
    rev: v1.16.0
    hooks:
      - id: jupytext
        args: [--sync]  # Synchronize paired formats before commit
Version Control Strategy
Choose one approach:
- Option A: Commit only .py files (add *.ipynb to .gitignore) for minimal repository size
- Option B: Commit both formats to give reviewers format choice
Editor Integration
Configure editors for automatic synchronization:
- VS Code: Install Jupytext extension for automatic bidirectional sync
- JupyterLab: Right-click notebook and select "Pair Notebook" for synchronization
Project Structure
Standard multi-kernel project layout:
project/
├── jupytext.toml # Project-wide settings
├── environment.yml # Conda env with all kernels
├── notebooks/
│ ├── 01_python_prep.py # Python percent format
│ ├── 02_r_analysis.R # R percent format
│ └── 03_stata_models.do # Stata script
├── data/
│ ├── raw/
│ └── processed/ # Parquet/DTA interchange files
└── results/
Kernel Specification
Specify kernel in file header:
# ---
# jupyter:
#   kernelspec:
#     display_name: Python 3
#     language: python
#     name: python3
# ---
# %% [markdown]
# # Python Analysis
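To confirm which kernel a script requests before executing it, the header can be scanned for the kernelspec name. The sketch below is a simplified line scan, not a YAML parser: it assumes the commented header layout shown above, with `name:` listed under `kernelspec:`, and the helper name `read_kernel_name` is hypothetical.

```python
import re


def read_kernel_name(script_text):
    """Return the kernelspec name from a jupytext header, or None if absent."""
    in_header = False
    in_kernelspec = False
    for line in script_text.splitlines():
        if line.strip() == "# ---":
            if in_header:
                break  # closing delimiter: the header has ended
            in_header = True
            continue
        if not in_header:
            continue
        body = line.lstrip("#")  # drop the comment prefix, keep indentation
        if body.strip() == "kernelspec:":
            in_kernelspec = True
            continue
        # Match "name:" only at the start of the key, so "display_name:" is skipped
        match = re.match(r"\s+name:\s*(\S+)", body)
        if in_kernelspec and match:
            return match.group(1)
    return None
```

A `None` result means the script has no kernelspec header and will run under whatever kernel the executing tool selects, which is the usual cause of the "Wrong kernel" issue.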
Quick Troubleshooting
| Issue | Solution |
|---|---|
| Sync conflict | Delete .ipynb, regenerate from .py |
| Wrong kernel | Add kernelspec header to .py file |
| Metadata noise | Set notebook_metadata_filter = "-all" |
| Cell order lost | Use percent format (preserves structure) |
Additional Resources
Reference Files
Detailed patterns and configurations:
- references/formats.md - All format specifications (percent, light, sphinx, myst, rmd, quarto), cell metadata, configuration options
- references/kernels.md - Kernel setup (IRkernel, xeus-r, stata_kernel, pystata, saspy), environment configuration, troubleshooting
- references/data-sharing.md - Cross-kernel data sharing patterns (parquet, dta, csv, duckdb), full pipeline examples, validation patterns
Example Files
Working code in examples/:
- examples/python_analysis.py - Python percent-format template with common patterns
- examples/r_analysis.R - R percent-format template for statistical analysis
- examples/cross_kernel_pipeline.py - Multi-kernel data sharing example
Scripts
Utility scripts in scripts/:
- scripts/init_project.sh - Initialize jupytext project with standard structure
- scripts/sync_all.sh - Sync all paired notebooks in project
Best Practices
- Use percent format - Best balance of readability and cell preservation
- Strip metadata for git - Use metadata filters for cleaner diffs
- Use parquet for interchange - Type-safe, cross-language compatible format
- Document kernel requirements - Include in README or environment.yml
- Enable pre-commit hooks - Ensure synchronization before commits