implement-paper-from-scratch

Guides you through implementing a research paper step-by-step from scratch. Use when asked to implement a paper, code up a paper, reproduce research results, or build a model from a paper. Focuses on building understanding through implementation with checkpoint questions.

Install this agent skill to your Project

npx add-skill https://github.com/48Nauts-Operator/opencode-baseline/tree/main/.opencode/skill/implement-paper-from-scratch

SKILL.md

Implement Paper From Scratch

The best way to truly understand a paper is to implement it. This skill guides you through that process methodically.

Philosophy

No copy-pasting from reference implementations - We build understanding, not just working code
Checkpoint questions verify understanding - You should be able to answer "why" at each step
Minimal dependencies - Use NumPy/PyTorch fundamentals, not high-level wrappers
Deliberate debugging - Bugs are learning opportunities, not obstacles

Process

Phase 1: Pre-Implementation Analysis

Before writing any code:

1. Identify the core algorithm - Strip away ablations, extensions, bells and whistles. What's the minimal version?
2. List the components - Break into modules:
   - Data pipeline
   - Model architecture
   - Loss function(s)
   - Training loop
   - Evaluation metrics
3. Find the tricky parts - What's non-obvious?
   - Custom layers or operations
   - Numerical stability concerns
   - Hyperparameter sensitivity
   - Implementation details buried in appendices
4. Gather reference numbers - What should we expect?
   - Training loss trajectory
   - Validation metrics at convergence
   - Compute requirements (if stated)

Phase 2: Scaffolded Implementation

Build up the implementation in this order:

Step 1: Data

```python
# Start with synthetic/toy data
# Verify shapes and types before touching real data
```

Checkpoint: Can you describe what each tensor represents and its expected shape?
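
As a minimal sketch of this step, the snippet below builds a toy classification set and asserts shapes and dtypes before any real data is involved. It assumes NumPy; `make_toy_classification` and its sizes are hypothetical names and values chosen for illustration, not part of any paper.

```python
import numpy as np

def make_toy_classification(n_samples=256, n_features=8, n_classes=3, seed=0):
    """Hypothetical toy dataset: one Gaussian blob per class."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=3.0, size=(n_classes, n_features))
    y = rng.integers(0, n_classes, size=n_samples)
    x = centers[y] + rng.normal(size=(n_samples, n_features))
    return x.astype(np.float32), y.astype(np.int64)

x, y = make_toy_classification()

# x: (n_samples, n_features) float features; y: (n_samples,) integer class ids
assert x.shape == (256, 8) and x.dtype == np.float32
assert y.shape == (256,) and y.dtype == np.int64
assert y.min() >= 0 and y.max() < 3
print("x", x.shape, x.dtype, "| y", y.shape, y.dtype)
```

Keeping the assertions next to the data code means a silent shape or dtype change fails immediately rather than surfacing later as a confusing loss curve.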

Step 2: Model Architecture

```python
# Build layer by layer
# Print shapes at each stage
# Verify parameter counts match the paper
```

Checkpoint: If you randomly initialize and do a forward pass, do the output shapes match what the paper describes?
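
One way to run these checks, sketched here for a small NumPy MLP rather than any particular paper's architecture. The layer sizes, the He-style initialization, and the helper names (`init_mlp`, `forward`, `count_params`) are all assumptions made for the example.

```python
import numpy as np

def init_mlp(sizes, seed=0):
    """Create (weight, bias) pairs for consecutive layer sizes."""
    rng = np.random.default_rng(seed)
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        # He-style scale: an assumption here, use the paper's scheme instead
        w = rng.normal(scale=np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
        params.append((w, np.zeros(fan_out)))
    return params

def forward(params, x, verbose=False):
    h = x
    for i, (w, b) in enumerate(params):
        h = h @ w + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)  # ReLU on hidden layers only
        if verbose:
            print(f"layer {i}: output shape {h.shape}")
    return h

def count_params(params):
    return sum(w.size + b.size for w, b in params)

sizes = [8, 16, 3]
params = init_mlp(sizes)
logits = forward(params, np.zeros((4, 8)), verbose=True)

# Hand-computed count to compare against: weights plus biases per layer
expected = sum(i * o + o for i, o in zip(sizes[:-1], sizes[1:]))
assert logits.shape == (4, 3)
assert count_params(params) == expected
```

For a real paper, `expected` would come from the reported parameter count; a mismatch usually points to a missing bias, a tied weight, or a wrong hidden size.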

Step 3: Loss Function

```python
# Implement exactly as described
# Test with known inputs/outputs
# Check gradient flow
```

Checkpoint: Can you explain each term in the loss and why it's there?
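
To make "test with known inputs" and "check gradient flow" concrete, here is a sketch using softmax cross-entropy as a stand-in loss; the paper's own loss would take its place. It relies on NumPy only, and `softmax_cross_entropy` is an illustrative name.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy over the batch; returns (loss, dloss/dlogits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    n = logits.shape[0]
    loss = -log_probs[np.arange(n), labels].mean()
    grad = np.exp(log_probs)              # softmax probabilities
    grad[np.arange(n), labels] -= 1.0     # subtract the one-hot targets
    return loss, grad / n

# Known input: uniform logits over K classes must give loss = log(K).
loss, _ = softmax_cross_entropy(np.zeros((5, 4)), np.array([0, 1, 2, 3, 0]))
assert np.isclose(loss, np.log(4))

# Gradient check: compare the analytic gradient with central differences.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 4))
labels = np.array([1, 0, 3])
_, analytic = softmax_cross_entropy(logits, labels)
numeric = np.zeros_like(logits)
eps = 1e-5
for idx in np.ndindex(*logits.shape):
    bumped = logits.copy()
    bumped[idx] += eps
    plus, _ = softmax_cross_entropy(bumped, labels)
    bumped[idx] -= 2 * eps
    minus, _ = softmax_cross_entropy(bumped, labels)
    numeric[idx] = (plus - minus) / (2 * eps)
assert np.allclose(analytic, numeric, atol=1e-6)
```

The same pattern (one input with a closed-form answer, plus a finite-difference comparison) carries over to most custom losses.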

Step 4: Training Loop

```python
# Minimal loop first (no logging, checkpointing, etc.)
# Verify loss decreases on tiny overfit test
# Then add bells and whistles
```

Checkpoint: Can you overfit a single batch? If not, something is broken.
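
The single-batch overfit test can be sketched as follows. This assumes a linear softmax classifier trained with plain gradient descent in NumPy; the batch size, feature count, step size, and step count are arbitrary choices for illustration. Labels are random on purpose, so any drop in loss is pure memorization.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))      # one tiny batch: 4 examples, 16 features
y = rng.integers(0, 3, size=4)    # random labels over 3 classes
w, b = np.zeros((16, 3)), np.zeros(3)

def loss_and_grads(w, b, x, y):
    """Softmax cross-entropy of a linear model, with gradients for w and b."""
    logits = x @ w + b
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    n = x.shape[0]
    loss = -np.log(probs[np.arange(n), y]).mean()
    dlogits = probs.copy()
    dlogits[np.arange(n), y] -= 1.0
    dlogits /= n
    return loss, x.T @ dlogits, dlogits.sum(axis=0)

# Minimal loop: no logging framework, no checkpointing, no scheduler.
losses = []
for step in range(2000):
    loss, dw, db = loss_and_grads(w, b, x, y)
    losses.append(loss)
    w -= 0.1 * dw
    b -= 0.1 * db

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

With zero-initialized weights the first loss should equal log(3), the loss of a uniform guess over three classes, and the last loss should be far below it. If the loss stalls near its starting value, check the loss, the gradients, and the update sign before adding anything else to the loop.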

Step 5: Evaluation

```python
# Implement the paper's exact metrics
# Compare against reported numbers
```

Checkpoint: On the same data split, how close are you to the paper's numbers?
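
A rough way to quantify "how close" is to run several seeds and compare the mean against the reported value. The sketch below assumes accuracy as the metric; `compare_to_paper` and the two-standard-deviation rule are illustrative conventions, and the numbers in the usage lines are placeholders, not results from any paper.

```python
import numpy as np

def accuracy(pred, target):
    """Fraction of predictions that equal the target labels."""
    return float((np.asarray(pred) == np.asarray(target)).mean())

def compare_to_paper(run_scores, reported, n_std=2.0):
    """Summarize scores across seeds and flag whether the reported
    number lies within n_std sample standard deviations of the mean."""
    scores = np.asarray(run_scores, dtype=float)
    mean, std = scores.mean(), scores.std(ddof=1)
    return {
        "mean": mean,
        "std": std,
        "gap": mean - reported,
        "within": abs(mean - reported) <= n_std * std,
    }

# Placeholder values for illustration only
summary = compare_to_paper([0.912, 0.905, 0.909], reported=0.910)
print(summary)
```

A single run tells you little; three to five seeds on the paper's exact split give a spread to judge the gap against.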

Phase 3: The Debugging Gauntlet

When it doesn't work (and it won't at first):

1. The Overfit Test
   - Can you memorize 1 example? 10? 100?
   - If not, suspect a bug in the architecture or in the gradient computation
2. The Gradient Check
   - Are gradients flowing to all parameters?
   - Are any gradients NaN or exploding?
3. The Initialization Check
   - Match the paper's initialization exactly
   - This matters more than people think: a wrong weight scale can stall or destabilize training from the first step
4. The Learning Rate Sweep
   - Log scale: 1e-5 to 1e-1
   - Loss should decrease for at least some part of that range
5. The Ablation Debug
   - Remove components until it works
   - Add them back one at a time
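
Two of these checks are easy to automate. The sketch below assumes gradients are available as a dict of NumPy arrays and that a training run can be wrapped in a function of the learning rate; `grad_report`, `lr_sweep`, and the least-squares toy problem are hypothetical stand-ins for a real model.

```python
import numpy as np

def grad_report(named_grads):
    """Flag parameters whose gradient is non-finite or identically zero."""
    report = {}
    for name, g in named_grads.items():
        g = np.asarray(g, dtype=float)
        if not np.all(np.isfinite(g)):
            report[name] = "non-finite"
        elif not np.any(g):
            report[name] = "zero"  # possibly disconnected from the loss
        else:
            report[name] = f"ok, norm={np.linalg.norm(g):.3e}"
    return report

def lr_sweep(train_fn, lrs):
    """Run one short training job per learning rate; return {lr: final_loss}."""
    return {float(lr): float(train_fn(lr)) for lr in lrs}

# Toy stand-in for a real training run: gradient descent on least squares.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 4))
y = x @ np.array([1.0, -2.0, 0.5, 3.0])

def train_toy(lr, steps=200):
    w = np.zeros(4)
    for _ in range(steps):
        grad = 2.0 * x.T @ (x @ w - y) / len(x)
        w -= lr * grad
    return np.mean((x @ w - y) ** 2)

initial_loss = np.mean(y ** 2)                          # loss at w = 0
results = lr_sweep(train_toy, np.logspace(-5, -1, 5))   # 1e-5 ... 1e-1
for lr, final in results.items():
    print(f"lr={lr:.0e}  final loss={final:.4g}  (initial {initial_loss:.4g})")
```

If no learning rate in the sweep brings the loss below its initial value, the problem is probably not the learning rate.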

Phase 4: Checkpoint Questions

At each stage, you should be able to answer:

Understanding:

Why does this component exist?
What would happen without it?
What alternatives were considered?

Implementation:

Why this specific implementation choice?
Where could numerical issues arise?
What's the computational complexity?

Debugging:

What would it look like if this were broken?
How would you test this in isolation?
What are the most likely bugs?

Output Format

For each implementation session, provide:

```markdown
## Today's Implementation Goal
[Specific component we're building]

## Prerequisites Check
- [ ] Previous components working
- [ ] Understand what we're building
- [ ] Know expected behavior

## Implementation

### Code
[Code blocks with extensive comments]

### Checkpoint Questions
1. [Question]
   <details><summary>Answer</summary>[Answer]</details>

2. [Question]
   <details><summary>Answer</summary>[Answer]</details>

### Verification Steps
- [ ] Test 1: [What to check]
- [ ] Test 2: [What to check]

### Common Bugs at This Stage
1. [Bug pattern]: [How to identify and fix]

## What's Next
[Preview of next component and how it connects]
```

Tips for Specific Paper Types

Transformer-based

Attention mask shapes are the #1 bug source
Verify positional encoding is applied correctly
Check layer norm placement (pre vs post)
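
Because mask bugs rarely raise errors, it helps to test the mask's effect directly. The following is a sketch of single-head scaled dot-product attention with a causal mask in NumPy; the mask convention (True means "may attend") and the function names are assumptions, and a given paper may use the opposite convention or an additive mask.

```python
import numpy as np

def causal_mask(seq_len):
    """Boolean (seq_len, seq_len) mask: True where query i may attend to key j <= i."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def attention(q, k, v, mask):
    """Scaled dot-product attention for inputs of shape (batch, seq, dim)."""
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)        # mask broadcasts over batch
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(2, 5, 8)) for _ in range(3))
mask = causal_mask(5)
out, weights = attention(q, k, v, mask)

assert weights.shape == (2, 5, 5)                 # (batch, query, key)
assert np.all(weights[:, ~mask] == 0.0)           # no weight on future keys
assert np.allclose(weights.sum(axis=-1), 1.0)     # each row is a distribution

# Behavioral test: changing the last token must not affect earlier outputs.
k2, v2 = k.copy(), v.copy()
k2[:, -1] += 1.0
v2[:, -1] += 1.0
out2, _ = attention(q, k2, v2, mask)
assert np.allclose(out[:, :-1], out2[:, :-1])
```

The behavioral test at the end is the more valuable one: it checks what the mask is supposed to guarantee rather than what it looks like.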

RL/Policy Gradient

Sign errors in policy gradient are silent killers
Advantage normalization matters
Verify discount factor handling
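
These three points can each be pinned down with a small test. The sketch below assumes a REINFORCE-style surrogate loss and a `dones` flag marking the last step of each episode; the function names are illustrative, and other algorithms (for example those using bootstrapped value targets) handle returns differently.

```python
import numpy as np

def discounted_returns(rewards, dones, gamma=0.99):
    """G_t = r_t + gamma * G_{t+1}, with the sum reset at episode ends."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        if dones[t]:
            running = 0.0  # do not leak returns across episode boundaries
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def normalize(advantages, eps=1e-8):
    """Zero-mean, unit-variance advantages within a batch."""
    return (advantages - advantages.mean()) / (advantages.std() + eps)

def pg_loss(log_probs, advantages):
    """Surrogate to MINIMIZE; the minus sign turns ascent on return into descent."""
    return -np.mean(log_probs * advantages)

# Two episodes of length 2, reward 1 per step, gamma = 0.5
returns = discounted_returns([1, 1, 1, 1], [False, True, False, True], gamma=0.5)
assert np.allclose(returns, [1.5, 1.0, 1.5, 1.0])

# Sign check: raising the log-prob of a positive-advantage action lowers the loss
adv = np.array([2.0])
assert pg_loss(np.array([-0.5]), adv) < pg_loss(np.array([-1.0]), adv)
```

A flipped sign still produces a smoothly changing loss, which is why an explicit check like the last assertion is worth the two lines.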

Generative Models

KL term balancing is finicky
Check latent space distribution
Verify reconstructions look reasonable early, before committing to a full training run
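
For models with a Gaussian latent and a standard-normal prior (a VAE-style setup, which is an assumption here), the KL term has a closed form that is easy to unit test. The linear warm-up in `kl_weight` is one common way to balance the term, shown as a hypothetical choice rather than a recommendation from any specific paper.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(mu**2 + np.exp(logvar) - 1.0 - logvar, axis=-1)

def kl_weight(step, warmup_steps=1000, beta=1.0):
    """Hypothetical linear warm-up of the KL weight from 0 to beta."""
    return beta * min(1.0, step / warmup_steps)

def total_loss(recon_nll, mu, logvar, step):
    """Per-batch loss: reconstruction term plus weighted KL. Also returns mean KL."""
    kl = gaussian_kl(mu, logvar)
    return np.mean(recon_nll + kl_weight(step) * kl), np.mean(kl)

# Known values: the KL is zero at the prior, and 0.5 for a unit mean shift in one dim
assert np.allclose(gaussian_kl(np.zeros((2, 3)), np.zeros((2, 3))), 0.0)
assert np.isclose(gaussian_kl(np.array([1.0, 0.0]), np.zeros(2)), 0.5)
```

Logging the mean KL separately from the total loss, as `total_loss` returns it, makes it easier to spot a latent distribution that has collapsed onto the prior.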

Computer Vision

Normalization (ImageNet stats, batch norm) is crucial
Data augmentation can make or break results
Verify input preprocessing matches the paper exactly
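
A round-trip test is a cheap way to confirm preprocessing does what you think. This sketch uses the widely used ImageNet per-channel RGB statistics and assumes HWC `uint8` input converted to CHW `float32`; the paper you are reproducing may use different statistics, channel order, or value range, so treat these constants as placeholders to verify.

```python
import numpy as np

# Commonly used ImageNet RGB statistics for inputs scaled to [0, 1]
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image_uint8):
    """HWC uint8 RGB in [0, 255] -> CHW float32, normalized per channel."""
    assert image_uint8.dtype == np.uint8 and image_uint8.shape[-1] == 3
    x = image_uint8.astype(np.float32) / 255.0
    x = (x - IMAGENET_MEAN) / IMAGENET_STD
    return np.transpose(x, (2, 0, 1))

def unprocess(x):
    """Inverse of preprocess, used to eyeball or round-trip test inputs."""
    x = np.transpose(x, (1, 2, 0)) * IMAGENET_STD + IMAGENET_MEAN
    return np.clip(np.rint(x * 255.0), 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
x = preprocess(image)

assert x.shape == (3, 32, 32) and x.dtype == np.float32
assert np.array_equal(unprocess(x), image)   # round trip recovers the pixels
```

Feeding `[0, 255]` values into a pipeline that expects `[0, 1]`, or swapping RGB for BGR, will not crash; it just costs accuracy, which is why an explicit check helps.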

Success Criteria

You're done when:

Numbers match - Within reasonable run-to-run variance of the paper's results
Understanding is deep - You can explain every line of code
You found the gotchas - You know what breaks and why
You could modify it - Confident to try your own variations

Anti-Patterns to Avoid

❌ Copying code you don't understand
❌ Skipping checkpoint questions
❌ Using pre-built components for the core algorithm
❌ Ignoring discrepancies with the paper
❌ Moving on before current step works

Maintainer

48Nauts-Operator Core maintainer

Source details

Full Name: 48Nauts-Operator/opencode-baseline
Branch: main
Path in repo: .opencode/skill/implement-paper-from-scratch
