Agent skill

vom-algorithms

Implements and extends the Visual Object Model (VOM) algorithms for terminal UI element detection in agent-tui. Use when: (1) Modifying cli/src/vom/ segmentation or classification code, (2) Adding new UI element roles or detection patterns, (3) Implementing incremental updates or performance optimizations, (4) Working with terminal screen buffers, cell styles, or coordinate systems, (5) Debugging element detection issues, (6) Extending the VOM pipeline architecture.

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/vom-algorithms

SKILL.md

VOM Algorithms

Core Concepts

The Visual Object Model treats a terminal as a 2D grid of styled cells and identifies UI elements through a two-stage pipeline:

ScreenBuffer → Segmentation → Clusters → Classification → Components
   (cells)        (RLE)      (regions)   (heuristics)    (UI elements)

Key data structures (see cli/src/vom/mod.rs):

ScreenBuffer: 2D grid of Cell (char + style)
Cluster: Style-homogeneous text region with bounds
Component: Classified UI element with role and hash
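As a rough illustration of the grid these structures are built on, here is a minimal sketch of a row-major ScreenBuffer. The field names (ch, bold, inverse, fg, bg) and the get/set methods are assumptions made for this example; the real definitions in cli/src/vom/mod.rs may differ.

```rust
// Hypothetical shapes for illustration only; see cli/src/vom/mod.rs
// for the actual definitions.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, Hash)]
pub struct CellStyle {
    pub bold: bool,
    pub inverse: bool,
    pub fg: Option<u8>, // assumed: indexed colour, None = terminal default
    pub bg: Option<u8>,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Cell {
    pub ch: char,
    pub style: CellStyle,
}

/// Row-major grid: cell (row, col) lives at index row * width + col.
pub struct ScreenBuffer {
    pub width: u16,
    pub height: u16,
    cells: Vec<Cell>,
}

impl ScreenBuffer {
    /// Creates a buffer filled with default-styled spaces.
    pub fn new(width: u16, height: u16) -> Self {
        let blank = Cell { ch: ' ', style: CellStyle::default() };
        let len = usize::from(width) * usize::from(height);
        Self { width, height, cells: vec![blank; len] }
    }

    /// Maps (row, col) to a flat index, or None when out of bounds.
    fn index(&self, row: u16, col: u16) -> Option<usize> {
        if row < self.height && col < self.width {
            Some(usize::from(row) * usize::from(self.width) + usize::from(col))
        } else {
            None
        }
    }

    pub fn get(&self, row: u16, col: u16) -> Option<Cell> {
        self.index(row, col).map(|i| self.cells[i])
    }

    /// Returns false (and writes nothing) when the position is off-screen.
    pub fn set(&mut self, row: u16, col: u16, cell: Cell) -> bool {
        match self.index(row, col) {
            Some(i) => {
                self.cells[i] = cell;
                true
            }
            None => false,
        }
    }
}
```

Bounds-checked accessors keep off-screen coordinates from panicking, which matters when cursor positions or click targets come from outside the buffer.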

Algorithm Selection Guide

Task	Reference
Modify segmentation logic	01-run-length-encoding.md
Add multi-row component detection	02-connected-component-labeling.md
Understand traversal order	03-raster-scan-traversal.md
Add/modify element role detection	04-heuristic-classification.md
Work with element positioning	05-bounding-box-computation.md
Debug terminal rendering	06-vt100-state-machine.md
Implement element tracking	07-content-hashing.md
Refactor tokenization	08-lexical-analysis.md
Add pattern matchers	09-pattern-matching.md
Handle wide/emoji chars	10-unicode-terminal-handling.md
Fix coordinate issues	11-grid-coordinate-systems.md
Optimize updates	12-incremental-updates.md
Understand full pipeline	13-vom-pipeline-architecture.md
Implement click targeting	14-hit-testing-click-targeting.md

Quick Implementation Patterns

Adding a New Role

1. Add a variant to the Role enum in cli/src/vom/mod.rs
2. Add a detection function in cli/src/vom/classifier.rs
3. Insert the check in priority order within infer_role()
4. Add tests

rust

// classifier.rs
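fn is_progress_bar(text: &str) -> bool {
    let bar_chars = ['█', '▓', '▒', '░', '─', '━'];
    // Compare against the char count, not text.len(): these glyphs are
    // 3 bytes each in UTF-8, so a byte length would skew the ratio.
    let total = text.chars().count();
    let count = text.chars().filter(|c| bar_chars.contains(c)).count();
    count > total / 2
}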

fn infer_role(cluster: &Cluster, cursor_row: u16, cursor_col: u16) -> Role {
    // ... existing checks ...
    if is_progress_bar(&cluster.text) {
        return Role::ProgressBar;
    }
    // ... rest of cascade ...
}

Modifying Segmentation

Read 01-run-length-encoding.md first. Key file: cli/src/vom/segmentation.rs

Current predicate: style equality. To change grouping logic:

rust

fn should_merge(current: &Cluster, cell: &Cell) -> bool {
    current.style == cell.style
    // Add additional conditions here
}
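To show where such a predicate sits, the following is a minimal sketch of run-length encoding over a single row: adjacent cells are appended to the current cluster while should_merge holds, and a new cluster starts otherwise. The type shapes, the segment_row name, and the choice to keep blank cells inside clusters are assumptions for illustration, not the code in cli/src/vom/segmentation.rs.

```rust
// Hypothetical, simplified types for illustration only.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct CellStyle {
    bold: bool,
    inverse: bool,
}

#[derive(Clone, Copy, Debug)]
struct Cell {
    ch: char,
    style: CellStyle,
}

#[derive(Debug, PartialEq)]
struct Cluster {
    text: String,
    style: CellStyle,
    row: u16,
    col: u16,   // column of the first cell in the run
    width: u16, // number of cells in the run
}

fn should_merge(current: &Cluster, cell: &Cell) -> bool {
    current.style == cell.style
}

/// Run-length encodes one row into style-homogeneous clusters.
/// Each cell is visited once, so a full screen costs O(W×H).
/// Assumes one cell per char; wide characters are not handled here.
fn segment_row(row: u16, cells: &[Cell]) -> Vec<Cluster> {
    let mut clusters: Vec<Cluster> = Vec::new();
    for (col, cell) in cells.iter().enumerate() {
        // Extend the current run when the predicate allows it.
        if let Some(current) = clusters.last_mut() {
            if should_merge(current, cell) {
                current.text.push(cell.ch);
                current.width += 1;
                continue;
            }
        }
        // Otherwise start a new run at this column.
        clusters.push(Cluster {
            text: cell.ch.to_string(),
            style: cell.style,
            row,
            col: col as u16,
            width: 1,
        });
    }
    clusters
}
```

Because the predicate is isolated in should_merge, extra grouping conditions can be added there without touching the scan loop.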

Implementing Element Tracking

Read 07-content-hashing.md and 12-incremental-updates.md.

rust

// Track elements across frames
let prev_hash = component.visual_hash;
// After re-segmentation, find by hash:
let same_element = new_components.iter().find(|c| c.visual_hash == prev_hash);
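For a slightly fuller picture, here is a sketch of hash-based matching across two frames. It assumes the hash covers role and text but not position, so an element that moves still matches; whether the real visual_hash is computed this way is not confirmed by this document. The visual_hash function, the component constructor, and match_by_hash are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical, simplified types for illustration only.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Role {
    Button,
    StaticText,
}

#[allow(dead_code)]
#[derive(Debug)]
struct Component {
    role: Role,
    text: String,
    row: u16,
    col: u16,
    visual_hash: u64,
}

/// Assumed hashing scheme: role and text only, position excluded.
fn visual_hash(role: Role, text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    role.hash(&mut hasher);
    text.hash(&mut hasher);
    hasher.finish()
}

fn component(role: Role, text: &str, row: u16, col: u16) -> Component {
    Component {
        role,
        text: text.to_string(),
        row,
        col,
        visual_hash: visual_hash(role, text),
    }
}

/// Pairs each previous component with the first component in the new
/// frame that has the same hash, or None if it disappeared.
fn match_by_hash<'a>(
    prev: &'a [Component],
    next: &'a [Component],
) -> Vec<(&'a Component, Option<&'a Component>)> {
    // Index the new frame once so each lookup is O(1) on average.
    let mut index: HashMap<u64, &Component> = HashMap::new();
    for c in next {
        index.entry(c.visual_hash).or_insert(c);
    }
    prev.iter()
        .map(|p| (p, index.get(&p.visual_hash).copied()))
        .collect()
}
```

Two visually identical elements (for example two "OK" buttons) share a hash under this scheme, so a real tracker would likely need a tie-breaker such as nearest position.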

Code Locations

Concept	File
Terminal emulation	`cli/src/terminal.rs`
Segmentation	`cli/src/vom/segmentation.rs`
Classification	`cli/src/vom/classifier.rs`
Data types	`cli/src/vom/mod.rs`
Snapshot command	`cli/src/handlers.rs`

Complexity Targets

Segmentation: O(W×H) single pass
Classification: O(clusters) with O(text_len) per cluster
Full snapshot: < 5ms for 80×24 terminal

Testing Patterns

rust

#[test]
fn test_new_element_detection() {
    let cluster = make_cluster("█████░░░░░", CellStyle::default(), 0, 0);
    let role = infer_role(&cluster, 99, 99);
    assert_eq!(role, Role::ProgressBar);
}

Always test:

Positive detection (element recognized)
Negative cases (similar but different elements)
Boundary conditions (edge of screen, empty text)
Style variations (bold, inverse, colored)
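The first three categories can be sketched for the progress-bar detector as follows. The block uses a local stand-in for is_progress_bar that compares against the character count; the test names and inputs are illustrative assumptions. Style variations are not covered here because they need a Cluster with a CellStyle rather than plain text.

```rust
/// Local stand-in for the detector: true when more than half of the
/// characters (not bytes) are bar glyphs.
fn is_progress_bar(text: &str) -> bool {
    let bar_chars = ['█', '▓', '▒', '░', '─', '━'];
    let total = text.chars().count();
    let count = text.chars().filter(|c| bar_chars.contains(c)).count();
    count > total / 2
}

#[test]
fn detects_full_bar() {
    // Positive detection: every character is a bar glyph.
    assert!(is_progress_bar("█████░░░░░"));
}

#[test]
fn rejects_plain_text() {
    // Negative cases: ordinary labels, and text with only a few glyphs.
    assert!(!is_progress_bar("Loading..."));
    assert!(!is_progress_bar("██ done loading"));
}

#[test]
fn handles_boundaries() {
    // Empty text must not be classified as a bar.
    assert!(!is_progress_bar(""));
    // Exactly half bar glyphs is not "more than half".
    assert!(!is_progress_bar("██ab"));
    // One glyph over half tips it.
    assert!(is_progress_bar("███a"));
}
```

Because the glyph list includes '─' and '━', a horizontal rule made only of those characters would also pass this check; a negative test for separators is worth adding if the classifier is meant to tell them apart.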

Maintainer

majiayu000 Core maintainer

Source details

Full Name: majiayu000/claude-skill-registry
Branch: main
Path in repo: skills/data/vom-algorithms
License: MIT License
