Agent skill

blog-cannibalization

Detect keyword cannibalization across blog posts by extracting primary keywords from titles and headings, clustering semantically similar targets, and flagging posts that compete for the same search intent. Supports a local-only mode (grep-based) and a DataForSEO API mode (Page Intersection endpoint at ~$0.01/call). Outputs a severity-scored report with a recommended action for each cluster (merge, differentiate, canonical, or no action). Use when the user says "cannibalization", "keyword overlap", "competing pages", "duplicate keywords", "cannibalize".

Install this agent skill to your Project

npx add-skill https://github.com/AgriciDaniel/claude-blog/tree/main/skills/blog-cannibalization

SKILL.md

Blog Cannibalization - Keyword Overlap Detection

Detect when multiple blog posts compete for the same search keywords. Two modes: local-only analysis (default) and DataForSEO API mode for SERP-level data.

Two Modes

| Mode | Flag | Cost | Data Source |
|------|------|------|-------------|
| Local | (default) | Free | File content analysis via Grep/Read |
| API | `--api` | ~$0.01/call | DataForSEO Page Intersection + Ranked Keywords |

Local mode works without any API keys. API mode requires DataForSEO credentials set as environment variables: DATAFORSEO_LOGIN and DATAFORSEO_PASSWORD.
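The mode selection can be sketched as a small function. This is a minimal illustration, not the skill's actual implementation: `resolve_mode` and its return shape are hypothetical names, and only the two environment variable names come from the text above. It falls back to local mode when credentials are missing, matching the behavior described under Error Handling.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_mode(api_flag: bool, env: Mapping[str, str] = os.environ) -> Tuple[str, Optional[str]]:
    """Return (mode, notice). notice is a message for the user, or None."""
    if not api_flag:
        return "local", None
    # Both variables must be set and non-empty for API mode.
    if env.get("DATAFORSEO_LOGIN") and env.get("DATAFORSEO_PASSWORD"):
        return "api", None
    return "local", "DataForSEO credentials not found; falling back to local mode."
```

Passing the environment as a parameter keeps the function easy to exercise without touching real credentials.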

Local Mode Workflow

Step 1: Scan Blog Files

Use Glob to find all content files in the target directory:

Patterns: **/*.md, **/*.mdx, **/*.html
Skip files in node_modules/, .git/, drafts/
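Outside an agent environment, the same scan could be approximated with the standard library. The sketch below is an assumption about how the rules translate to code: `is_blog_file` and `find_blog_files` are hypothetical helpers, and the skip rule is applied to any directory component below the scan root.

```python
from pathlib import Path, PurePosixPath

CONTENT_SUFFIXES = {".md", ".mdx", ".html"}
SKIPPED_DIRS = {"node_modules", ".git", "drafts"}


def is_blog_file(relative_path: str) -> bool:
    """True if the path has a content suffix and no skipped parent directory."""
    path = PurePosixPath(relative_path)
    if path.suffix.lower() not in CONTENT_SUFFIXES:
        return False
    # Only directory components are checked, never the file name itself.
    return not (SKIPPED_DIRS & set(path.parts[:-1]))


def find_blog_files(root: str) -> list[Path]:
    """Recursively collect content files under root, sorted by path."""
    base = Path(root)
    return sorted(
        p for p in base.rglob("*")
        if p.is_file() and is_blog_file(p.relative_to(base).as_posix())
    )
```

Paths are made relative to the root before filtering, so a scan root that happens to be named `drafts` is still scanned.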

Step 2: Extract Primary Keywords

For each file, read and extract keyword signals from:

Title tag or H1 heading (highest weight)
H2 headings (medium weight)
First paragraph (supporting signal)
Meta description if present in frontmatter

Primary keyword extraction method:

Tokenize title and H1 into 1-gram, 2-gram, and 3-gram phrases
Score each phrase by frequency across title + H2s + first paragraph
Select the top-scoring 2-3 word phrase as the primary keyword
Record secondary keywords from H2 headings
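The extraction method above can be illustrated with a short sketch. Several details are assumptions rather than part of the skill: the title and H1 are treated as one string, scoring is a plain occurrence count with no per-source weighting, 1-grams are skipped because the final pick is a 2-3 word phrase, and ties go to the longer phrase. The function names are hypothetical.

```python
import re
from collections import Counter
from typing import Optional


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens: list[str], n: int) -> list[str]:
    """Return all contiguous n-token phrases, in order of appearance."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def primary_keyword(title: str, h2s: list[str], first_paragraph: str) -> Optional[str]:
    # Candidates are the 2- and 3-word phrases found in the title.
    title_tokens = tokenize(title)
    candidates = ngrams(title_tokens, 2) + ngrams(title_tokens, 3)
    if not candidates:
        return None
    # Count phrase occurrences across the title, each H2, and the first paragraph.
    counts = Counter()
    for segment in [title, *h2s, first_paragraph]:
        tokens = tokenize(segment)
        for n in (2, 3):
            counts.update(ngrams(tokens, n))
    # Highest count wins; ties go to the longer phrase, then to the earlier candidate.
    return max(candidates, key=lambda p: (counts[p], len(p.split())))


keyword = primary_keyword(
    "Best CRM Software for Small Business",
    ["Why CRM software matters", "Choosing CRM software"],
    "CRM software helps teams track leads.",
)
print(keyword)  # crm software
```

In the example, "crm software" appears in all four segments while every other candidate appears once, so it is selected.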

Step 3: Cluster by Similarity

Group posts into clusters using these matching rules (in priority order):

Exact match - identical primary keyword across 2+ posts
Stem match - same root word (e.g., "optimize" vs "optimization")
Semantic overlap - Claude determines that two keywords target the same search intent (e.g., "best CRM software" vs "top CRM tools 2026")
Subset match - one keyword contains another (e.g., "email marketing" vs "email marketing for startups")
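The three mechanical rules can be sketched in code; semantic overlap is left out because it relies on model judgment rather than string comparison. The stemmer here is a deliberately crude suffix stripper written for illustration only (a real run might use a proper stemming library), and `crude_stem`, `contains_phrase`, and `match_type` are hypothetical names.

```python
from typing import Optional

# Longer suffixes are listed first so "ization" is tried before "ation".
SUFFIXES = ("ization", "ation", "izing", "izer", "ize", "ing", "ers", "er", "es", "s")


def crude_stem(word: str) -> str:
    """Strip one known suffix, keeping at least three leading characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def contains_phrase(long_tokens: list[str], short_tokens: list[str]) -> bool:
    """True if short_tokens occurs as a contiguous run inside a longer phrase."""
    n = len(short_tokens)
    return 0 < n < len(long_tokens) and any(
        long_tokens[i:i + n] == short_tokens for i in range(len(long_tokens) - n + 1)
    )


def match_type(a: str, b: str) -> Optional[str]:
    """Return "exact", "stem", "subset", or None if no mechanical rule matches."""
    ta, tb = a.lower().split(), b.lower().split()
    if ta == tb:
        return "exact"
    if [crude_stem(w) for w in ta] == [crude_stem(w) for w in tb]:
        return "stem"
    if contains_phrase(ta, tb) or contains_phrase(tb, ta):
        return "subset"
    return None  # candidates for the semantic-overlap check
```

Pairs that return `None`, such as "best CRM software" and "top CRM tools 2026", are the ones that would still need the semantic comparison.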

Step 4: Score and Flag

For each cluster with 2+ posts, assess severity and generate a recommendation.

Step 5: Output Report

Display the results table and per-cluster recommendations.

API Mode Workflow (DataForSEO)

Requires the --api flag. Uses WebFetch to call DataForSEO endpoints.

Endpoints Used

Page Intersection - find keywords where multiple URLs rank:

POST https://api.dataforseo.com/v3/dataforseo_labs/google/page_intersection/live
Authorization: Basic <base64(login:password)>

[
  {
    "pages": {
      "1": "https://example.com/post-a",
      "2": "https://example.com/post-b"
    },
    "language_code": "en",
    "location_code": 2840
  }
]

Cost: ~$0.01 per call. Returns overlapping keywords with position, volume, CPC.
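For readers calling the endpoint outside of WebFetch, the request above could be assembled with Python's standard library as follows. This is a sketch under assumptions: `build_page_intersection_request` is a hypothetical helper, the task object is wrapped in a JSON array on the assumption that the API expects a list of tasks, and the function only builds the request without sending it.

```python
import base64
import json
import urllib.request

PAGE_INTERSECTION_URL = (
    "https://api.dataforseo.com/v3/dataforseo_labs/google/page_intersection/live"
)


def build_page_intersection_request(
    login: str, password: str, url_a: str, url_b: str
) -> urllib.request.Request:
    """Build (but do not send) a Page Intersection request for two URLs."""
    # HTTP Basic auth: base64 of "login:password".
    token = base64.b64encode(f"{login}:{password}".encode("utf-8")).decode("ascii")
    payload = [{
        "pages": {"1": url_a, "2": url_b},
        "language_code": "en",
        "location_code": 2840,
    }]
    return urllib.request.Request(
        PAGE_INTERSECTION_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Basic {token}", "Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be passed to `urllib.request.urlopen`, which performs the billable call.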

Ranked Keywords - get all keywords a single URL ranks for:

POST https://api.dataforseo.com/v3/dataforseo_labs/google/ranked_keywords/live
Authorization: Basic <base64(login:password)>

[
  {
    "target": "https://example.com/post-a",
    "language_code": "en",
    "location_code": 2840
  }
]

API Analysis Steps

Collect all published URLs from the user (or sitemap)
Run Ranked Keywords for each URL to build keyword profiles
Run Page Intersection for URL pairs that share keyword clusters
Calculate severity using the formula below
Output enriched report with search volume and position data
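Because each Page Intersection call is billed, it helps to narrow the URL pairs before step 3. One way to do that, assuming the Ranked Keywords results have already been reduced to a set of keywords per URL, is sketched below. `candidate_pairs` and the `min_shared` threshold are hypothetical, not part of the skill.

```python
from itertools import combinations


def candidate_pairs(
    profiles: dict[str, set[str]], min_shared: int = 1
) -> list[tuple[str, str, list[str]]]:
    """Return (url_a, url_b, shared_keywords) for pairs worth an API call."""
    pairs = []
    # Sorting by URL makes the output order deterministic.
    for (url_a, kws_a), (url_b, kws_b) in combinations(sorted(profiles.items()), 2):
        shared = kws_a & kws_b
        if len(shared) >= min_shared:
            pairs.append((url_a, url_b, sorted(shared)))
    return pairs


profiles = {
    "/best-crm-tools": {"best crm", "crm tools"},
    "/top-crm-software": {"crm tools", "crm pricing"},
    "/react-hooks": {"react hooks"},
}
print(candidate_pairs(profiles))
# [('/best-crm-tools', '/top-crm-software', ['crm tools'])]
```

With three URLs there are three possible pairs, but only one shares a keyword, so only one Page Intersection call would be made.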

Severity Scoring

Four severity levels based on overlap signals:

| Level | Criteria | Action Urgency |
|-------|----------|----------------|
| Critical | Same exact keyword, both pages in top 20 | Immediate |
| High | Same keyword cluster, one page outranks the other | This week |
| Medium | Related keywords with partial SERP overlap | This month |
| Low | Semantic similarity but different confirmed intents | Monitor |

Severity Formula (API Mode)

severity_score = overlap_count × avg_search_volume × (1 / position_gap)

Where:

overlap_count = number of shared ranking keywords
avg_search_volume = mean monthly volume of shared keywords
position_gap = absolute difference between the two pages' average ranking positions, floored at 1 so the division is always defined

Higher score = more urgent cannibalization problem.
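As a worked example of the formula, the sketch below computes the score for one pair of pages. The function name and the sample numbers are illustrative assumptions, not data from a real report.

```python
def severity_score(
    overlap_count: int,
    avg_search_volume: float,
    avg_position_a: float,
    avg_position_b: float,
) -> float:
    """overlap_count x avg_search_volume x (1 / position_gap), with the gap floored at 1."""
    position_gap = max(1.0, abs(avg_position_a - avg_position_b))
    return overlap_count * avg_search_volume * (1 / position_gap)


# 3 shared keywords, 1,200 average monthly searches, average positions 8 and 12.
print(severity_score(3, 1200, 8, 12))  # 900.0
```

Two pages ranking close together score higher than two pages far apart, which reflects that near-equal rankings are the clearest sign that neither page is winning the query.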

Severity Heuristic (Local Mode)

Without SERP data, use a simplified scoring:

Critical: Exact primary keyword match between posts
High: Stem match on primary keyword, or 3+ shared H2 keywords
Medium: Semantic overlap on primary keyword
Low: Subset match only, or shared secondary keywords
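The heuristic maps directly onto a small lookup function. In this sketch, `local_severity` is a hypothetical name, the match labels ("exact", "stem", "semantic", "subset") are assumed to come from the clustering step, and shared secondary keywords are approximated by the count of shared H2 keywords.

```python
from typing import Optional


def local_severity(primary_match: Optional[str], shared_h2_keywords: int = 0) -> Optional[str]:
    """Map a primary-keyword match type and shared H2 count to a severity level."""
    if primary_match == "exact":
        return "Critical"
    if primary_match == "stem" or shared_h2_keywords >= 3:
        return "High"
    if primary_match == "semantic":
        return "Medium"
    if primary_match == "subset" or shared_h2_keywords > 0:
        return "Low"
    return None  # no overlap signal, so the pair is not flagged
```

The checks run from most to least severe, so a pair that qualifies for several levels is reported at the highest one.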

Output Format

Summary Table

| Post A | Post B | Shared Keywords | Severity | Recommendation |
|--------|--------|-----------------|----------|----------------|
| /best-crm-tools | /top-crm-software | best crm, crm tools, crm software | Critical | MERGE |
| /email-tips | /email-marketing-guide | email marketing | High | DIFFERENTIATE |
| /seo-basics | /seo-for-beginners | seo basics, beginner seo | Critical | CANONICAL |
| /react-hooks | /react-state-mgmt | react, state | Low | NO ACTION |
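A table in this shape could be rendered from structured results with a few lines of code. The row dictionary keys and the `render_summary` name are hypothetical, and sorting rows by severity is an assumption about presentation rather than a stated requirement.

```python
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def render_summary(rows: list[dict]) -> str:
    """Render cluster rows as a Markdown table, most severe first."""
    lines = [
        "| Post A | Post B | Shared Keywords | Severity | Recommendation |",
        "|--------|--------|-----------------|----------|----------------|",
    ]
    # sorted() is stable, so rows with equal severity keep their input order.
    for row in sorted(rows, key=lambda r: SEVERITY_ORDER[r["severity"]]):
        shared = ", ".join(row["shared"])
        lines.append(
            f"| {row['post_a']} | {row['post_b']} | {shared} "
            f"| {row['severity']} | {row['recommendation']} |"
        )
    return "\n".join(lines)
```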

Per-Cluster Detail

For each flagged cluster, provide:

Both post titles and URLs
Full list of overlapping keywords (with volume if API mode)
Which post is stronger (more comprehensive, better structured)
Specific recommendation with rationale

Recommendations

Four possible actions for each cannibalization cluster:

MERGE

When both pages are thin or cover the same intent with similar depth.

Combine the best content from both into one comprehensive post
301 redirect the weaker URL to the merged post
Update internal links that pointed to either URL so they target the merged post

DIFFERENTIATE

When pages serve different intents but keyword targeting overlaps.

Shift the primary keyword of the weaker post to a related long-tail
Update the title, H1, and meta description to reflect the new focus
Add internal links between the two posts to signal distinct topics

CANONICAL

When one post is clearly the authority and the other is a lesser duplicate.

Add rel="canonical" on the weaker page pointing to the authority
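Alternatively, consider noindexing the weaker page if it adds no unique value; avoid combining noindex with rel="canonical" on the same page, since the two send conflicting signals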
Link from the weaker page to the authority page

NO ACTION

When intent is genuinely different despite surface-level keyword similarity.

Document the reasoning for future audits
Monitor rankings quarterly for any position changes
Re-evaluate if either post drops in rankings

Error Handling

No blog files found: If the directory contains no .md, .mdx, or .html files, report "No blog files found in [directory]" and suggest checking the path
DataForSEO credentials missing: In API mode, if credentials are not configured, fall back to local mode automatically and notify the user
API rate limits: DataForSEO has per-minute rate limits. If a 429 response is received, wait and retry once. If it persists, switch to local mode for remaining URLs
WebFetch failures: If a source URL is unreachable, skip it and note "Unable to verify - source unavailable" in the report
Single-post directory: If only one blog post exists, report "Cannibalization analysis requires at least 2 posts" and exit gracefully
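The rate-limit rule (retry once on a 429, then fall back) can be sketched independently of any HTTP client. Here `send` is a hypothetical callable returning a `(status, body)` tuple, and the 5-second wait is an assumed default since the text does not specify a duration.

```python
import time
from typing import Any, Callable, Optional, Tuple


def call_with_single_retry(
    send: Callable[[], Tuple[int, Any]],
    wait_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Any]:
    """Call send(); on HTTP 429 wait and retry once. Return None if still limited."""
    status, body = send()
    if status == 429:
        sleep(wait_seconds)
        status, body = send()
    if status == 429:
        return None  # caller switches to local mode for the remaining URLs
    return body
```

Injecting `sleep` as a parameter lets the retry path be exercised without actually waiting.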

Maintainer

AgriciDaniel Core maintainer

Source details

Full Name: AgriciDaniel/claude-blog
Branch: main
Path in repo: skills/blog-cannibalization
License: MIT License
Topics: ai claude-code claude-code-skill open-source seo content-creation ai-content
