Agent skill
data-extraction-patterns
Common patterns for extracting analytics data from GA4 and GSC with API handling
Install this agent skill to your Project
npx add-skill https://github.com/MadAppGang/claude-code/tree/main/plugins/seo/skills/data-extraction-patterns
SKILL.md
plugin: seo updated: 2026-01-20
Data Extraction Patterns
When to Use
- Setting up analytics data pipelines
- Combining data from multiple sources
- Handling API rate limits and errors
- Caching frequently accessed data
- Building data collection workflows
API Reference
Google Analytics 4 (GA4)
MCP Server: mcp-server-google-analytics
Key Operations:
get_report({
propertyId: "properties/123456789",
dateRange: { startDate: "30daysAgo", endDate: "today" },
dimensions: ["pagePath", "date"],
metrics: ["screenPageViews", "averageSessionDuration", "bounceRate"]
})
Useful Metrics:
| Metric | Description | Use Case |
|---|---|---|
| screenPageViews | Total page views | Traffic volume |
| sessions | User sessions | Visitor count |
| averageSessionDuration | Avg time in session | Engagement |
| bounceRate | Single-page visits | Content quality |
| engagementRate | Engaged sessions % | True engagement |
| scrolledUsers | Users who scrolled | Content consumption |
Useful Dimensions:
| Dimension | Description |
|---|---|
| pagePath | URL path |
| date | Date (for trending) |
| sessionSource | Traffic source |
| deviceCategory | Desktop/mobile/tablet |
Google Search Console (GSC)
MCP Server: mcp-server-gsc
Key Operations:
search_analytics({
siteUrl: "https://example.com",
startDate: "2025-11-27",
endDate: "2025-12-27",
dimensions: ["query", "page"],
rowLimit: 1000
})
get_url_inspection({
siteUrl: "https://example.com",
inspectionUrl: "https://example.com/page"
})
Available Metrics:
| Metric | Description | Use Case |
|---|---|---|
| clicks | Total clicks from search | Traffic from Google |
| impressions | Times shown in results | Visibility |
| ctr | Click-through rate | Snippet effectiveness |
| position | Average ranking | SEO success |
Dimensions:
| Dimension | Description |
|---|---|
| query | Search query |
| page | Landing page URL |
| country | User country |
| device | Desktop/mobile/tablet |
| date | Date (for trending) |
Parallel Execution Pattern
Optimal Data Fetch (All Sources)
## Parallel Data Fetch Pattern
When fetching from multiple sources, issue all requests in a SINGLE message
for parallel execution:
┌─────────────────────────────────────────────────────────────────┐
│ MESSAGE 1: Parallel Data Requests │
├─────────────────────────────────────────────────────────────────┤
│ │
│ [MCP Call 1]: google-analytics.get_report(...) │
│ [MCP Call 2]: google-search-console.search_analytics(...) │
│ │
│ → All execute simultaneously │
│ → Results return when all complete │
│ → ~2x faster than sequential │
│ │
└─────────────────────────────────────────────────────────────────┘
Sequential (When Needed)
Some operations require sequential execution:
## Sequential Pattern (Dependencies)
When one request depends on another's result:
┌─────────────────────────────────────────────────────────────────┐
│ MESSAGE 1: Get list of pages │
│ → Returns: ["/page1", "/page2", "/page3"] │
├─────────────────────────────────────────────────────────────────┤
│ MESSAGE 2: Get details for each page │
│ → Uses page list from Message 1 │
│ → Can parallelize within this message │
└─────────────────────────────────────────────────────────────────┘
Rate Limiting
API Rate Limits
| API | Limit | Strategy |
|---|---|---|
| GA4 | 10 QPS per property | Batch dimensions |
| GSC | 1,200 requests/min | Paginate large exports |
Retry Pattern
#!/bin/bash
# Retry with exponential backoff
MAX_RETRIES=3
RETRY_DELAY=5
fetch_with_retry() {
local url="$1"
local attempt=1
while [ $attempt -le $MAX_RETRIES ]; do
response=$(curl -s -w "%{http_code}" -o /tmp/response.json "$url")
http_code="${response: -3}"
if [ "$http_code" = "200" ]; then
cat /tmp/response.json
return 0
elif [ "$http_code" = "429" ]; then
echo "Rate limited, waiting ${RETRY_DELAY}s..." >&2
sleep $RETRY_DELAY
RETRY_DELAY=$((RETRY_DELAY * 2))
else
echo "Error: HTTP $http_code" >&2
return 1
fi
attempt=$((attempt + 1))
done
echo "Max retries exceeded" >&2
return 1
}
Caching Pattern
Session-Based Cache
# Cache structure
SESSION_PATH="/tmp/seo-performance-20251227-143000-example"
CACHE_DIR="${SESSION_PATH}/cache"
CACHE_TTL=3600 # 1 hour in seconds
mkdir -p "$CACHE_DIR"
# Cache key generation
cache_key() {
echo "$1" | md5sum | cut -d' ' -f1
}
# Check cache
get_cached() {
local key=$(cache_key "$1")
local cache_file="${CACHE_DIR}/${key}.json"
if [ -f "$cache_file" ]; then
local age=$(($(date +%s) - $(stat -f%m "$cache_file" 2>/dev/null || stat -c%Y "$cache_file")))
if [ $age -lt $CACHE_TTL ]; then
cat "$cache_file"
return 0
fi
fi
return 1
}
# Save to cache
save_cache() {
local key=$(cache_key "$1")
local cache_file="${CACHE_DIR}/${key}.json"
cat > "$cache_file"
}
# Usage
CACHE_KEY="ga4_${URL}_${DATE_RANGE}"
if ! RESULT=$(get_cached "$CACHE_KEY"); then
RESULT=$(fetch_from_api)
echo "$RESULT" | save_cache "$CACHE_KEY"
fi
Date Range Standardization
Common Date Ranges
# Standard date range calculations
TODAY=$(date +%Y-%m-%d)
case "$RANGE" in
"7d")
START_DATE=$(date -v-7d +%Y-%m-%d 2>/dev/null || date -d "7 days ago" +%Y-%m-%d)
;;
"30d")
START_DATE=$(date -v-30d +%Y-%m-%d 2>/dev/null || date -d "30 days ago" +%Y-%m-%d)
;;
"90d")
START_DATE=$(date -v-90d +%Y-%m-%d 2>/dev/null || date -d "90 days ago" +%Y-%m-%d)
;;
"mtd")
START_DATE=$(date +%Y-%m-01)
;;
"ytd")
START_DATE=$(date +%Y-01-01)
;;
esac
END_DATE="$TODAY"
API-Specific Formats
| API | Format | Example |
|---|---|---|
| GA4 | Relative or ISO | "30daysAgo", "2025-12-01" |
| GSC | ISO 8601 | "2025-12-01" |
Graceful Degradation
Data Source Fallback
## Fallback Strategy
When a data source is unavailable:
┌─────────────────────────────────────────────────────────────────┐
│ PRIMARY SOURCE │ FALLBACK │ LAST RESORT │
├──────────────────────┼─────────────────────┼────────────────────┤
│ GA4 traffic data │ GSC clicks │ Estimate from GSC │
│ GSC search perf │ Manual SERP check │ WebSearch SERP │
│ CWV (CrUX) │ PageSpeed API │ Lighthouse CLI │
└──────────────────────┴─────────────────────┴────────────────────┘
Partial Data Output
## Analysis Report (Partial Data)
### Data Availability
| Source | Status | Impact |
|--------|--------|--------|
| GA4 | NOT CONFIGURED | Missing engagement metrics |
| GSC | AVAILABLE | Full search data |
### Analysis Notes
This analysis is based on limited data sources:
- Search performance metrics are complete (GSC)
- Engagement metrics unavailable (no GA4)
**Recommendation**: Configure GA4 for complete analysis.
Run `/setup-analytics` to add Google Analytics.
Unified Data Model
Combined Output Structure
{
"metadata": {
"url": "https://example.com/page",
"fetchedAt": "2025-12-27T14:30:00Z",
"dateRange": {
"start": "2025-11-27",
"end": "2025-12-27"
}
},
"sources": {
"ga4": {
"available": true,
"metrics": {
"pageViews": 2450,
"avgTimeOnPage": 222,
"bounceRate": 38.2,
"engagementRate": 64.5
}
},
"gsc": {
"available": true,
"metrics": {
"impressions": 15200,
"clicks": 428,
"ctr": 2.82,
"avgPosition": 4.2
},
"topQueries": [
{"query": "seo guide", "clicks": 156, "position": 4}
]
}
},
"computed": {
"healthScore": 72,
"status": "GOOD"
}
}
Error Handling
Common Errors
| Error | Cause | Resolution |
|---|---|---|
| 401 Unauthorized | Invalid/expired credentials | Re-run /setup-analytics |
| 403 Forbidden | Missing permissions | Check API access in console |
| 429 Too Many Requests | Rate limit | Wait and retry with backoff |
| 404 Not Found | Invalid property/site | Verify IDs in configuration |
| 500 Server Error | API issue | Retry later, check status page |
Error Output Pattern
## Data Fetch Error
**Source**: Google Analytics 4
**Error**: 403 Forbidden
**Message**: "User does not have sufficient permissions for this property"
### Troubleshooting Steps
1. Verify Service Account email in GA4 Admin
2. Ensure "Viewer" role is granted
3. Check Analytics Data API is enabled
4. Wait 5 minutes for permission propagation
### Workaround
Proceeding with available data sources (GSC).
GA4 engagement metrics will not be included in this analysis.
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
test-skill
A test skill for validation testing. Use when testing skill parsing and validation logic.
bad-skill
claudish-usage
CRITICAL - Guide for using Claudish CLI ONLY through sub-agents to run Claude Code with OpenRouter models (Grok, GPT-5, Gemini, MiniMax). NEVER run Claudish directly in main context unless user explicitly requests it. Use when user mentions external AI models, Claudish, OpenRouter, or alternative models. Includes mandatory sub-agent delegation patterns, agent selection guide, file-based instructions, and strict rules to prevent context window pollution.
release
Plugin release process for MAG Claude Plugins marketplace. Covers version bumping, marketplace.json updates, git tagging, and common mistakes. Use when releasing new plugin versions or troubleshooting update issues.
claudish-integration
openrouter-trending-models
Fetch trending programming models from OpenRouter rankings. Use when selecting models for multi-model review, updating model recommendations, or researching current AI coding trends. Provides model IDs, context windows, pricing, and usage statistics from the most recent week.
Didn't find tool you were looking for?