Agent skill
policyengine-api
PolicyEngine API - Flask REST service powering policyengine.org and programmatic access
Install this agent skill to your Project
npx add-skill https://github.com/PolicyEngine/policyengine-claude/tree/main/skills/tools-and-apis/policyengine-api-skill
SKILL.md
PolicyEngine API
The PolicyEngine API is a Flask-based REST service that provides tax and benefit calculations for the web app and programmatic users.
For Users
What is the API?
When you use policyengine.org, the API processes your calculations on our servers.
API base: https://api.policyengine.org
What it does:
- Runs tax and benefit calculations
- Stores and retrieves policy reforms
- Computes population-wide impacts
- Serves parameter and variable metadata
Public Access
The API is publicly accessible with rate limits:
- Unauthenticated: 100 requests/minute
- Authenticated: 1,000 requests/minute
Try it:
curl https://api.policyengine.org/us/policy/2
API Documentation
OpenAPI spec: https://api.policyengine.org/docs
Interactive docs: Swagger UI at API docs endpoint
For Analysts
Using the API
Option 1: Python client (recommended)
# Use the policyengine package
# See policyengine-python-client-skill
Option 2: Direct API calls
import requests
# Calculate household impact
response = requests.post(
"https://api.policyengine.org/us/calculate",
json={
"household": household_situation,
"policy_id": None # or reform_id
}
)
result = response.json()
Key Endpoints
Household calculations:
POST /us/calculate
POST /uk/calculate
Policy management:
GET /us/policy/{policy_id}
POST /us/policy
Economy impacts:
GET /us/economy/{policy_id}/over/{baseline_policy_id}
Metadata:
GET /us/parameters
GET /us/variables
GET /us/parameter/{parameter_name}
GET /us/variable/{variable_name}
Rate Limits and Performance
Rate limits:
- 100 req/min (unauthenticated)
- 1,000 req/min (authenticated - contact team)
Response times:
- Household calculation: ~200-500ms
- Population impact: ~5-30 seconds
- Cached results: <100ms
Optimization:
- Use the same policy_id for multiple requests (caching)
- Batch calculations when possible
- Use webhooks for long-running jobs (population impacts)
For Contributors
Repository
Location: PolicyEngine/policyengine-api
Clone:
git clone https://github.com/PolicyEngine/policyengine-api
cd policyengine-api
Current Architecture
To see current structure:
tree policyengine_api/
# Key directories:
ls policyengine_api/
# - endpoints/ - HTTP endpoint handlers
# - routes/ - Route registration
# - services/ - Business logic
# - compute_api/ - Calculation services
# - economy_api/ - Economy impact calculations
# - utils/ - Helpers (caching, validation)
# - data/ - Static data
Current Implementation Patterns
Reference endpoint (read this first):
cat policyengine_api/endpoints/economy.py
This demonstrates:
- Standard endpoint structure
- Request validation
- Caching pattern
- Error handling
- Response formatting
To find other endpoints:
ls policyengine_api/endpoints/
# - household.py
# - policy.py
# - economy.py
# - metadata.py
# - etc.
Standard Endpoint Pattern (Stable)
from flask import Blueprint, request, jsonify
from policyengine_api.utils import cache
blueprint = Blueprint("my_endpoint", __name__)
@blueprint.route("/us/calculate", methods=["POST"])
def calculate():
"""Standard pattern: validate, cache-check, compute, cache, return."""
try:
# 1. Get and validate input
data = request.json
if not data:
return jsonify({"error": "No data provided"}), 400
# 2. Generate cache key
cache_key = f"calc_{hash(str(data))}"
# 3. Check cache
cached = cache.get(cache_key)
if cached:
return jsonify(cached)
# 4. Compute
result = perform_calculation(data)
# 5. Cache result
cache.set(cache_key, result, expire=3600)
# 6. Return
return jsonify(result)
except Exception as e:
return jsonify({"error": str(e), "status": "error"}), 500
Current implementation details:
# See actual endpoint for current pattern
cat policyengine_api/endpoints/household.py
Caching Strategy
To see current caching implementation:
# Redis configuration
cat policyengine_api/utils/cache.py
# Find cache usage
grep -r "cache\." policyengine_api/endpoints/
Pattern:
- Redis for caching
- Cache keys based on inputs
- TTL varies by endpoint (1 hour to 1 day)
- Clear cache on parameter changes
Background Jobs
For long-running calculations (population impacts):
To see current implementation:
# RQ (Redis Queue) usage
grep -r "@job" policyengine_api/
# Job patterns
cat policyengine_api/economy_api/
Pattern:
- Use RQ for jobs > 5 seconds
- Return job_id immediately
- Poll for completion
- Cache results
Country Integration
How API loads country packages:
cat policyengine_api/country.py
Pattern:
- Dynamically imports country packages
- Routes by country code (/us/, /uk/)
- Manages multiple model versions
Service Layer
Business logic separated from endpoints:
ls policyengine_api/services/
Pattern:
# endpoints/household.py
from policyengine_api.services import household_service
@app.route("/us/calculate", methods=["POST"])
def calculate():
result = household_service.calculate(data)
return jsonify(result)
# services/household_service.py
def calculate(data):
# Business logic here
simulation = create_simulation(data)
return simulation.calculate(...)
Dataset Selection Pattern
When to delegate to policyengine.py:
The API should return None for dataset selection in most cases, allowing policyengine.py to choose the appropriate default dataset. This creates better separation of concerns.
Pattern:
# In economy_service.py _setup_data() method:
# ❌ DON'T: Explicitly specify datasets the API shouldn't control
if region == "ny":
return "gs://policyengine-us-data/some_dataset.h5"
# ✅ DO: Return None to let policyengine.py choose the default
if region in US_STATES:
return None # policyengine.py handles state-specific datasets
# ✅ DO: Only specify datasets for special cases the API needs to control
if region == "nyc":
return "gs://policyengine-us-data/pooled_3_year_cps_2023.h5" # NYC exception
Why this matters:
- Keeps dataset logic centralized in policyengine.py where it belongs
- API doesn't need to know about state-specific dataset paths
- Easier to update dataset selection without API changes
- Only special cases (like NYC) should be explicitly specified in the API
When to see this pattern:
- Look at
policyengine_api/services/economy_service.py - Look for
_setup_data()method - Related to microsimulation and state-level calculations
Testing
To see current test patterns:
ls tests/
cat tests/test_household.py
Run tests:
make test
# Specific test
pytest tests/test_economy.py -v
# With coverage
make test-coverage
Development Server
Start locally:
make debug
Test endpoint:
curl http://localhost:5000/us/policy/2
Deployment
To see deployment configuration:
# Google Cloud Platform
cat app.yaml # App Engine config
cat cloudbuild.yaml # Cloud Build config
# Environment variables
cat .env.example
Current deployment:
- Google App Engine
- Cloud SQL (PostgreSQL)
- Redis (caching)
- Cloud Build (CI/CD)
API Versions
To see versioning strategy:
grep -r "version" policyengine_api/
Current approach:
- API version in URLs (may add /v1/ prefix)
- Country package versions independent
- Breaking changes rare (backwards compatible)
Architecture Diagrams
Request Flow
User/App → API Gateway → Flask App → Country Package → Core Engine
↓
Redis Cache
↓
Background Job (if needed)
↓
PostgreSQL (storage)
Dependencies
policyengine-core
↓
policyengine-us, policyengine-uk, etc.
↓
policyengine-api (you are here)
↓
policyengine-app (consumes API)
To understand dependencies:
- See
policyengine-core-skillfor engine patterns - See
policyengine-us-skillfor country model usage - See
policyengine-app-skillfor how app calls API
Common Development Tasks
Task 1: Add New Endpoint
-
Study reference implementation:
bashcat policyengine_api/endpoints/economy.py -
Create new endpoint file:
python# policyengine_api/endpoints/my_endpoint.py # Follow the pattern from economy.py -
Register route:
bash# See route registration cat policyengine_api/routes/__init__.py -
Add tests:
bash# Follow test pattern cat tests/test_economy.py
Task 2: Modify Caching Behavior
See current caching:
cat policyengine_api/utils/cache.py
Common changes:
- Adjust TTL (time to live)
- Change cache key generation
- Add cache invalidation
Task 3: Update Country Package Version
To see how versions are managed:
# Requirements
cat requirements.txt | grep policyengine-
# Update and deploy
# See deployment docs in README
Security and Best Practices
Input Validation
Always validate:
- Country code (us, uk, ca)
- Policy ID format
- Household structure
- Parameter values
See validation examples:
grep -r "validate" policyengine_api/endpoints/
Error Handling
Standard error response:
return jsonify({
"error": "Error message",
"details": additional_context,
"status": "error"
}), status_code
See error patterns:
grep -A 5 "jsonify.*error" policyengine_api/endpoints/
Logging
To see logging configuration:
cat policyengine_api/gcp_logging.py
Pattern:
- Google Cloud Logging
- Log all errors
- Log slow queries (>1s)
- Don't log sensitive data
Related Skills
- policyengine-python-client-skill - Using the API
- policyengine-core-skill - Understanding the engine
- policyengine-us-skill - Country model integration
- policyengine-app-skill - How app consumes API
- policyengine-standards-skill - Code quality
- policyengine-writing-skill - API documentation style
Resources
Repository: https://github.com/PolicyEngine/policyengine-api Live API: https://api.policyengine.org Documentation: https://api.policyengine.org/docs Status: https://status.policyengine.org
Didn't find tool you were looking for?