taiga-api
Query the hosted Taiga API at taiga.ant.dev for job results, passrates, transcripts, and run evaluations. Use when user asks about Taiga jobs, problem scores, eval results, or needs to submit/check jobs.
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/taiga-api
Taiga API
Query the hosted Taiga evaluation platform API for job results, transcripts, and problem runs.
IMPORTANT: Use Python, Not Shell
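Always use Python for Taiga API requests. In shell, environment-variable expansion and pipes can strip parts of the cookie value.
Python helper to load the cookie:
def get_cookie():
    # Read TAIGA_IAP_COOKIE from the worktree's .env file
    with open('/home/atondwal/dmodel/ant/taiga-worktree/.env') as f:
        for line in f:
            if line.startswith('TAIGA_IAP_COOKIE='):
                # Keep everything after the first '=' and drop surrounding quotes
                return line.split('=', 1)[1].strip().strip('"')
    # Fail loudly instead of returning None when the variable is missing
    raise RuntimeError('TAIGA_IAP_COOKIE not found in .env')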
IMPORTANT: Always Use Opus 4.5
When submitting jobs, ALWAYS use claude-opus-4-5-20251101 as the model. Never use Sonnet or other models unless explicitly requested.
Authentication
The cookie is stored in ~/dmodel/ant/taiga-worktree/.env. It uses the __Host- prefix (session-only). If authentication fails, ask the user to refresh it from browser DevTools → Network → copy the Cookie header.
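To make the "ask the user to refresh" step explicit, a wrapper can turn an authentication failure into a clear message. This is a minimal sketch, not part of the skill itself: `call_with_auth_hint` and `AUTH_HINT` are hypothetical names, and treating HTTP 401/403 as the auth-failure signal is an assumption (the server may instead redirect to a login page).

```python
import urllib.error

# Hypothetical message; wording taken from the refresh instructions above.
AUTH_HINT = (
    "Taiga auth failed: refresh the cookie from browser "
    "DevTools -> Network -> copy Cookie header."
)

def call_with_auth_hint(fn, *args):
    """Call fn(*args); re-raise 401/403 responses as PermissionError.

    Assumption: an expired cookie surfaces as HTTP 401 or 403.
    Any other HTTP error is passed through unchanged.
    """
    try:
        return fn(*args)
    except urllib.error.HTTPError as err:
        if err.code in (401, 403):
            raise PermissionError(AUTH_HINT) from err
        raise
```

Any request function can be passed in, for example `call_with_auth_hint(taiga_get, "/jobs")` with the `taiga_get` helper from Making Requests.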
Making Requests
import urllib.request, json
def taiga_get(endpoint):
    cookie = get_cookie()  # see helper above
    req = urllib.request.Request(f"https://taiga.ant.dev/api{endpoint}")
    req.add_header('Cookie', cookie)
    # Decode the JSON response body
    return json.loads(urllib.request.urlopen(req).read())
# Example: get job problems
job_id = "YOUR_JOB_ID"  # from the job page URL ?id= param
data = taiga_get(f"/jobs/{job_id}/problems")
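Several endpoints in the reference below use POST rather than GET. A matching helper could look like the following sketch. `build_post` and `taiga_post` are hypothetical names, and whether a given endpoint expects a JSON body, an empty body, or returns JSON at all is an assumption to check against the API docs.

```python
import json
import urllib.request

API_BASE = "https://taiga.ant.dev/api"

def build_post(endpoint, cookie, payload=None):
    """Build (but do not send) a POST request for a Taiga endpoint."""
    headers = {"Cookie": cookie}
    data = None
    if payload is not None:
        # Only attach a JSON body when the caller supplies one
        data = json.dumps(payload).encode()
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{API_BASE}{endpoint}", data=data, headers=headers, method="POST"
    )

def taiga_post(endpoint, cookie, payload=None):
    """Send the POST and decode the response (assumes a JSON response body)."""
    with urllib.request.urlopen(build_post(endpoint, cookie, payload)) as resp:
        return json.loads(resp.read())
```

Passing the cookie in as an argument keeps the request construction testable without touching the `.env` file; in practice the value would come from `get_cookie()`.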
API Reference
Full docs at: https://taiga.ant.dev/api/docs
Jobs (Most Common)
| Endpoint | Method | Purpose |
|---|---|---|
| /jobs | GET | List all jobs |
| /jobs?environment_id={id} | GET | List jobs for environment |
| /jobs/{job_id} | GET | Get job details |
| /jobs/{job_id}/problems | GET | Get problem results (passrates, scores) |
| /jobs/{job_id}/problems/stream | GET | Stream problem results |
| /jobs/{job_id}/error-summary | GET | Get error summary |
| /jobs | POST | Create job with problems |
| /cancel-job/{job_id} | POST | Cancel running job |
| /resubmit-problem/{job_id}/{problem_id} | POST | Resubmit specific problem |
Transcripts
| Endpoint | Method | Purpose |
|---|---|---|
| /transcript/{problem_run_id} | GET | Get full transcript |
| /transcript/stream/{problem_run_id} | GET | Stream transcript |
Problem Runs
| Endpoint | Method | Purpose |
|---|---|---|
| /problem_runs/{problem_id} | GET | List runs for problem |
| /problem-runs/{id}/container-logs | GET | Get container logs |
| /problem-runs/{id}/mcp-server-logs | GET | Get MCP server logs |
| /problem-runs/{id}/download-output | GET | Download output directory |
Environments
| Endpoint | Method | Purpose |
|---|---|---|
| /environments | GET | List environments |
| /environments/{id} | GET | Get environment details |
| /environments?skip=0&limit=100 | GET | Paginated list |
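The `skip`/`limit` parameters suggest offset-style pagination. The sketch below walks every page under two assumptions that the table does not confirm: the endpoint returns a JSON list, and a page shorter than `limit` marks the end. `iter_pages` is a hypothetical helper, and `fetch` stands for any function that takes an endpoint string and returns decoded JSON, such as `taiga_get`.

```python
def iter_pages(fetch, path="/environments", limit=100):
    """Yield items from a skip/limit paginated endpoint.

    Assumes each response is a list and that a short page is the last one.
    """
    skip = 0
    while True:
        page = fetch(f"{path}?skip={skip}&limit={limit}")
        yield from page
        if len(page) < limit:
            break
        skip += limit
```

Usage would be along the lines of `envs = list(iter_pages(taiga_get))`.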
Problems
| Endpoint | Method | Purpose |
|---|---|---|
| /problems/{problem_id}/attempts | GET | Get problem attempts |
| /problems/versions/{version_id} | GET | Get problem version |
| /problems/versions/{version_id}/run | POST | Run problem version |
| /problem-crud | GET | List all problems |
| /problem-crud/stats/pass-rates | POST | Get pass rate stats |
Docker Images
| Endpoint | Method | Purpose |
|---|---|---|
| /docker-images | GET | List docker images |
| /docker-images/{id}/download | GET | Download image source |
Common Workflows
Get Passrates for a Job
job_id = "3c300cca-707a-4e92-ac71-5688165f9ae1"  # from URL ?id= param
data = taiga_get(f"/jobs/{job_id}/problems")
for r in data:
    print(f"{r['problem_id']}: {r['final_score']}")
Aggregate Passrates
from collections import defaultdict
job_id = "YOUR_JOB_ID"
data = taiga_get(f"/jobs/{job_id}/problems")
# Group final scores by problem
problems = defaultdict(list)
for r in data:
    problems[r['problem_id']].append(r['final_score'])
total_pass = total_runs = 0
for pid, scores in sorted(problems.items()):
    # A run counts as passed only when its final score is exactly 1.0
    passed = sum(1 for s in scores if s == 1.0)
    total = len(scores)
    total_pass += passed
    total_runs += total
    print(f"{pid}: {passed}/{total} ({100*passed/total:.0f}%)")
print(f"\nOverall: {total_pass}/{total_runs} ({100*total_pass/total_runs:.1f}%)")
Get Transcript
problem_run_id = "118ed21a-9864-4c8c-b34b-d92428f1c22a"
transcript = taiga_get(f"/transcript/{problem_run_id}")
List Jobs for Environment
env_id = "8e646c11-1461-44a4-9e8d-e3800a02ba07"
jobs = taiga_get(f"/jobs?environment_id={env_id}")
for j in jobs:
    print(f"{j['id']}: {j['status']}")
Check Job Status
job = taiga_get(f"/jobs/{job_id}")
print(f"Status: {job['status']}, Completed: {job.get('completed_count')}")
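For long-running jobs, the status check can be repeated until the job finishes. The following is a sketch only: `wait_for_job` is a hypothetical helper, and of the status names only `"completed"` appears in this document; `"failed"` and `"cancelled"` are guesses that should be replaced with whatever the API actually returns.

```python
import time

# "completed" is taken from the Job schema; the other names are assumptions.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}

def wait_for_job(fetch, job_id, interval_s=30.0, timeout_s=3600.0, sleep=time.sleep):
    """Poll /jobs/{job_id} until a terminal status or until the timeout passes.

    fetch is any callable mapping an endpoint string to decoded JSON.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch(f"/jobs/{job_id}")
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job.get('status')!r}")
        sleep(interval_s)
```

With the helper from Making Requests this would be called as `wait_for_job(taiga_get, job_id)`. The `sleep` parameter exists so the loop can be exercised without real delays.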
Create a Job
import urllib.request, json
with open('problems-metadata.json') as f:
    problems = json.load(f)
payload = {
    "name": "my-job-name",
    "problems_metadata": problems,
    "n_attempts_per_problem": 10,
    "api_model_name": "claude-opus-4-5-20251101"  # ALWAYS use Opus 4.5
}
cookie = get_cookie()
# Supplying data makes urllib send this request as a POST
req = urllib.request.Request(
    "https://taiga.ant.dev/api/jobs",
    data=json.dumps(payload).encode(),
    headers={"Cookie": cookie, "Content-Type": "application/json"}
)
resp = json.loads(urllib.request.urlopen(req).read())
print(f"Job ID: {resp.get('job_id')}")
Response Schemas
Problem Run
{
  "id": "118ed21a-...",
  "problem_id": "sort-unique",
  "attempt_number": 1,
  "final_score": 1.0,
  "status": "completed",
  "subscores": {"matched_solution": 1.0},
  "weights": {"matched_solution": 1.0},
  "execution_time_ms": 467000,
  "total_tokens": 34205
}
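The fields in this schema are enough to build a per-problem summary of score, runtime, and token use. The sketch below assumes the input is a list of objects shaped like the example above (as the passrate workflows suggest for `/jobs/{job_id}/problems`); `summarize_runs` is a hypothetical helper, and treating missing or null fields as zero is a choice made here, not documented API behavior.

```python
from collections import defaultdict

def summarize_runs(runs):
    """Per problem: run count, mean final_score, mean seconds, total tokens."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["problem_id"]].append(run)
    summary = {}
    for pid, rs in grouped.items():
        n = len(rs)
        summary[pid] = {
            "runs": n,
            # Missing or null values are counted as 0 (an assumption)
            "mean_score": sum(r.get("final_score") or 0.0 for r in rs) / n,
            "mean_seconds": sum(r.get("execution_time_ms") or 0 for r in rs) / n / 1000,
            "total_tokens": sum(r.get("total_tokens") or 0 for r in rs),
        }
    return summary
```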
Job
{
  "id": "3c300cca-...",
  "status": "completed",
  "environment_id": "8e646c11-...",
  "api_model_name": "claude-opus-4-5-20251101",
  "created_at": "2025-11-24T17:46:30Z"
}
URL Patterns
From Taiga web UI URLs:
- Job page: https://taiga.ant.dev/job?id={job_id}&environmentId={env_id}
- Transcripts: https://taiga.ant.dev/transcripts?id={job_id}&problemId={problem_id}&...
The id parameter in URLs is the job_id.
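When a user pastes a web UI link, the IDs can be pulled out of the query string with the standard library. This is a small sketch; `ids_from_ui_url` is a hypothetical helper, and only the `id`, `environmentId`, and `problemId` parameters named above are read.

```python
from urllib.parse import urlparse, parse_qs

def ids_from_ui_url(url):
    """Extract job_id, env_id, and problem_id from a Taiga web UI URL.

    Parameters that are absent from the URL come back as None.
    """
    query = parse_qs(urlparse(url).query)

    def first(key):
        return query.get(key, [None])[0]

    return {
        "job_id": first("id"),
        "env_id": first("environmentId"),
        "problem_id": first("problemId"),
    }
```

The returned `job_id` can then be passed to a call such as `taiga_get(f"/jobs/{job_id}/problems")`.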
Tips
- Use Python with urllib.request - avoid shell due to env var bugs
- Cookie expires periodically - refresh from browser if auth fails
- /jobs/{id}/problems is the main endpoint for checking pass rates
- For streaming large responses, use the /stream variants