Agent skill

z-ai-api

Z.ai API integration for building applications with GLM models. Use when working with Z.ai/ZhipuAI APIs for: (1) Chat completions with GLM-4.7/4.6/4.5 models, (2) Vision/multimodal tasks with GLM-4.6V, (3) Image generation with GLM-Image or CogView-4, (4) Video generation with CogVideoX-3 or Vidu models, (5) Audio transcription with GLM-ASR-2512, (6) Function calling and tool use, (7) Web search integration, (8) Translation, slide/poster generation agents. Triggers: Z.ai, ZhipuAI, GLM, BigModel, Zhipu, CogVideoX, CogView, Vidu.

Install this agent skill to your Project

npx add-skill https://github.com/jrajasekera/claude-skills/tree/main/skills/z-ai-api

SKILL.md

Z.ai API Skill

Quick Reference

Base URL: https://api.z.ai/api/paas/v4
Coding Plan URL: https://api.z.ai/api/coding/paas/v4
Auth: Authorization: Bearer YOUR_API_KEY

Core Endpoints

Endpoint	Purpose
`/chat/completions`	Text/vision chat
`/images/generations`	Image generation
`/videos/generations`	Video generation (async)
`/audio/transcriptions`	Speech-to-text
`/web_search`	Web search
`/async-result/{id}`	Poll async tasks
`/v1/agents`	Translation, slides, effects
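As a rough sketch of how the base URL, auth header, and endpoint paths above fit together, the following builds (but does not send) a chat request using only the Python standard library. The helper name `build_request` is hypothetical, and the payload shape assumes the OpenAI-style request body used in the patterns later in this skill.

```python
import json
import urllib.request

BASE_URL = "https://api.z.ai/api/paas/v4"  # from Quick Reference


def build_request(endpoint: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build a POST request for a Z.ai endpoint (hypothetical helper, not part of any SDK)."""
    url = BASE_URL.rstrip("/") + "/" + endpoint.lstrip("/")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "/chat/completions",
    {"model": "glm-4.7", "messages": [{"role": "user", "content": "Hello!"}]},
    api_key="YOUR_API_KEY",
)
# To actually send it: urllib.request.urlopen(req) (needs a valid key and network access)
```

The same helper should work for the other endpoints in the table by changing the path and payload, though the payload fields for each endpoint are not shown here and should be taken from the reference files.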

Model Selection

Chat (pick by need):

glm-4.7 — Latest flagship, best quality, agentic coding
glm-4.7-flash — Fast, high quality
glm-4.6 — Reliable general use
glm-4.5-flash — Fastest, lower cost

Vision:

glm-4.6v — Best multimodal (images, video, files)
glm-4.6v-flash — Fast vision

Media:

glm-image — High-quality images (HD, ~20s)
cogview-4-250304 — Fast images (~5-10s)
cogvideox-3 — Video, up to 4K, 5-10s
viduq1-text/image — Vidu video generation

Implementation Patterns

Basic Chat

python

from zai import ZaiClient

client = ZaiClient(api_key="YOUR_KEY")

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.choices[0].message.content)

OpenAI SDK Compatibility

python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_ZAI_KEY",
    base_url="https://api.z.ai/api/paas/v4/"
)
# Use exactly like OpenAI SDK

Streaming

python

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[...],
    stream=True
)
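for chunk in response:
    # Some chunks (for example the final one) may carry no text, so guard before printing
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")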
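When the full text is needed rather than incremental printing, the chunks can be accumulated. The sketch below assumes each chunk has the OpenAI-style shape used above (`choices[0].delta.content`); the helper `collect_stream_text` and the fake chunks are illustrative only and stand in for a real streamed response.

```python
from types import SimpleNamespace


def collect_stream_text(chunks) -> str:
    """Concatenate delta.content across chunks, skipping empty or choice-less ones."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue
        content = getattr(chunk.choices[0].delta, "content", None)
        if content:
            parts.append(content)
    return "".join(parts)


def _fake_chunk(content):
    """Build a stand-in chunk object for demonstration purposes."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=content))])


demo_chunks = [
    _fake_chunk("Hel"),
    _fake_chunk(None),            # chunk with no text
    _fake_chunk("lo"),
    SimpleNamespace(choices=[]),  # chunk with no choices
]
text = collect_stream_text(demo_chunks)
```

With a real client, `collect_stream_text(response)` would take the streamed `response` object in place of `demo_chunks`.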

Function Calling

python

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"}
            },
            "required": ["city"]
        }
    }
}]

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto"
)

# Handle tool_calls in response.choices[0].message.tool_calls
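One way to handle the returned `tool_calls` is to look up each requested function in a local registry, run it, and build `role: "tool"` messages to append to the conversation. This is a minimal sketch that assumes the OpenAI-compatible tool-call shape (`id`, `function.name`, `function.arguments` as a JSON string); `TOOL_REGISTRY`, `run_tool_calls`, and the stub `get_weather` are hypothetical names, not part of the Z.ai SDK.

```python
import json
from types import SimpleNamespace


def get_weather(city: str) -> str:
    """Stand-in implementation; a real one would query a weather service."""
    return f"Sunny in {city}"


# Maps tool names declared in `tools` to local Python callables
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls) -> list:
    """Execute each requested tool and return the tool-result messages."""
    messages = []
    for call in tool_calls or []:
        func = TOOL_REGISTRY.get(call.function.name)
        if func is None:
            result = f"Unknown tool: {call.function.name}"
        else:
            args = json.loads(call.function.arguments or "{}")
            result = func(**args)
        messages.append(
            {"role": "tool", "tool_call_id": call.id, "content": str(result)}
        )
    return messages


# Fake tool call standing in for response.choices[0].message.tool_calls[0]
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"city": "Tokyo"}'),
)
tool_messages = run_tool_calls([fake_call])
```

The resulting messages would typically be appended after the assistant message that requested the tools, followed by a second `chat.completions.create` call so the model can produce its final answer.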

Vision (Images/Video/Files)

python

response = client.chat.completions.create(
    model="glm-4.6v",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://..."}},
            {"type": "text", "text": "Describe this image"}
        ]
    }]
)

Image Generation

python

response = client.images.generate(
    model="glm-image",
    prompt="A serene mountain at sunset",
    size="1280x1280",
    quality="hd"
)
print(response.data[0].url)  # Expires in 30 days

Video Generation (Async)

python

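import time

# Submit
response = client.videos.generate(
    model="cogvideox-3",
    prompt="A cat playing with yarn",
    size="1920x1080",
    duration=5
)
task_id = response.id

# Poll for result, stopping on failure instead of looping forever
while True:
    result = client.async_result.get(task_id)
    if result.task_status == "SUCCESS":
        print(result.video_result[0].url)
        break
    if result.task_status == "FAIL":  # assumed failure status; confirm in references/media-generation.md
        raise RuntimeError(f"Video task {task_id} failed")
    time.sleep(5)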

Web Search Integration

python

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Latest AI news?"}],
    tools=[{
        "type": "web_search",
        "web_search": {
            "enable": True,
            "search_result": True
        }
    }]
)
# Access response.web_search for sources

Thinking Mode (Chain-of-Thought)

python

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[...],
    thinking={"type": "enabled"},
    stream=True  # Recommended with thinking
)
# Access reasoning_content in response
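When streaming with thinking enabled, it can help to keep the reasoning text separate from the final answer. The sketch below assumes that streamed deltas carry `reasoning_content` alongside `content`; the source only says that `reasoning_content` is available in the response, so the per-chunk placement is an assumption. `split_reasoning` and the fake deltas are illustrative names.

```python
from types import SimpleNamespace


def split_reasoning(chunks):
    """Return (reasoning_text, answer_text) accumulated from streamed chunks."""
    reasoning, answer = [], []
    for chunk in chunks:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # Assumed field placement: reasoning_content and content both live on delta
        thought = getattr(delta, "reasoning_content", None)
        text = getattr(delta, "content", None)
        if thought:
            reasoning.append(thought)
        if text:
            answer.append(text)
    return "".join(reasoning), "".join(answer)


def _fake_delta(**fields):
    """Build a stand-in chunk whose delta has the given fields."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(**fields))])


demo = [
    _fake_delta(reasoning_content="2 + 2 ", content=None),
    _fake_delta(reasoning_content="is 4.", content=None),
    _fake_delta(content="The answer is 4."),
]
reasoning_text, answer_text = split_reasoning(demo)
```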

Key Parameters

Parameter	Values	Notes
`temperature`	0.0-1.0	GLM-4.7: 1.0, GLM-4.5: 0.6 default
`top_p`	0.01-1.0	Default ~0.95
`max_tokens`	varies	GLM-4.7: 128K, GLM-4.5: 96K max
`stream`	bool	Enable SSE streaming
`response_format`	`{"type": "json_object"}`	Force JSON output
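The ranges in the table can be checked before a request is sent. This is a minimal sketch, assuming the ranges listed above (`temperature` 0.0-1.0, `top_p` 0.01-1.0); `build_chat_params` is a hypothetical helper whose output would be passed as keyword arguments to `client.chat.completions.create`.

```python
def build_chat_params(model, messages, temperature=None, top_p=None,
                      max_tokens=None, json_mode=False):
    """Assemble request kwargs, rejecting values outside the documented ranges."""
    params = {"model": model, "messages": messages}
    if temperature is not None:
        if not 0.0 <= temperature <= 1.0:
            raise ValueError("temperature must be within 0.0-1.0")
        params["temperature"] = temperature
    if top_p is not None:
        if not 0.01 <= top_p <= 1.0:
            raise ValueError("top_p must be within 0.01-1.0")
        params["top_p"] = top_p
    if max_tokens is not None:
        if max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        params["max_tokens"] = max_tokens
    if json_mode:
        # Force JSON output, as listed in the table above
        params["response_format"] = {"type": "json_object"}
    return params


params = build_chat_params(
    "glm-4.7",
    [{"role": "user", "content": "List three colors as JSON."}],
    temperature=0.6,
    json_mode=True,
)
# Usage (requires a configured client): client.chat.completions.create(**params)
```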

Error Handling

HTTP 429: rate limited. Retry with exponential backoff.
HTTP 401: bad API key. Verify credentials.
finish_reason "sensitive": content was filtered. Modify the input.

python

finish_reason = response.choices[0].finish_reason
if finish_reason == "tool_calls":
    pass  # Execute function and continue conversation
elif finish_reason == "length":
    pass  # Increase max_tokens or truncate
elif finish_reason == "sensitive":
    pass  # Content was filtered
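The exponential backoff recommended for 429 responses could look roughly like the following. `RateLimitError` here is a placeholder for whatever exception your client raises on HTTP 429 (the actual class name depends on the SDK in use), and `with_backoff` is a hypothetical helper; the delays and retry count are arbitrary defaults.

```python
import random
import time


class RateLimitError(Exception):
    """Placeholder for the exception your client raises on HTTP 429."""


def with_backoff(call, max_retries=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Run call(), retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries, surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Small random jitter so concurrent clients do not retry in lockstep
            sleep(delay + random.uniform(0, delay * 0.1))


# Demo: a call that is rate limited twice, then succeeds
_attempts = {"n": 0}
delays = []


def flaky():
    _attempts["n"] += 1
    if _attempts["n"] < 3:
        raise RateLimitError("429")
    return "ok"


# delays.append stands in for time.sleep so the demo records waits instead of sleeping
result = with_backoff(flaky, sleep=delays.append)
```

In real use, `call` would wrap the API request, for example `with_backoff(lambda: client.chat.completions.create(...))`, with `RateLimitError` swapped for the SDK's own rate-limit exception.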

Reference Files

For detailed API specifications, consult:

references/chat-completions.md — Full chat API, parameters, models
references/tools-and-functions.md — Function calling, web search, retrieval
references/media-generation.md — Image, video, audio APIs
references/agents.md — Translation, slides, effects agents
references/error-codes.md — Error handling, rate limits

Maintainer

jrajasekera Core maintainer

Source details

Full Name: jrajasekera/claude-skills
Branch: main
Path in repo: skills/z-ai-api
