Agent skill
web-ai-prompt-api
Chrome's built-in Prompt API implementation guide. Use Gemini Nano locally in browser for AI features - session management, multimodal input, structured output, streaming, Chrome Extensions. Reference for all Prompt API development.
Install this agent skill to your Project
npx add-skill https://github.com/aiskillstore/marketplace/tree/main/skills/ajv009/web-ai-prompt-api
SKILL.md
Chrome Prompt API Reference
Complete implementation guide for Chrome's built-in Prompt API using Gemini Nano.
Hardware & OS Requirements
OS: Windows 10/11, macOS 13+ (Ventura+), Linux, ChromeOS (Platform 16389.0.0+) on Chromebook Plus Storage: 22 GB free (model downloaded separately) GPU: >4 GB VRAM OR CPU: 16 GB RAM + 4 cores Network: Unmetered connection for download Chrome: 138+ (Extensions in stable, Web in origin trial)
Not supported: Mobile (Android, iOS), non-Chromebook Plus ChromeOS
Check model size: chrome://on-device-internals
Model removed if storage <10 GB after download.
Language Support
From Chrome 140: English, Spanish, Japanese (input/output)
Availability Check
const availability = await LanguageModel.availability();
// Returns: "unavailable" | "downloadable" | "downloading" | "available"
CRITICAL: Always pass same options to availability() as you use in create(). Some models don't support certain modalities/languages.
Model Parameters
await LanguageModel.params();
// { defaultTopK: 3, maxTopK: 128, defaultTemperature: 1, maxTemperature: 2 }
Session Creation
Basic Session
const session = await LanguageModel.create();
With Custom Parameters
const params = await LanguageModel.params();
const session = await LanguageModel.create({
temperature: Math.min(params.defaultTemperature * 1.2, 2.0),
topK: params.defaultTopK
});
With Download Monitoring
const session = await LanguageModel.create({
monitor(m) {
m.addEventListener('downloadprogress', (e) => {
console.log(`Downloaded ${e.loaded * 100}%`);
});
}
});
With AbortSignal
const controller = new AbortController();
stopButton.onclick = () => controller.abort();
const session = await LanguageModel.create({
signal: controller.signal
});
Initial Prompts (Context Setting)
const session = await LanguageModel.create({
initialPrompts: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'What is the capital of Italy?' },
{ role: 'assistant', content: 'The capital of Italy is Rome.' },
{ role: 'user', content: 'What language is spoken there?' },
{ role: 'assistant', content: 'The official language is Italian.' }
]
});
Use cases:
- Resume stored sessions after browser restart
- N-shot prompting
- Set assistant personality/tone
- Provide conversation history
Multimodal Input (Origin Trial)
expectedInputs & expectedOutputs
const session = await LanguageModel.create({
expectedInputs: [
{
type: "text", // or "image", "audio"
languages: ["en" /* system */, "ja" /* user prompt */]
}
],
expectedOutputs: [
{ type: "text", languages: ["ja"] }
]
});
Input types: text, image, audio Output types: text only
Throws NotSupportedError if unsupported modality.
Using with Images/Audio
const session = await LanguageModel.create({
initialPrompts: [{
role: 'system',
content: 'Analyze images for patterns.'
}],
expectedInputs: [{ type: 'image' }]
});
// Append image
await session.append([{
role: 'user',
content: [
{ type: 'text', value: 'Analyze this image' },
{ type: 'image', value: fileInput.files[0] }
]
}]);
Prompting Methods
Non-Streaming (Short Responses)
const result = await session.prompt('Write me a haiku!');
console.log(result);
Streaming (Long Responses)
const stream = session.promptStreaming('Write me a long poem!');
for await (const chunk of stream) {
console.log(chunk); // Partial results as they arrive
}
With AbortSignal
const controller = new AbortController();
stopButton.onclick = () => controller.abort();
const result = await session.prompt('Write a poem', {
signal: controller.signal
});
Response Constraints (Structured Output)
Pass JSON Schema to get predictable JSON responses:
const schema = {
type: "object",
properties: {
hashtags: {
type: "array",
maxItems: 3,
items: {
type: "string",
pattern: "^#[^\\s#]+$"
}
}
},
required: ["hashtags"],
additionalProperties: false
};
const result = await session.prompt(
`Generate hashtags for: ${post}`,
{ responseConstraint: schema }
);
const data = JSON.parse(result); // Guaranteed valid JSON
Simple boolean example:
const schema = { "type": "boolean" };
const result = await session.prompt(
`Is this about pottery?\n\n${text}`,
{ responseConstraint: schema }
);
console.log(JSON.parse(result)); // true or false
Measure input quota usage:
const usage = session.measureInputUsage({ responseConstraint: schema });
Omit schema from input quota:
const result = await session.prompt(
`Summarize as JSON { rating } with 0-5 number:`,
{
responseConstraint: schema,
omitResponseConstraintInput: true
}
);
Response Prefixes
Guide model output format by prefilling assistant response:
const result = await session.prompt([
{ role: 'user', content: 'Create a TOML character sheet' },
{ role: 'assistant', content: '```toml\n', prefix: true }
]);
// Model continues from "```toml\n"
append() Method
Add messages to session without immediate response (useful for multimodal):
await session.append([
{
role: 'user',
content: [
{ type: 'text', value: 'First context message' },
{ type: 'image', value: imageFile }
]
}
]);
// Later, prompt with accumulated context
const result = await session.prompt('Analyze the images');
Promise fulfills when validated and appended.
Session Management
Check Quota
console.log(`${session.inputUsage}/${session.inputQuota}`);
const remaining = session.inputQuota - session.inputUsage;
When quota exceeded, oldest messages lost from context.
Clone Session
Clones inherit parameters, initial prompts, and history:
const mainSession = await LanguageModel.create({
initialPrompts: [{ role: 'system', content: 'Speak like a pirate' }]
});
const clone1 = await mainSession.clone();
const clone2 = await mainSession.clone({ signal: controller.signal });
// Independent conversations with same setup
await clone1.prompt('Tell me a joke about parrots');
await clone2.prompt('Tell me a joke about treasure');
Use for: Parallel conversations, "what if" scenarios, resource efficiency
Restore Session from localStorage
let sessionData = getFromLocalStorage(uuid) || {
initialPrompts: [],
topK: (await LanguageModel.params()).defaultTopK,
temperature: (await LanguageModel.params()).defaultTemperature
};
const session = await LanguageModel.create(sessionData);
// Track conversation
const stream = session.promptStreaming(prompt);
let result = '';
for await (const chunk of stream) {
result = chunk;
}
sessionData.initialPrompts.push(
{ role: 'user', content: prompt },
{ role: 'assistant', content: result }
);
localStorage.setItem(uuid, JSON.stringify(sessionData));
Destroy Session
session.destroy();
// Frees resources, aborts ongoing execution
// Session unusable after destroy
Best practice: Keep one empty session alive to keep model loaded. Destroy only when truly done.
User Activation Requirement
Model download requires user interaction:
if (navigator.userActivation.isActive) {
const session = await LanguageModel.create();
}
Sticky activation events: click, tap, keydown, mousedown
Permission Policy & iframes
Default: Top-level windows + same-origin iframes only
Grant cross-origin iframe access:
<iframe src="https://cross-origin.example.com/"
allow="language-model">
</iframe>
Not available in Web Workers (permission policy complexity)
Local Development (localhost)
- Go to
chrome://flags/#prompt-api-for-gemini-nano-multimodal-input - Select Enabled
- Relaunch Chrome
- Open DevTools, type:
await LanguageModel.availability(); - Should return
"available"
Troubleshoot:
- Restart Chrome
- Check
chrome://on-device-internals→ Model Status tab - Wait for model download to complete
Chrome Extensions
Available in Chrome 138+ stable for extensions.
Remove expired origin trial permissions:
{
"permissions": ["aiLanguageModelOriginTrial"] // REMOVE THIS
}
Register for current origin trial if needed.
Best Practices
Session management:
- Clone sessions to preserve initial prompts
- Use AbortController to let users stop responses
- Track inputUsage to warn before quota exceeded
- Destroy unused sessions to free memory
- Keep one empty session alive to keep model ready
Performance:
- Use streaming for long responses
- append() messages in advance for multimodal
- Monitor download progress, inform users
- Check availability before creating sessions
Structured output:
- Use JSON Schema for predictable responses
- LLMs good at generating schemas from descriptions
- Validate with JSON Schema validators
- Use prefix for format guidance
Context preservation:
- Store conversation in localStorage for restoration
- Use initialPrompts for session context
- Clone for parallel conversations with same context
Error Handling
try {
const session = await LanguageModel.create();
const result = await session.prompt('Hello');
} catch (err) {
if (err.name === 'AbortError') {
// User stopped generation
} else if (err.name === 'NotSupportedError') {
// Unsupported modality/language
} else {
console.error(err.name, err.message);
}
}
Key Limitations
- Desktop only (no mobile)
- 22 GB storage required
- Unmetered network for download
- No Web Workers support
- Output is text-only (input can be multimodal)
- Context window has token limits
- Model removed if storage <10 GB after download
Use Cases
- AI-powered search on page content
- Content classification/filtering
- Event extraction from pages
- Contact information extraction
- Chatbots with conversation memory
- Offline AI features
- Privacy-sensitive data processing
- Content summarization
- Translation assistance
- Image/audio analysis (multimodal)
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
perigon-backend
Perigon ASP.NET Core + EF Core + Aspire conventions
perigon-agent
Pointers for Copilot/agents to apply Perigon conventions
perigon-angular
Angular 21+ standalone/Material/signal conventions for Perigon WebApp
fastapi-mastery
Comprehensive FastAPI development skill covering REST API creation, routing, request/response handling, validation, authentication, database integration, middleware, and deployment. Use when working with FastAPI projects, building APIs, implementing CRUD operations, setting up authentication/authorization, integrating databases (SQL/NoSQL), adding middleware, handling WebSockets, or deploying FastAPI applications. Triggered by requests involving .py files with FastAPI code, API endpoint creation, Pydantic models, or FastAPI-specific features.
context7-efficient
Token-efficient library documentation fetcher using Context7 MCP with 86.8% token savings through intelligent shell pipeline filtering. Fetches code examples, API references, and best practices for JavaScript, Python, Go, Rust, and other libraries. Use when users ask about library documentation, need code examples, want API usage patterns, are learning a new framework, need syntax reference, or troubleshooting with library-specific information. Triggers include questions like "Show me React hooks", "How do I use Prisma", "What's the Next.js routing syntax", or any request for library/framework documentation.
browser-use
Browser automation using Playwright MCP. Navigate websites, fill forms, click elements, take screenshots, and extract data. Use when tasks require web browsing, form submission, web scraping, UI testing, or any browser interaction.
Didn't find tool you were looking for?