Agent skill

gemini-file-api

Work with Gemini File API for document caching and efficient multi-query workflows. Use when discussing file uploads, caching strategies, token optimization, or comparing File API vs inline document approaches.

Stars 163
Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/gemini-file-api

SKILL.md

Gemini File API Integration

Purpose

Understand and work with Gemini's File API for efficient document processing.

Architecture Overview

┌─────────────────────────────────────────────┐
│  Document Upload                            │
│  UploadFile() -> Gemini File API           │
│  Returns: URI with 48hr TTL                │
└─────────────────┬───────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────┐
│  FileCache (in-memory)                      │
│  Prevents re-uploads within session        │
│  Key: file path, Value: FileReference      │
└─────────────────┬───────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────┐
│  GenerateContent()                          │
│  References files by URI (not bytes)       │
│  Reuse same URI for multiple queries       │
└─────────────────────────────────────────────┘

Client Usage

Creating a Client

go
// Reads GEMINI_API_KEY from environment
client, err := gemini.NewClient(ctx)
if err != nil {
    return err
}
defer client.Close()

Uploading Files

Single file:

go
ref, err := client.UploadFile(ctx, "/path/to/doc.pdf", "application/pdf")
// ref.URI is reusable for 48 hours

Multiple files (parallel upload):

go
paths := []string{"/path/to/w2.pdf", "/path/to/paystub.pdf"}
mimeTypes := map[string]string{
    "/path/to/w2.pdf": "application/pdf",
    "/path/to/paystub.pdf": "application/pdf",
}
refs, err := client.UploadFiles(ctx, paths, mimeTypes)
// refs is map[string]*FileReference

Generating Content

With File API (recommended for multiple queries):

go
resp, err := client.GenerateContent(ctx, &gemini.GenerateRequest{
    Model:        gemini.ModelFlash,
    Prompt:       "Analyze this W2 for annual income",
    SystemPrompt: "You are a mortgage underwriter...",
    FileURIs:     []string{ref.URI},
    FileMIMEs:    map[string]string{ref.URI: "application/pdf"},
})

With Inline bytes (simpler for single queries):

go
resp, err := client.GenerateContent(ctx, &gemini.GenerateRequest{
    Model:  gemini.ModelFlash,
    Prompt: "Analyze this document",
    InlineFiles: []gemini.InlineFile{{
        Data:     fileBytes,
        MimeType: "application/pdf",
    }},
})

Data Structures

FileReference

go
type FileReference struct {
    Name      string    // Gemini-assigned name
    URI       string    // Reusable URI for requests
    MimeType  string
    State     string    // Processing state
    CreatedAt time.Time
    ExpiresAt time.Time // 48 hours from creation
    SizeBytes int64
}

GenerateRequest

go
type GenerateRequest struct {
    Model        string              // ModelFlash or ModelPro
    Prompt       string              // User prompt
    SystemPrompt string              // System instructions
    FileURIs     []string            // Gemini file URIs
    FileMIMEs    map[string]string   // URI -> MIME type
    InlineFiles  []InlineFile        // Direct bytes (alternative)
    Temperature  *float32
    MaxTokens    *int32
}

GenerateResponse

go
type GenerateResponse struct {
    Text           string
    FinishReason   string
    TokensUsed     int64
    ModelUsed      string
    ProcessingTime time.Duration
}

Model Selection

Model Constant Use Case
Gemini Flash gemini.ModelFlash Fast, cheaper, most tasks
Gemini Pro gemini.ModelPro Complex reasoning, escalation

File API vs Inline Comparison

Aspect File API Inline
Upload Once (cached 48hr) Every request
Best for Multiple queries, same docs Single query
Token cost Lower per query Higher per query
Latency Higher first query Consistent
Break-even ~7 queries N/A

FileCache Details

Located in internal/gemini/cache.go:

  • Thread-safe: Uses RWMutex
  • Expiry-aware: Checks 48hr TTL before returning
  • Methods: Get, Set, Delete, Cleanup, GetAllURIs

Troubleshooting

"GEMINI_API_KEY not set"

bash
export GEMINI_API_KEY="your-key"

Upload failures

  • Check file exists and is readable
  • Verify MIME type matches content
  • Check file size limits

Rate limits

Add delays between requests:

go
time.Sleep(500 * time.Millisecond)

Related Files

  • internal/gemini/client.go - Main client implementation
  • internal/gemini/cache.go - FileCache implementation
  • cmd/benchmark/main.go - Performance comparison tool

Didn't find tool you were looking for?

Be as detailed as possible for better results