# gemini-batch

> This skill should be used when the user asks to "use Gemini Batch API", "process documents at scale", "submit a batch job", "upload files to Gemini", or needs large-scale LLM processing. Includes production gotchas and best practices.

- Author: Edwin Hu
- Repository: edwinhu/claude-plugins
- Version: 20260206221707
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-07
- Source: https://github.com/edwinhu/claude-plugins
- Web: https://mule.run/skillshub/@@edwinhu/claude-plugins~gemini-batch:20260206221707

---

---
name: gemini-batch
version: 1.0
description: This skill should be used when the user asks to "use Gemini Batch API", "process documents at scale", "submit a batch job", "upload files to Gemini", or needs large-scale LLM processing. Includes production gotchas and best practices.
---

# Gemini Batch API Skill

Large-scale asynchronous document processing using Google's Gemini models.

## When to Use

- Process thousands of documents with the same prompt
- Cost-effective bulk extraction (50% cheaper than synchronous API)
- Jobs that can tolerate 24-hour completion windows

## IRON LAW: Use Examples First, Never Guess API

**READ EXAMPLES BEFORE WRITING ANY CODE. NO EXCEPTIONS.**

### The Rule

```
User asks for batch API work
    ↓
MANDATORY: Read examples/batch_processor.py or examples/icon_batch_vision.py
    ↓
Copy the pattern exactly
    ↓
DO NOT guess parameter names
DO NOT try wrapper types
DO NOT improvise API calls
```

### Why This Matters

The Batch API has non-obvious requirements that are easy to get wrong:
1. **Metadata must be flat primitives** - Nested objects cause cryptic errors
2. **Parameter is `dest=` not `destination=`** - Wrong name → TypeError
3. **Config is plain dict** - Not a wrapper type
4. **Examples are authoritative** - Working code beats assumptions

**Rationale:** Previous agents wasted hours debugging API errors that the examples would have prevented. The patterns in `examples/` are battle-tested production code.

### Rationalization Table - STOP If You Catch Yourself Thinking:

| Excuse | Reality | Do Instead |
|--------|---------|------------|
| "I know how APIs work" | You're overconfident about non-obvious gotchas | Read examples first |
| "I can figure it out" | You'll waste 30+ minutes on trial-and-error | Copy working patterns |
| "The examples might be outdated" | They're maintained and tested | Trust the examples |
| "I need to customize anyway" | Your customization comes AFTER copying base pattern | Start with examples, then adapt |
| "Reading examples takes too long" | You'll save 30 minutes debugging with 2 minutes of reading | Read examples first |
| "My approach is simpler" | Your simpler approach already failed | Use proven patterns |

### Red Flags - STOP If You Catch Yourself Thinking:

- **"Let me try `destination=` instead of `dest=`"** → You're about to cause a TypeError. Read examples.
- **"I'll create a `CreateBatchJobConfig` object"** → You're instantiating a type instead of using a plain dict. Stop.
- **"I'll nest metadata like a normal API"** → You'll trigger BigQuery type errors. Flatten your data.
- **"This should work like other Google APIs"** → Your assumption is wrong; this API is different.
- **"I'll figure out the JSONL format"** → You'll waste time. Copy from examples instead.

### MANDATORY Checklist Before ANY Batch API Code

- [ ] Read `examples/batch_processor.py` OR `examples/icon_batch_vision.py`
- [ ] Identify which example matches the use case (Standard API vs Vertex AI)
- [ ] Copy the example's API call pattern **exactly**
- [ ] Copy the example's JSONL structure **exactly**
- [ ] Copy the example's metadata structure **exactly**
- [ ] Adapt for specific needs only after copying base pattern

**Enforcement:** Writing batch API code without reading examples first violates this IRON LAW and will result in preventable errors.

## Prerequisites

### Install gcloud SDK

```bash
# macOS: Install Google Cloud SDK via Homebrew
brew install google-cloud-sdk

# Linux: Install Google Cloud SDK from official sources
curl https://sdk.cloud.google.com | bash
```

### Authentication Setup

```bash
# Authenticate with Google Cloud Platform
gcloud auth login

# Set up Application Default Credentials for Python libraries
gcloud auth application-default login

# Enable Vertex AI API in your project
gcloud services enable aiplatform.googleapis.com
```

**Why both auth methods?**
- `gcloud auth login`: For gsutil and gcloud CLI commands
- `gcloud auth application-default login`: For the `google-genai` Python library (`from google import genai`)
- **CRITICAL:** Vertex AI requires ADC (step 2), not just an API key

### Create GCS Bucket

```bash
# Create bucket in us-central1 (required region)
gsutil mb -l us-central1 gs://your-batch-bucket

# Verify bucket location is us-central1
gsutil ls -L -b gs://your-batch-bucket | grep "Location"
```

See `references/gcs-setup.md` for complete setup guide.

## Quick Start

### Standard Gemini API (API Key)

Uses the Gemini File API for input. Results are returned via `job.dest.file_name`.

```python
from google import genai

client = genai.Client()  # Uses GOOGLE_API_KEY env var

# Upload JSONL to File API
uploaded = client.files.upload(
    file="requests.jsonl",
    config={"mime_type": "application/jsonl"}
)

# Submit batch job
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src=uploaded.name,  # "files/..." URI
    config={"display_name": "my-batch-job"}
)

# Results available at job.dest.file_name after completion
```

### Vertex AI (Recommended for GCS workflows)

Uses GCS URIs directly. Supports `dest=` parameter for output location.

```python
from google import genai

# Use Vertex AI with ADC (not API key)
client = genai.Client(
    vertexai=True,
    project="your-project-id",
    location="us-central1"
)

# Submit batch job with GCS paths
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src="gs://bucket/requests.jsonl",   # GCS input
    dest="gs://bucket/outputs/"          # GCS output (Vertex AI only!)
)
```

**Key difference:** Standard API uses File API (`files/...`), Vertex AI uses GCS (`gs://...`) with explicit `dest=` parameter.

## Core Workflow

**Standard API:**
1. **Create JSONL** request file with prompts
2. **Upload JSONL** to File API via `client.files.upload()`
3. **Submit batch job** via `client.batches.create(src=uploaded.name)`
4. **Poll for completion** (jobs expire after 24 hours)
5. **Download results** from `job.dest.file_name`

**Vertex AI:**
1. **Upload files** to GCS bucket (us-central1 region required)
2. **Create JSONL** request file with document URIs and prompts
3. **Submit batch job** via `client.batches.create(src=..., dest=...)`
4. **Poll for completion** (jobs expire after 24 hours)
5. **Download and parse** results from GCS output URI
6. **Handle failures** gracefully (partial failures are common)
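
The "poll for completion" step in both workflows can be sketched as a small loop with exponential backoff. This is a minimal sketch, not the pattern from `examples/`: `wait_for_job` and its parameters are hypothetical names, and the terminal state strings are an assumption that should be checked against the SDK version in use. The state lookup is injected as a callable so the loop itself makes no API calls.

```python
import time
from typing import Callable

# Assumed terminal state names; verify against the examples and the SDK.
TERMINAL_STATES = {
    "JOB_STATE_SUCCEEDED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_EXPIRED",
}


def wait_for_job(
    get_state: Callable[[], str],
    poll_seconds: float = 60.0,
    max_poll_seconds: float = 600.0,
    timeout_seconds: float = 24 * 3600,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_state() until it returns a terminal state, then return it."""
    deadline = clock() + timeout_seconds
    delay = poll_seconds
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(f"Job still in state {state!r} after timeout")
        sleep(delay)
        # Back off so long-running jobs are not polled aggressively.
        delay = min(delay * 2, max_poll_seconds)
```

With the SDK, the callable might look like `lambda: client.batches.get(name=job.name).state.name`; treat that call shape as an assumption and confirm it against `examples/batch_processor.py` before relying on it.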

## IRON LAW: Metadata and API Call Structure

**YOU MUST USE FLAT PRIMITIVES FOR METADATA. YOU MUST USE SIMPLE STRINGS FOR API PARAMETERS.**

### Rule 1: Metadata Structure

```
CORRECT ✓
"metadata": {
    "request_id": "icon_123",        # String
    "file_name": "copy.svg",         # String
    "file_size": 1024                # Integer
}

WRONG ✗
"metadata": {
    "request_id": "icon_123",
    "file_info": {                   # ← NESTED OBJECT FAILS!
        "name": "copy.svg",
        "size": 1024
    }
}

WORKAROUND (if complex data needed)
"metadata": {
    "request_id": "icon_123",
    "file_info": json.dumps({"name": "copy.svg", "size": 1024})  # JSON string OK
}
```

**Why:** Vertex AI stores metadata in a BigQuery-compatible format, and the batch pipeline rejects nested values in that field. A violation causes: `"metadata" in the specified input data is of unsupported type.`
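
The `json.dumps()` workaround can be applied mechanically before requests are written. The helper below is a minimal sketch; `flatten_metadata` is a hypothetical name, not something from `examples/`. It keeps primitives as they are and serializes anything else to a JSON string.

```python
import json

PRIMITIVES = (str, int, float, bool)


def flatten_metadata(metadata: dict) -> dict:
    """Return a copy of metadata whose values are all flat primitives.

    Nested dicts/lists (and any other JSON-serializable value) are
    converted to JSON strings; primitives and None pass through unchanged.
    """
    flat = {}
    for key, value in metadata.items():
        if value is None or isinstance(value, PRIMITIVES):
            flat[key] = value
        else:
            # sort_keys keeps the string stable across runs.
            flat[key] = json.dumps(value, sort_keys=True)
    return flat


nested = {"request_id": "icon_123", "file_info": {"name": "copy.svg", "size": 1024}}
print(flatten_metadata(nested))
```

The consumer of the results then calls `json.loads()` on the serialized fields to recover the original structure.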

### Rule 2: API Call Structure

**Standard API (File API):**
```python
# CORRECT ✓
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src=uploaded_file.name,               # "files/..." URI from File API
    config={"display_name": "my-job"}     # Just a dict
)
# Results at: job.dest.file_name (after completion)
```

**Vertex AI (GCS):**
```python
# CORRECT ✓
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src="gs://bucket/input.jsonl",        # GCS URI
    dest="gs://bucket/output/",           # GCS output (VERTEX AI ONLY!)
    config={"display_name": "my-job"}
)

# WRONG ✗
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src="gs://bucket/input.jsonl",
    destination="gs://bucket/output/",    # ← WRONG PARAM NAME! Use dest=
)

# WRONG ✗
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src="gs://bucket/input.jsonl",
    config=types.CreateBatchJobConfig(    # ← DON'T INSTANTIATE TYPES!
        dest="gs://bucket/output/"
    )
)
```

**Why:**
- Standard API: Uses File API for input, outputs to managed file location
- Vertex AI: Uses GCS URIs, supports `dest=` for output location
- Parameter is `dest=` (not destination). Config is a plain dict (not a type instance).

### Rationalization Table - STOP If You Catch Yourself Thinking:

| Excuse | Reality | Do Instead |
|--------|---------|------------|
| "Nested metadata is cleaner" | Your job will fail with a cryptic error | Flatten or use `json.dumps()` |
| "I'll use `dest=` with Standard API" | Standard API doesn't support `dest=`; it's Vertex AI only | Use File API pattern for Standard API |
| "I'll try `destination=` parameter" | You'll get a TypeError; parameter doesn't exist | Use `dest=` (Vertex AI only) |
| "I should use `CreateBatchJobConfig`" | You're confusing internal typing with API calls | Pass plain dict to `config=` |
| "Other APIs accept nested objects" | Your assumption breaks here; it's BigQuery-backed | Follow the examples |
| "I'll fix it if it breaks" | Your job fails 5 minutes after submission | Get it right the first time |

### Pre-Submission Validation

```python
# Add this check BEFORE submitting batch job
def validate_metadata(metadata: dict):
    """Ensure metadata contains only primitive types."""
    for key, value in metadata.items():
        if isinstance(value, (dict, list)):
            raise ValueError(
                f"Metadata '{key}' is {type(value).__name__}. "
                f"Only primitives (str, int, float, bool) allowed. "
                f"Use json.dumps() for complex data."
            )
        if not isinstance(value, (str, int, float, bool, type(None))):
            raise ValueError(f"Unsupported type for '{key}': {type(value)}")

# Validate all requests before submission:
for request in batch_requests:
    validate_metadata(request["metadata"])
```

**Enforcement:** Jobs will fail if metadata contains nested objects. The only way around this requirement is to serialize complex values with `json.dumps()`.

## Key Gotchas

| Issue | Solution |
|-------|----------|
| **Nested metadata fails** | **Use flat primitives or `json.dumps()` for complex data** |
| **TypeError: unexpected keyword** | **Use `dest=` not `destination=` (Vertex AI only)** |
| **Mixing API patterns** | **Standard API: File API + no dest. Vertex AI: GCS + dest** |
| Auth errors with Vertex AI | Run `gcloud auth application-default login` |
| vertexai=True requires ADC | API key is ignored with vertexai=True |
| Missing aiplatform API | Run `gcloud services enable aiplatform.googleapis.com` |
| Region mismatch (Vertex) | Use `us-central1` bucket only |
| Wrong URI format (Vertex) | Use `gs://` not `https://` |
| Invalid JSONL | Use `scripts/validate_jsonl.py` |
| Image batch: inline data | Use `fileData.fileUri` for batch, not inline |
| Duplicate IDs | Hash file content + prompt for unique IDs |
| Large PDFs fail | Split at 50 pages / 50MB max |
| JSON parsing fails | Use robust extraction (see gotchas.md) |
| Output not found (Vertex) | Output URI is prefix, not file path |

**Top 3 mistakes** (bolded above):
1. Using nested objects in metadata instead of flat primitives
2. Mixing Standard API and Vertex AI patterns
3. Using `destination=` instead of `dest=` (Vertex AI)

See `references/gotchas.md` for detailed solutions (now with Gotchas 10 & 11).
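
For the "JSON parsing fails" row, a best-effort extractor might look like the sketch below. It is an illustration only and not the implementation in `references/gotchas.md`; `extract_json` is a hypothetical name. It tries the raw text first, then a fenced block, then the outermost `{...}` span.

```python
import json
import re
from typing import Any, Optional

# Matches a Markdown code fence, optionally tagged "json".
_FENCE = re.compile(r"`{3}(?:json)?\s*(.*?)`{3}", re.DOTALL)


def extract_json(text: str) -> Optional[Any]:
    """Return the first parseable JSON value found in text, else None."""
    candidates = [text]
    fenced = _FENCE.search(text)
    if fenced:
        candidates.append(fenced.group(1))
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start:end + 1])
    for candidate in candidates:
        try:
            return json.loads(candidate.strip())
        except json.JSONDecodeError:
            continue
    return None
```

Returning `None` instead of raising lets a results parser count unparseable responses as partial failures and keep going.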

## Rate Limits

| Limit | Value |
|-------|-------|
| Max requests per JSONL | 10,000 |
| Max concurrent jobs | 10 |
| Max job size | 100MB |
| Job expiration | 24 hours |
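
Large workloads need to be split so each JSONL file stays under the per-file limits above. The generator below is a minimal sketch under two assumptions: "100MB" is read conservatively as 100,000,000 bytes, and each request is a JSON-serializable dict. `chunk_requests` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator, List

MAX_REQUESTS = 10_000          # from the table above
MAX_BYTES = 100 * 1000 * 1000  # assumption: conservative reading of "100MB"


def chunk_requests(
    requests: Iterable[dict],
    max_requests: int = MAX_REQUESTS,
    max_bytes: int = MAX_BYTES,
) -> Iterator[List[str]]:
    """Yield lists of JSONL lines, each list within both limits."""
    chunk: List[str] = []
    size = 0
    for request in requests:
        line = json.dumps(request, ensure_ascii=False)
        line_bytes = len(line.encode("utf-8")) + 1  # +1 for the newline
        if line_bytes > max_bytes:
            raise ValueError("A single request exceeds the size limit")
        if chunk and (len(chunk) >= max_requests or size + line_bytes > max_bytes):
            yield chunk
            chunk, size = [], 0
        chunk.append(line)
        size += line_bytes
    if chunk:
        yield chunk
```

Each yielded list can be written with `"\n".join(lines) + "\n"` to its own file and submitted as a separate job, keeping the concurrent-job limit in mind.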

## Recommended Models

| Model | Use Case | Cost |
|-------|----------|------|
| `gemini-2.5-flash-lite` | Most batch jobs | Lowest |
| `gemini-2.5-flash` | Complex extraction | Medium |
| `gemini-2.5-pro` | Highest accuracy | Highest |

## Additional Resources

### References
- `references/gcs-setup.md` - **NEW:** Complete GCS and Vertex AI setup guide
- `references/gotchas.md` - Critical production gotchas (updated auth section)
- `references/best-practices.md` - Idempotent IDs, state tracking, validation
- `references/troubleshooting.md` - Common errors and debugging
- `references/vertex-ai.md` - Enterprise alternative with comparison
- `references/cli-reference.md` - gsutil and gcloud commands
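
The idempotent IDs mentioned for `references/best-practices.md`, and the "hash file content + prompt" advice in Key Gotchas, can be sketched as follows. `make_request_id`, the `req` prefix, and the 16-character truncation are illustrative choices, not taken from the reference files.

```python
import hashlib


def make_request_id(file_bytes: bytes, prompt: str, prefix: str = "req") -> str:
    """Derive a stable ID from file content and prompt.

    The same file and prompt always give the same ID, so a re-run can
    detect work that was already submitted.
    """
    digest = hashlib.sha256()
    digest.update(file_bytes)
    digest.update(b"\x00")  # separator so content/prompt boundaries are unambiguous
    digest.update(prompt.encode("utf-8"))
    return f"{prefix}_{digest.hexdigest()[:16]}"


print(make_request_id(b"<svg></svg>", "Describe this icon"))
```

Because the ID is a plain string, it also satisfies the flat-metadata rule when stored as `request_id`.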

### Examples
- `examples/icon_batch_vision.py` - **NEW:** Batch vision analysis with Vertex AI
- `examples/batch_processor.py` - Complete GeminiBatchProcessor class
- `examples/pipeline_template.py` - Customizable pipeline template

### Scripts
- `scripts/validate_jsonl.py` - Validate JSONL before submission
- `scripts/test_single.py` - Test single request before batch

## External Documentation

- [Gemini Batch API Guide](https://ai.google.dev/gemini-api/docs/batch)
- [Google Cloud Storage](https://cloud.google.com/python/docs/reference/storage/latest)
- [Vertex AI Batch Prediction](https://cloud.google.com/vertex-ai/docs/predictions/batch-predictions)

## Date Awareness

**Pattern from oh-my-opencode:** Gemini API and documentation evolve rapidly.

Use the current date (`datetime.now()`) for:
- API version checking
- Model availability (confirm that a model such as `gemini-2.5-flash-lite` is still offered)
- Documentation freshness validation

When unsure about an API feature or model name, check the current date and consult the latest Gemini API documentation.
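
A freshness check along these lines is one way to apply this. It is a minimal sketch; `is_stale` and the 90-day default are hypothetical and should be tuned to how quickly the relevant documentation changes.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_stale(
    last_verified: datetime,
    max_age_days: int = 90,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if last_verified is more than max_age_days before now."""
    now = now or datetime.now()
    return now - last_verified > timedelta(days=max_age_days)
```

A note or cached model list that comes back stale is a cue to re-check the Gemini API documentation before submitting a job.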