# gemini-image-generator > Generate images using Google Gemini NanoBanana via browser automation. Use this skill for general-purpose AI image generation from text prompts. Includes persistent authentication, automatic environment setup, and reference image support for style matching. - Author: 松本敏彦 - Repository: taiyousan15/taisun_agent - Version: 20260121010308 - Stars: 1 - Forks: 0 - Last Updated: 2026-02-06 - Source: https://github.com/taiyousan15/taisun_agent - Web: https://mule.run/skillshub/@@taiyousan15/taisun_agent~gemini-image-generator:20260121010308 --- --- name: gemini-image-generator description: Generate images using Google Gemini NanoBanana via browser automation. Use this skill for general-purpose AI image generation from text prompts. Includes persistent authentication, automatic environment setup, and reference image support for style matching. --- # Gemini Image Generator Gemini NanoBananaを使った汎用AI画像生成スキル。 ## When to Use This Skill Trigger when user: - Asks to generate/create images with AI - Mentions "Gemini image", "generate picture", "create artwork" - Requests visual content from text descriptions - Wants to produce illustrations or graphics - **Wants to create images matching a reference image's style** (NEW!) **For specific use cases, use specialized skills:** - **LP/セールスレター画像** → `gemini-lp-generator` - **ウェビナースライド** → `gemini-slide-generator` ## Quick Start ```bash cd /path/to/gemini-image-generator # 1. Check authentication python scripts/run.py auth_manager.py status # 2. Authenticate (if needed) python scripts/run.py auth_manager.py setup # 3. Generate image (basic) python scripts/run.py image_generator.py \ --prompt "sunset over mountains, watercolor style" \ --output output/my_image.png # 4. Generate with reference image (NEW!) python scripts/run.py image_generator.py \ --prompt "犬を描いて" \ --reference-image "/path/to/reference.png" \ --output output/styled_dog.png ``` ## How It Works ### Basic Mode (テキストのみ) 1. Navigate to `gemini.google.com` 2. Click "ツール" (Tools) button 3. Select "画像を作成" (Create Image) - Activates NanoBanana 4. Enter prompt and generate 5. Download generated image ### Reference Image Mode (参考画像あり) - NEW! 1. Upload reference image to Gemini 2. AI analyzes visual elements (style, colors, lighting, etc.) 3. Extract analysis as YAML format 4. Generate optimized meta-prompt 5. Create new image with matching style ``` ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │ 📷 Reference │ → │ 📋 YAML │ → │ 📝 Optimized │ │ Image │ │ Analysis │ │ Prompt │ └────────────────┘ └────────────────┘ └────────────────┘ │ ▼ ┌────────────────┐ │ 🖼️ Generated │ │ Image │ └────────────────┘ ``` ## Parameters | Parameter | Required | Default | Description | |-----------|----------|---------|-------------| | `--prompt` | Yes | - | Image generation prompt | | `--output` | No | `output/generated_image.png` | Output file path | | `--reference-image` | No | - | Reference image for style extraction | | `--yaml-output` | No | - | Save YAML analysis to file | | `--show-browser` | No | False | Show browser for debugging | | `--timeout` | No | 180 | Max wait time in seconds | ## Prompt Examples ### Basic Examples (テキストのみ) ```bash # Landscape python scripts/run.py image_generator.py \ --prompt "serene sunset over snow-capped mountains, warm orange sky, photorealistic" # Art style python scripts/run.py image_generator.py \ --prompt "watercolor painting of a cat sitting by window, soft colors" # Product photo python scripts/run.py image_generator.py \ --prompt "professional product photography, white background, soft lighting" ``` ### Reference Image Examples (参考画像あり) - NEW! ```bash # Match style of reference image python scripts/run.py image_generator.py \ --prompt "犬を描いて" \ --reference-image "examples/watercolor_cat.png" \ --output output/watercolor_dog.png # Save YAML analysis for review python scripts/run.py image_generator.py \ --prompt "森の風景" \ --reference-image "examples/sunset.jpg" \ --yaml-output output/analysis.yaml \ --output output/forest.png # Debug mode with browser visible python scripts/run.py image_generator.py \ --prompt "カフェの内装" \ --reference-image "examples/cozy_room.png" \ --show-browser \ --output output/cafe.png ``` ### Standalone Tools ```bash # Extract YAML only (without generating image) python scripts/run.py prompt_extractor.py \ --image "examples/reference.png" \ --output analysis.yaml # Generate prompt from YAML python scripts/run.py meta_prompt.py \ --yaml analysis.yaml \ --request "猫を描いて" ``` ## Authentication This skill manages browser authentication for all Gemini-based skills: - `gemini-slide-generator` (shares browser profile) - `gemini-lp-generator` (shares browser profile) ```bash # Check status python scripts/run.py auth_manager.py status # Setup (opens browser for Google login) python scripts/run.py auth_manager.py setup # Clear session python scripts/run.py auth_manager.py clear ``` ## Troubleshooting | Problem | Solution | |---------|----------| | Not authenticated | Run `auth_manager.py setup` | | Timeout | Increase with `--timeout 300` | | UI not found | Use `--show-browser` to debug | | Generation refused | Modify prompt (avoid restricted content) | ## Data Storage - `data/browser_profile/` - Browser session (shared with other Gemini skills) - `data/state.json` - Authentication state - `output/` - Generated images ## Architecture ``` scripts/ ├── config.py # Centralized settings ├── browser_utils.py # BrowserFactory and StealthUtils ├── auth_manager.py # Authentication management ├── image_generator.py # Image generation (with reference image support) ├── prompt_extractor.py # Extract visual elements as YAML (NEW!) ├── meta_prompt.py # Generate optimized prompts from YAML (NEW!) └── run.py # Wrapper script for venv docs/ └── UPGRADE_SPEC.md # Feature specification with diagrams ``` ## Notes - **First generation takes longer** (browser startup) - **Subsequent generations faster** (session reuse) - **Authentication persists** ~7 days - **UI selectors may break** when Gemini updates