Gemini Web Client
Image generation skill using Gemini Web. Generates images from text prompts via Google Gemini. Also supports text generation. Use as the image generation backend for other skills like cover-image, xhs
Supports:
Text generation
Image generation (download + save)
Reference images for vision input (attach local images)
Multi-turn conversations via persisted --sessionId
Script Directory
Important: All scripts are located in the scripts/ subdirectory of this skill.
Agent Execution Instructions:
Determine this SKILL.md file's directory path as SKILL_DIR
Script path = ${SKILL_DIR}/scripts/<script-name>.ts
Replace all ${SKILL_DIR} in this document with the actual path
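The path rule above can be sketched in TypeScript. This is a minimal illustration and not part of the skill: `resolveScriptPath` and `expandSkillDir` are hypothetical helpers, and `/opt/skills/gemini-web` is a made-up example directory.

```typescript
// Hypothetical helper: builds ${SKILL_DIR}/scripts/<script-name>.ts
// from the directory that contains SKILL.md.
function resolveScriptPath(skillDir: string, scriptName: string): string {
  // Drop trailing slashes so the result never contains "//".
  const base = skillDir.replace(/\/+$/, "");
  return `${base}/scripts/${scriptName}.ts`;
}

// Hypothetical helper: replaces every literal ${SKILL_DIR} placeholder
// in a command string with the actual directory.
function expandSkillDir(command: string, skillDir: string): string {
  return command.split("${SKILL_DIR}").join(skillDir);
}

const skillDir = "/opt/skills/gemini-web"; // example path, an assumption
console.log(resolveScriptPath(skillDir, "main"));
console.log(
  expandSkillDir('npx -y bun ${SKILL_DIR}/scripts/main.ts "Hello, Gemini"', skillDir)
);
```

Splitting on the placeholder rather than using a regular expression avoids having to escape `$`, `{`, and `}`.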
Script Reference:
scripts/main.ts            CLI entry point for text/image generation
scripts/gemini-webapi/*    TypeScript port of gemini_webapi (GeminiClient, types, utils)
Quick start
npx -y bun ${SKILL_DIR}/scripts/main.ts "Hello, Gemini"
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Explain quantum computing"
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cute cat" --image cat.png
npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png
# Multi-turn conversation (agent generates unique sessionId)
npx -y bun ${SKILL_DIR}/scripts/main.ts "Remember this: 42" --sessionId my-unique-id-123
npx -y bun ${SKILL_DIR}/scripts/main.ts "What number?" --sessionId my-unique-id-123
Commands
Text generation
Image generation
Vision input (reference images)
Output formats
Options
--prompt <text>, -p                       Prompt text
--promptfiles <files...>                  Read prompt from files (concatenated in order)
--model <id>, -m                          Model: gemini-3-pro (default), gemini-2.5-pro, gemini-2.5-flash
--image [path]                            Generate image, save to path (default: generated.png)
--reference <files...>, --ref <files...>  Reference images for vision input
--sessionId <id>                          Session ID for multi-turn conversation (agent generates unique ID)
--list-sessions                           List saved sessions (max 100, sorted by update time)
--json                                    Output as JSON
--login                                   Refresh cookies only, then exit
--cookie-path <path>                      Custom cookie file path
--profile-dir <path>                      Chrome profile directory
--help, -h                                Show help
CLI note: scripts/main.ts supports text generation, image generation, reference images (--reference/--ref), and multi-turn conversations via --sessionId.
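As a rough sketch of how an agent might assemble these flags for a multi-turn exchange, the TypeScript below builds the argument list for `npx`. `GeminiRequest`, `buildArgs`, and `newSessionId` are hypothetical names, the flag ordering is an assumption, and the ID scheme is only one way to get a unique string. Only flags listed above are emitted.

```typescript
// Models listed in this document.
type Model = "gemini-3-pro" | "gemini-2.5-pro" | "gemini-2.5-flash";

// Hypothetical request shape, not part of the skill.
interface GeminiRequest {
  prompt: string;
  model?: Model;
  image?: string;        // output path for --image
  references?: string[]; // local files for --reference
  sessionId?: string;    // reuse across calls for multi-turn
  json?: boolean;
}

// Builds the arguments that follow `npx`, e.g. for a process spawner.
function buildArgs(scriptPath: string, req: GeminiRequest): string[] {
  const args = ["-y", "bun", scriptPath, "--prompt", req.prompt];
  if (req.model) args.push("--model", req.model);
  if (req.image) args.push("--image", req.image);
  if (req.references?.length) args.push("--reference", ...req.references);
  if (req.sessionId) args.push("--sessionId", req.sessionId);
  if (req.json) args.push("--json");
  return args;
}

// Assumed ID scheme: mixes the current time with random characters.
function newSessionId(): string {
  const time = Date.now().toString(36);
  const rand = Math.random().toString(36).slice(2, 10);
  return `session-${time}-${rand}`;
}

const script = "/opt/skills/gemini-web/scripts/main.ts"; // example path
const sessionId = newSessionId();
const first = buildArgs(script, { prompt: "Remember this: 42", sessionId });
const second = buildArgs(script, { prompt: "What number?", sessionId });
console.log(["npx", ...first].join(" "));
console.log(["npx", ...second].join(" "));
```

Passing the arguments as an array to a process spawner, rather than joining them into a shell string, sidesteps quoting problems with prompts that contain spaces or quotes.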
Models
gemini-3-pro - Default, latest model
gemini-2.5-pro - Previous generation pro
gemini-2.5-flash - Fast, lightweight
Authentication
First run opens Chrome to authenticate with Google. Cookies are cached for subsequent runs.
Environment variables
GEMINI_WEB_DATA_DIR              Data directory
GEMINI_WEB_COOKIE_PATH           Cookie file path
GEMINI_WEB_CHROME_PROFILE_DIR    Chrome profile directory
GEMINI_WEB_CHROME_PATH           Chrome executable path
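A wrapper that wants to inspect these variables before launching the CLI could collect them as below. This is a sketch under assumptions: the config key names and `readOverrides` are hypothetical, treating an empty value as unset is a choice made here, and no defaults are shown because the text does not state them.

```typescript
// Maps each documented environment variable to a hypothetical config key.
const ENV_KEYS = {
  dataDir: "GEMINI_WEB_DATA_DIR",
  cookiePath: "GEMINI_WEB_COOKIE_PATH",
  chromeProfileDir: "GEMINI_WEB_CHROME_PROFILE_DIR",
  chromePath: "GEMINI_WEB_CHROME_PATH",
} as const;

type ConfigKey = keyof typeof ENV_KEYS;
type Overrides = Partial<Record<ConfigKey, string>>;

// Collects only the variables that are set and non-empty.
// Pass process.env (or any plain object) as `env`.
function readOverrides(env: Record<string, string | undefined>): Overrides {
  const out: Overrides = {};
  for (const key of Object.keys(ENV_KEYS) as ConfigKey[]) {
    const value = env[ENV_KEYS[key]];
    if (value !== undefined && value.trim() !== "") out[key] = value;
  }
  return out;
}

const overrides = readOverrides({
  GEMINI_WEB_COOKIE_PATH: "/tmp/gemini-cookies.json", // example value
  GEMINI_WEB_CHROME_PATH: "",                         // empty, so ignored here
});
console.log(overrides);
```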
Examples
Generate text response
Generate image
Get JSON output for parsing
Generate image from prompt files
Multi-turn conversation
Session files are stored in ~/Library/Application Support/michi-skills/gemini-web/sessions/<id>.json and contain:
id: Session ID
metadata: Gemini chat metadata for continuation
messages: Array of {role, content, timestamp, error?}
createdAt, updatedAt: Timestamps
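The session file layout can be expressed as TypeScript types, with a small parser and a listing helper that mirrors the --list-sessions description. Treat this as a sketch: the text does not say whether timestamps are strings or numbers, what `error` holds, or which direction the sort runs, so `string | number`, a string `error`, and newest-first ordering are assumptions. `parseSession` and `listSessions` are hypothetical names.

```typescript
// Timestamp type is not stated in the text; this union is an assumption.
type Timestamp = string | number;

interface SessionMessage {
  role: string;
  content: string;
  timestamp: Timestamp;
  error?: string; // assumed to be a message string
}

interface SessionFile {
  id: string;
  metadata: unknown; // opaque Gemini chat metadata
  messages: SessionMessage[];
  createdAt: Timestamp;
  updatedAt: Timestamp;
}

// Parses a session file's text and checks the documented top-level fields.
function parseSession(text: string): SessionFile {
  const data = JSON.parse(text);
  if (typeof data !== "object" || data === null) throw new Error("not an object");
  if (typeof data.id !== "string") throw new Error("missing id");
  if (!Array.isArray(data.messages)) throw new Error("missing messages");
  for (const field of ["createdAt", "updatedAt"]) {
    if (!(field in data)) throw new Error(`missing ${field}`);
  }
  return data as SessionFile;
}

// Sorted by update time and capped at `max` (100 per the option text).
// Newest-first ordering is an assumption.
function listSessions(sessions: SessionFile[], max = 100): SessionFile[] {
  const toMillis = (t: Timestamp) => (typeof t === "number" ? t : Date.parse(t));
  return [...sessions]
    .sort((a, b) => toMillis(b.updatedAt) - toMillis(a.updatedAt))
    .slice(0, max);
}

const sample = parseSession(JSON.stringify({
  id: "my-unique-id-123",
  metadata: null,
  messages: [
    { role: "user", content: "Remember this: 42", timestamp: "2025-01-01T00:00:00Z" },
  ],
  createdAt: "2025-01-01T00:00:00Z",
  updatedAt: "2025-01-01T00:05:00Z",
}));
console.log(sample.id, sample.messages.length);
```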