Markdown Converter
Agent skill for markdown-converter
Use this skill when users want to generate images using OpenAI's image generation API (DALL-E or gpt-image-1), or extract text from images using OCR. Invoke when users request AI-generated images, artwork, logos, illustrations, visual content from text prompts, or need to extract text/data from images.
Generate images from text prompts and extract text from images using OpenAI's APIs.
imggen is a command-line tool that interfaces with OpenAI's image generation API. It supports multiple models (gpt-image-1, dall-e-3, dall-e-2) and provides options for image size, quality, format, and style.
- imggen binary installed and available in PATH
- `OPENAI_API_KEY` environment variable set with a valid OpenAI API key

imggen [flags] "prompt"

| Flag | Short | Default | Description |
|---|---|---|---|
| | `-m` | | Model: gpt-image-1, dall-e-3, dall-e-2 |
| | `-s` | | Image dimensions |
| | `-q` | | Quality level |
| `--count` | `-n` | | Number of images (1-10 for gpt-image-1, 1 for dall-e-3) |
| `--output` | `-o` | auto-generated | Output filename or directory |
| | `-f` | | Output format: png, jpeg, webp |
| `--style` | | | Style for dall-e-3: vivid, natural |
| | `-t` | | Transparent background (gpt-image-1 + png/webp only) |
| `--prompt` | `-P` | | Prompt (can be specified multiple times) |
| `--parallel` | `-p` | | Number of parallel workers for multiple prompts |
| | | | Override API key |
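
The two prerequisites can be checked before invoking the tool. The following is a minimal sketch; the function name `check_imggen_prereqs` is hypothetical and not part of imggen.

```shell
# Hypothetical preflight helper (not part of imggen).
# Returns 0 only if imggen is on PATH and OPENAI_API_KEY is non-empty.
check_imggen_prereqs() {
  local status=0
  if ! command -v imggen >/dev/null 2>&1; then
    echo "imggen not found in PATH" >&2
    status=1
  fi
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    status=1
  fi
  return "$status"
}
```

A typical use would be `check_imggen_prereqs && imggen "a sunset over mountains"`, so that generation is attempted only when both checks pass.
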
- `OPENAI_API_KEY` is set in the environment

The tool outputs:

Generated files are saved to the current working directory with timestamp-based names (e.g., `image-20251216-120000.png`) unless `--output` is specified.

All image generation costs are automatically logged to `~/.imggen/sessions.db`. View costs using the `cost` subcommand:

    # View total costs
    imggen cost

    # View today's costs
    imggen cost today

    # View this week's costs (last 7 days)
    imggen cost week

    # View this month's costs (last 30 days)
    imggen cost month

    # View costs by provider
    imggen cost provider

In interactive mode (`imggen -i`), use the `cost` or `$` command:

- `cost today` - Today's costs
- `cost week` - This week's costs
- `cost month` - This month's costs
- `cost total` - All-time total
- `cost provider` - Breakdown by provider
- `cost session` - Current session's costs

Manage the SQLite database storing sessions and cost data:

    # Reset database (delete all data)
    imggen db reset

    # Reset with backup of old data
    imggen db reset --backup

    # Show database location and stats
    imggen db info

imggen "a sunset over mountains"
imggen -m dall-e-3 -s 1792x1024 -q hd "panoramic view of a futuristic city"
imggen -n 4 -q high "abstract geometric pattern"
imggen -t -f png "minimalist tech company logo, flat design"
imggen -o hero-image.png "website hero banner with gradient"
imggen -m dall-e-3 --style natural "professional headshot, studio lighting"
    # Generate multiple images with --prompt flag
    imggen --prompt "a sunset" --prompt "a cat" --prompt "a dog" -o ./output

    # Short form with parallel processing (3 workers)
    imggen -P "sunset" -P "mountains" -P "ocean" -o ./images -p 3

    # From a text file (one prompt per line)
    imggen batch prompts.txt -o ./output

    # From a JSON file with per-prompt options
    imggen batch prompts.json -o ./output

    # With parallel processing
    imggen batch prompts.txt -o ./output -p 3

Generate multiple images from command-line prompts using the `--prompt`/`-P` flag:
imggen --prompt "a sunset over mountains" --prompt "a cat playing piano" -o ./output
This processes all prompts and saves images to the output directory with indexed filenames:
- `001-a-sunset-over-mountains.png`
- `002-a-cat-playing-piano.png`

Use `--parallel`/`-p` to control concurrent processing (default: 1 = sequential).
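
When a script needs to know the output filenames in advance, the indexed naming can be approximated. This sketch is inferred only from the two example filenames above. The slug rules (lowercase, runs of non-alphanumeric characters collapsed to `-`) are an assumption, and `predict_filename` is a hypothetical helper, so verify against real output before relying on it.

```shell
# Hypothetical helper: guess the indexed filename for a prompt.
# Slug rules are inferred from the examples, not from imggen's documentation.
predict_filename() {
  local index="$1" prompt="$2" ext="${3:-png}"
  local slug
  # Lowercase, replace runs of non-alphanumerics with "-", trim edge dashes.
  slug=$(printf '%s' "$prompt" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//')
  printf '%03d-%s.%s\n' "$index" "$slug" "$ext"
}

predict_filename 1 "a sunset over mountains"
predict_filename 2 "a cat playing piano"
```
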
Generate multiple images from a file of prompts using the `batch` subcommand:
imggen batch <input-file> [flags]
Text file (.txt) - One prompt per line (lines starting with `#` are ignored):

    a sunset over mountains
    a cat playing piano
    abstract geometric art

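
A prompts file in this text format can also be written by a script and checked before it is passed to `imggen batch`. The sketch below uses two hypothetical helpers, `write_sample_prompts` and `count_prompts`. Skipping blank lines when counting is an assumption; the description above only mentions `#` lines.

```shell
# Hypothetical helper: write a sample prompts file to the given path.
write_sample_prompts() {
  cat > "$1" <<'EOF'
# landscape ideas (lines starting with # are ignored)
a sunset over mountains
a cat playing piano
abstract geometric art
EOF
}

# Hypothetical helper: count the prompts in a file.
# Comment lines are skipped; blank lines are skipped too (an assumption).
count_prompts() {
  grep -v -e '^#' -e '^[[:space:]]*$' "$1" | wc -l | tr -d '[:space:]'
}
```

For example, `write_sample_prompts prompts.txt` followed by `count_prompts prompts.txt` reports how many images `imggen batch prompts.txt -o ./output` would be asked to generate.
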
JSON file (.json) - Array of objects with optional per-prompt settings:
    [
      {"prompt": "a sunset over mountains"},
      {"prompt": "a cat playing piano", "model": "dall-e-3", "quality": "hd"},
      {"prompt": "abstract art", "size": "1792x1024"}
    ]

| Flag | Short | Default | Description |
|---|---|---|---|
| | `-o` | current dir | Output directory |
| | | | Default model |
| | | model default | Default image size |
| | | model default | Default quality level |
| | | | Output format |
| | `-p` | | Number of parallel workers |
| | | | Stop on first error |
| | | | Delay between requests (ms) |

Common errors and solutions:
- `OPENAI_API_KEY` environment variable
- `--count` value

Costs per image (USD):

gpt-image-1:

| Size | Low | Medium | High |
|---|---|---|---|
| 1024x1024 | $0.011 | $0.042 | $0.167 |
| 1536x1024 | $0.016 | $0.063 | $0.250 |
| 1024x1536 | $0.016 | $0.063 | $0.250 |

dall-e-3:

| Size | Standard | HD |
|---|---|---|
| 1024x1024 | $0.040 | $0.080 |
| 1024x1792 | $0.080 | $0.120 |
| 1792x1024 | $0.080 | $0.120 |

dall-e-2:

| Size | Cost |
|---|---|
| 256x256 | $0.016 |
| 512x512 | $0.018 |
| 1024x1024 | $0.020 |
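
The per-image prices above can be turned into a rough estimate before running a large batch. This is a sketch only: `estimate_cost` is a hypothetical helper, the prices are copied from the tables in this document and may be out of date, and the model assigned to each table is inferred from its sizes and quality levels.

```shell
# Hypothetical helper: estimate_cost MODEL SIZE QUALITY [COUNT]
# Prices (USD per image) are copied from the tables in this document.
estimate_cost() {
  local model="$1" size="$2" quality="$3" count="${4:-1}" price
  case "$model/$size/$quality" in
    gpt-image-1/1024x1024/low)    price=0.011 ;;
    gpt-image-1/1024x1024/medium) price=0.042 ;;
    gpt-image-1/1024x1024/high)   price=0.167 ;;
    gpt-image-1/1536x1024/low|gpt-image-1/1024x1536/low)       price=0.016 ;;
    gpt-image-1/1536x1024/medium|gpt-image-1/1024x1536/medium) price=0.063 ;;
    gpt-image-1/1536x1024/high|gpt-image-1/1024x1536/high)     price=0.250 ;;
    dall-e-3/1024x1024/standard)  price=0.040 ;;
    dall-e-3/1024x1024/hd)        price=0.080 ;;
    dall-e-3/1024x1792/standard|dall-e-3/1792x1024/standard)   price=0.080 ;;
    dall-e-3/1024x1792/hd|dall-e-3/1792x1024/hd)               price=0.120 ;;
    dall-e-2/256x256/*)           price=0.016 ;;
    dall-e-2/512x512/*)           price=0.018 ;;
    dall-e-2/1024x1024/*)         price=0.020 ;;
    *) echo "unknown combination: $model $size $quality" >&2; return 1 ;;
  esac
  # awk handles the floating-point multiplication portably.
  awk -v p="$price" -v n="$count" 'BEGIN { printf "%.3f\n", p * n }'
}
```

For instance, `estimate_cost gpt-image-1 1024x1024 high 4` gives the estimated cost of the `imggen -n 4 -q high` example, assuming the default size is 1024x1024. The `cost` subcommand remains the source of truth for what was actually spent.
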
Extract text from images using OpenAI's vision API with optional structured output support.
imggen ocr <image-path> [flags]
| Flag | Short | Default | Description |
|---|---|---|---|
| | `-m` | | Model: gpt-5.2, gpt-5-mini, gpt-5-nano |
| `--schema` | | | JSON schema file for structured output |
| | | | Name for the JSON schema |
| `--suggest-schema` | | | Suggest a JSON schema based on image content |
| | `-p` | auto | Custom extraction prompt |
| | `-o` | stdout | Output file |
| `--url` | | | Image URL instead of file path |
| | | | Override API key |
| | | | Log HTTP requests and responses |

| Model | Cost (Input) | Cost (Output) | Best For |
|---|---|---|---|
| gpt-5-nano | $0.05/1M tokens | $0.40/1M tokens | Ultra budget, simple text |
| gpt-5-mini | $0.25/1M tokens | $2.00/1M tokens | Cost-effective, most OCR tasks |
| gpt-5.2 | $1.75/1M tokens | $14.00/1M tokens | Complex documents, highest accuracy |
imggen ocr document.png
imggen ocr --url https://example.com/image.png
imggen ocr receipt.jpg -o extracted.txt
    # Create a schema file (invoice_schema.json):
    # {
    #   "type": "object",
    #   "properties": {
    #     "vendor": {"type": "string"},
    #     "date": {"type": "string"},
    #     "total": {"type": "number"},
    #     "items": {
    #       "type": "array",
    #       "items": {
    #         "type": "object",
    #         "properties": {
    #           "name": {"type": "string"},
    #           "price": {"type": "number"}
    #         },
    #         "required": ["name", "price"],
    #         "additionalProperties": false
    #       }
    #     }
    #   },
    #   "required": ["vendor", "date", "total"],
    #   "additionalProperties": false
    # }
    imggen ocr receipt.jpg --schema invoice_schema.json -o invoice.json

    # Analyze image and suggest appropriate schema
    imggen ocr document.png --suggest-schema

    # Save suggested schema to file
    imggen ocr document.png --suggest-schema -o suggested_schema.json

imggen ocr complex-document.pdf -m gpt-5.2
imggen ocr business-card.jpg -p "Extract the name, title, email, and phone number"
When using the `--schema` flag, the output will be structured JSON matching your schema. This is useful for:
The schema must follow JSON Schema draft-07 format with `additionalProperties: false` for strict validation.
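
A quick check can catch a schema file that omits `additionalProperties: false` before it is sent to the API. The sketch below is a plain text match, not JSON Schema validation: it confirms only that the setting appears somewhere in the file and does not check each nested object. `schema_declares_strict` is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the file contains "additionalProperties": false.
# Text match only; it does not parse JSON or inspect nested objects individually.
schema_declares_strict() {
  grep -Eq '"additionalProperties"[[:space:]]*:[[:space:]]*false' "$1"
}
```

It could be combined with the earlier example as `schema_declares_strict invoice_schema.json && imggen ocr receipt.jpg --schema invoice_schema.json -o invoice.json`.
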