# API
Client-facing reference. Three OpenAI-shaped endpoints. Defaults are shown for each model alias.
## Endpoints
### `/v1/images/generations` (text → image)
Standard OpenAI Images API.
```shell
curl -X POST http://your-gateway:4000/v1/images/generations \
  -H "Authorization: Bearer $LITELLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "protolabs/qwen-image",
    "prompt": "a watercolor of a cat in a hat",
    "size": "1024x1024",
    "n": 1,
    "response_format": "b64_json"
  }'
```

Request fields:
- `model` (string, required): `protolabs/qwen-image` or your own alias
- `prompt` (string, required): text description
- `size` (string, optional): `WxH` (e.g. `1024x1024`, `1216x832`). Default inferred from prompt keywords, else `1024x1024`
- `n` (int, optional): number of images, default 1; runs in parallel
- `response_format` (string, optional): `b64_json` is the only supported value; `url` is not implemented (we don't host generated images)
- `extra_body.seed` (int, optional): fix the random seed
- `extra_body.negative_prompt` (string, optional): default `"low quality, blurry"`
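With a raw HTTP client, the `extra_body` fields are sent as plain top-level JSON keys (that is how the openai SDK serializes `extra_body`). A minimal sketch of assembling the request body under that assumption; the helper name `build_generation_request` is ours, not part of the API:

```python
def build_generation_request(prompt, size="1024x1024", n=1,
                             seed=None, negative_prompt=None):
    """Assemble the JSON body for POST /v1/images/generations."""
    body = {
        "model": "protolabs/qwen-image",
        "prompt": prompt,
        "size": size,
        "n": n,
        "response_format": "b64_json",  # only supported value; "url" is not implemented
    }
    # extra_body fields merge into the top level of the JSON payload
    if seed is not None:
        body["seed"] = seed
    if negative_prompt is not None:
        body["negative_prompt"] = negative_prompt
    return body
```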
Response:
```json
{
  "created": 1777757327,
  "data": [
    { "b64_json": "iVBORw0KGgo..." }
  ]
}
```

### `/v1/images/edits` (image + prompt → image)
Standard OpenAI Images Edit API.
```shell
curl -X POST http://your-gateway:4000/v1/images/edits \
  -H "Authorization: Bearer $LITELLM_API_KEY" \
  -F "model=protolabs/qwen-image-edit" \
  -F "prompt=make the cat blue" \
  -F "image=@/path/to/cat.png"
```

Request fields:

- `model` (string, required)
- `prompt` (string, required): edit instruction
- `image` (binary, required): init image
- `n`, `seed`, `negative_prompt`: same as generation
Response: same shape as generation (one or more `b64_json` images).
Caveat: Open WebUI doesn't currently use this endpoint for follow-up edits in chat; it routes through `/v1/chat/completions` instead. The edits endpoint is exposed for programmatic clients that need direct edit access.
### `/v1/chat/completions` (multi-turn chat with image output)
The conversational UX. Use this from chat clients.
```shell
curl -X POST http://your-gateway:4000/v1/chat/completions \
  -H "Authorization: Bearer $LITELLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "protolabs/qwen-image-chat",
    "messages": [
      {"role": "user", "content": "a watercolor of a cat in a hat, portrait"}
    ]
  }'
```

Request fields:

- `model` (string, required): `protolabs/qwen-image-chat`
- `messages` (array, required): OpenAI multimodal chat format. The last user message's text is the instruction; the provider walks ALL messages for reference images
- `extra_body.seed` (int, optional): fix the seed
- `extra_body.negative_prompt` (string, optional)
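To predict which reference images the provider will pick up, a client can do the same walk over the history itself. A sketch of that walk (the helper name `collect_reference_images` is ours, not part of the API):

```python
def collect_reference_images(messages: list[dict]) -> list[str]:
    """Walk every turn (not just the last) and gather image_url parts
    in order, mirroring the documented provider behavior."""
    urls = []
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string turns carry no image_url parts
        for part in content:
            if part.get("type") == "image_url":
                urls.append(part["image_url"]["url"])
    return urls
```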
Response:
```json
{
  "id": "chatcmpl-protobanana-1777757327",
  "object": "chat.completion",
  "created": 1777757327,
  "model": "protolabs/qwen-image-chat",
  "choices": [{
    "index": 0,
    "finish_reason": "stop",
    "message": {
      "role": "assistant",
      "content": "![image](data:image/png;base64,...)"
    }
  }],
  "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
}
```

`content` is a string with a markdown-embedded data URL. Markdown-rendering clients (Open WebUI, Slack, Discord, GitHub) display the image inline.
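Clients that don't render markdown need to pull the image bytes back out of the string themselves. A minimal sketch, assuming the data URL follows the standard `data:image/<fmt>;base64,<payload>` shape (regex and helper name are ours):

```python
import base64
import re

# Matches the base64 payload inside a data URL such as
# ![image](data:image/png;base64,....)
_DATA_URL_RE = re.compile(r"data:image/(\w+);base64,([A-Za-z0-9+/=]+)")

def extract_images(content: str) -> list[bytes]:
    """Decode every data-URL image embedded in an assistant message."""
    return [base64.b64decode(b64) for _fmt, b64 in _DATA_URL_RE.findall(content)]
```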
## Multimodal request examples
### Multi-reference (2-3 images)
```json
{
  "model": "protolabs/qwen-image-chat",
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "put the character from image 1 in the outfit from image 2"},
      {"type": "image_url", "image_url": {"url": "data:image/png;base64,CHARACTER..."}},
      {"type": "image_url", "image_url": {"url": "data:image/png;base64,OUTFIT..."}}
    ]
  }]
}
```

Returns one composed image.
### Edit follow-up (chat history with prior assistant image)
```json
{
  "model": "protolabs/qwen-image-chat",
  "messages": [
    {"role": "user", "content": "draw a cat in a hat"},
    {"role": "assistant", "content": "![image](data:image/png;base64,...)"},
    {"role": "user", "content": "now make it blue"}
  ]
}
```

The provider extracts IMG_A from the prior assistant turn and routes to EDIT.
### Sticker / background removal
```json
{
  "model": "protolabs/qwen-image-chat",
  "messages": [
    {"role": "user", "content": "draw a cat in a hat"},
    {"role": "assistant", "content": "![image](data:image/png;base64,...)"},
    {"role": "user", "content": "remove the background"}
  ]
}
```

Returns a transparent PNG.
## Model aliases on the protoLabs gateway
The protoLabs deployment exposes these aliases (yours may differ depending on how you configured `model_list`):
| Alias | Backed by | Operation | Use case |
|---|---|---|---|
| `protolabs/qwen-image` | `qwen_image_2512` | gen | Direct text-to-image |
| `protolabs/qwen-image-edit` | `qwen_image_edit_2511` | edit | Direct edit |
| `protolabs/qwen-image-chat` | (auto-routes per turn) | gen/edit/multiref/bgremove | Default for chat clients |
| `protolabs/qwen-image-bgremove` | `bgremove_birefnet` | bgremove | Direct sticker (commercial license) |
| `protolabs/qwen-image-bgremove-rmbg` | `bgremove_rmbg2` | bgremove | Direct sticker (RMBG-2.0, NC) |
Phases 4-7 will add:

- `protolabs/qwen-image-region-edit` (Phase 4)
- `protolabs/qwen-image-inpaint` (Phase 5)
- `protolabs/qwen-image-outpaint` (Phase 6)
But end users should mostly just use `protolabs/qwen-image-chat`; the chat alias auto-routes.
## Errors
Standard HTTP status codes:
- `200`: image returned
- `400`: bad request (invalid `size`, missing `prompt`, etc.)
- `401`: bad API key
- `404`: model alias not in `model_list`
- `408`: ComfyUI workflow timed out (default 180s; raise via `extra_body.timeout`)
- `422`: workflow validation failed (rejected by ComfyUI); the error body describes which node
- `500`: internal error; see gateway / ComfyUI logs
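A reasonable client-side policy is to retry only the transient codes (408, 500) and surface everything else immediately. A sketch; with the openai SDK you would catch `openai.APIStatusError` (which exposes `status_code`), but a stand-in exception keeps this self-contained:

```python
class GatewayError(Exception):
    """Stand-in for the SDK's APIStatusError; carries the HTTP status."""
    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code

# Timeout and internal error are worth retrying; 4xx means fix the request.
RETRYABLE = {408, 500}

def call_with_retry(fn, attempts=3):
    """Invoke fn(), retrying only transient gateway failures."""
    for attempt in range(attempts):
        try:
            return fn()
        except GatewayError as e:
            if e.status_code in RETRYABLE and attempt < attempts - 1:
                continue
            raise
```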
Error body format follows OpenAI's error schema:
```json
{
  "error": {
    "type": "server_error",
    "message": "ComfyUI workflow abc123 failed: [{'node_id': '4', ...}]",
    "code": null
  }
}
```

## Rate limits + concurrency
protoBanana doesn't enforce rate limits itself; the LiteLLM gateway and the ComfyUI instance behind it do.
ComfyUI processes one workflow at a time by default, so concurrent client requests queue server-side. Typical wait times at 1× concurrency:
| Op | Cold load | Warm |
|---|---|---|
| Gen | ~30s | ~22s |
| Edit | ~30s | ~25s |
| Multi-ref (3 imgs) | ~40s | ~32s |
| Bg remove | ~5-10s | ~3s |
If you need true concurrency, run multiple ComfyUI instances on multiple GPUs and set up load balancing in front of the gateway.
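Because a single ComfyUI instance serializes workflows anyway, client-side concurrency mostly just keeps the server queue fed. A sketch of fanning out prompts with a bounded thread pool (the `fan_out` helper name is ours):

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(generate, prompts, max_in_flight=4):
    """Submit prompts concurrently; results come back in prompt order.

    With one ComfyUI instance the workflows still run one at a time
    server-side, so this bounds queue depth rather than adding speed.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(generate, prompts))
```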
## SDK examples
### Python (openai SDK)
```python
import base64

from openai import OpenAI

client = OpenAI(
    api_key="your-litellm-key",
    base_url="http://your-gateway:4000/v1",
)

# Text-to-image
resp = client.images.generate(
    model="protolabs/qwen-image",
    prompt="a watercolor of a cat in a hat",
    size="1024x1024",
)
image_bytes = base64.b64decode(resp.data[0].b64_json)

# Conversational
chat = client.chat.completions.create(
    model="protolabs/qwen-image-chat",
    messages=[{"role": "user", "content": "a watercolor of a cat in a hat"}],
)
md = chat.choices[0].message.content
# md is "![image](data:image/png;base64,...)"
```

### TypeScript (openai SDK)
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.LITELLM_API_KEY,
  baseURL: "http://your-gateway:4000/v1",
});

const chat = await client.chat.completions.create({
  model: "protolabs/qwen-image-chat",
  messages: [{ role: "user", content: "a watercolor of a cat in a hat" }],
});
console.log(chat.choices[0].message.content);
```

### Bash (curl smoke tests)
See INSTALLATION.md §6.