protoBanana

Chat-native image gen + edit. Open-source. Local.

The OSS counterpart to Google's Nano-Banana 2 / OpenAI's GPT-Image-2, served as an OpenAI-compatible LiteLLM provider on top of ComfyUI.

protoBanana mascot, generated by protoBanana itself

What it is, in one sentence

A LiteLLM CustomLLM provider that exposes ComfyUI workflows as OpenAI-compatible image endpoints, with per-turn intent routing for the full nano-banana conversational UX.

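Under the hood this is standard LiteLLM custom-provider wiring. A minimal sketch of the shape, assuming a hypothetical handler class and a `protolabs` provider key; protoBanana's actual module layout and registration may differ:

```python
# Sketch only: ProtoBananaHandler and the "protolabs" provider key are
# illustrative, not protoBanana's actual module layout.
import litellm
from litellm import CustomLLM


class ProtoBananaHandler(CustomLLM):
    def completion(self, *args, **kwargs):
        # The real provider walks kwargs["messages"], classifies the turn,
        # runs the matching ComfyUI workflow, and returns the image in an
        # OpenAI-style response.
        raise NotImplementedError("sketch only")


# Requests to "protolabs/..." model names now route to the handler.
litellm.custom_provider_map = [
    {"provider": "protolabs", "custom_handler": ProtoBananaHandler()}
]
```
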
What you get

# In your chat client (Open WebUI, protoCLI, or raw OpenAI SDK):

  user: a watercolor of a cat in a hat, portrait
  [image: cat in hat, 832×1216]

  user: now make it blue
  [edited image]

  user: remove the background
  [transparent png]

  user: change just the hat to red
  [masked region edit (Phase 4)]

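Driving the same conversation through the raw OpenAI SDK is an ordinary chat-completions call against your LiteLLM gateway. A sketch, assuming a local gateway address and a placeholder key (adjust both to your deployment):

```python
# Sketch: base_url and api_key are placeholders for your LiteLLM gateway.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000/v1", api_key="sk-anything")

resp = client.chat.completions.create(
    model="protolabs/qwen-image-chat",
    messages=[
        {"role": "user", "content": "a watercolor of a cat in a hat, portrait"},
    ],
)
# The assistant turn carries the generated image; exactly how it is embedded
# (markdown link, base64, etc.) depends on your gateway and client setup.
print(resp.choices[0].message.content)
```
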
One model alias (protolabs/qwen-image-chat) handles all of it. The provider walks the message history, classifies the operation for each turn, and dispatches to the right ComfyUI workflow.

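The routing step is easy to picture. An illustrative sketch of per-turn classification and workflow dispatch; the operation names, heuristics, and workflow files below are placeholders, not the provider's actual implementation:

```python
# Illustrative only: operation names, heuristics, and workflow files are
# placeholders, not protoBanana's real classifier.
from typing import Any

WORKFLOWS = {
    "generate": "text_to_image.json",
    "edit": "image_edit.json",
    "remove_background": "background_removal.json",
    "masked_edit": "masked_region_edit.json",  # Phase 4
}


def classify_turn(messages: list[dict[str, Any]]) -> str:
    """Pick an operation for the latest user turn, given the prior history."""
    last = str(messages[-1]["content"]).lower()
    has_prior_image = any(m["role"] == "assistant" for m in messages[:-1])
    if not has_prior_image:
        return "generate"
    if "background" in last:
        return "remove_background"
    if "just the" in last or "only the" in last:
        return "masked_edit"
    return "edit"


def dispatch(messages: list[dict[str, Any]]) -> str:
    """Map the classified operation to the ComfyUI workflow to run."""
    return WORKFLOWS[classify_turn(messages)]
```
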
When to use it

| Use protoBanana when | Use Nano-Banana 2 / GPT-Image-2 when |
| --- | --- |
| Data sovereignty / compliance / IP sensitivity | You don't care where the data goes |
| You want fixed cost (electricity) at scale | You're under metered-API-call budgets |
| You need to extend with custom workflows | Frontier-quality output is non-negotiable |
| You already run a LiteLLM gateway | You don't have GPU infrastructure |

For most teams: both. Use the closed APIs for one-off best-quality work, and route bulk + sensitive workflows through protoBanana.

Where to go next

  • New here? → Quickstart (5 min)
  • Setting up the full stack? → Installation
  • Curious about the design? → Architecture
  • The whole story (research → broken integrations → repo extraction)? → Journey
  • Roadmap (Phases 4-7 queued)? → Phases

Apache-2.0 licensed. Docs follow the Diátaxis framework.