Features¶

Bildgenerierung provides a unified interface for text-to-image and image-to-image generation. Its feature set covers the actual image generation, a two-stage preset system, an extended manual parameter control, and mechanisms for reproducibility and for reducing unwanted image content.

Application scenarios¶

Workshop and event posters: A short description of the occasion produces a print-ready poster layout with title and matching imagery. Through the image-to-image path, existing drafts can be reused as references and varied.
Project and research logos: Based on a thematic description, several minimalist logo variants can be generated in parallel and selected from afterwards.
Illustrative city and topic imagery: Scenic depictions (for example, isometric city views or abstract topical illustrations) can be produced in suitable resolutions and orientations for websites, slides, and internal materials.
Visualization of concepts and mockups: Product and packaging mockups, action-figure depictions, and other concept visualizations are generated from textual descriptions and can be refined further with reference images.
Iterative image variants: An existing image serves as a starting point for targeted modifications (style, composition, image elements). Reusing the result as a new reference image allows step-by-step convergence on the desired output.
Quick preview generation: A preview mode generates several variants at reduced resolution and with a small number of inference steps, allowing image ideas to be pre-filtered before committing to a high-resolution run.

At a glance¶

Text-to-image and image-to-image in a single interface
Connection to an inference service in the local infrastructure via the OpenAI API
Support for two interchangeable image models (Qwen-Image-2512, black-forest-labs/FLUX.2-dev)
Twelve predefined resolutions from 512×512 to 1664×1664, including landscape and portrait formats
Up to four images per run
Negative prompts, freely chosen seed, inference steps, and guidance scale
PNG output with timestamp-based filenames, downloadable directly from the gallery

Image generation¶

Image generation operates in two automatically adjusting modes. Without reference images, the application uses the OpenAI image generation endpoint (text-to-image). As soon as at least one reference image is provided, it switches to the chat completion endpoint (image-to-image), where reference images are passed as embedded image data within the request. Both paths use the same server and the same configured model.

Text-to-image: Generation from a plain text prompt, optionally with a negative prompt.
Image-to-image: Generation that takes one or more reference images into account. Multiple image references are passed alongside the prompt to the model.
Multi-image generation: Up to four images can be requested per run. In image-to-image mode, the seed is incremented per variant to produce different results.

Connection and models¶

The application communicates over HTTPS with an OpenAI-compatible inference service that is configured centrally.

Inference service in the local infrastructure: The target server is operated as part of the local infrastructure and is configured through base URL, API key, and model identifier. Inputs and generated images do not leave the local infrastructure.
Image model Qwen-Image-2512: When the model identifier Qwen/Qwen-Image-2512 is configured, the inference service routes requests to this model.
Image model black-forest-labs/FLUX.2-dev: As an alternative, the model identifier black-forest-labs/FLUX.2-dev can be configured. The interface and the call path remain identical.

Preset system¶

Two coordinated preset groups reduce the number of interaction steps for common tasks.

Mode preset: Sets resolution and image count together (for example, "Vorschau – 4 Bilder, 512×512", "Standard – 1 Bild, 1024×1024", "Hoch – 1 Bild, 1280×1280", "Sehr hoch – 1 Bild, 1664×1664").
Quality preset: Sets inference steps and guidance scale together (from "Entwurf – schnell, kreativ" to "Fein – detailliert, prompt-treu").

Extended parameters¶

All parameters are accessible individually in the "Weitere Optionen" accordion and override the preset selection. Re-clicking a preset resets the values.

Resolution: Selection from twelve predefined values between 512×512 and 1664×1664, including square, landscape, and portrait dimensions.
Image count: Between one and four images per run.
Inference steps: Between five and one hundred steps. Higher values increase the level of detail and the generation time.
Guidance scale: Between 1.0 and 20.0. Higher values lead to stricter prompt adherence; lower values allow freer interpretation.
Seed: Integer value; -1 generates a random seed.
Negative prompt: Free text describing unwanted image elements.
Reference images: Multi-file upload restricted to image file types.

Quality assurance features¶

Several mechanisms support reproducible and targeted results:

Negative prompts: Explicit exclusion of unwanted image elements reduces the share of unusable generations.
Reproducibility via seed: With a fixed seed, identical inputs yield identical images, so variations of individual parameters can be tracked systematically.
Iterative refinement: Generated images can be adopted as references for the next run with a single button, allowing a result to be refined step by step.
Image preprocessing: Reference images are scaled to a maximum edge length before transmission and encoded as JPEG (RGB) or PNG (with transparency) depending on color mode. This reduces request size and transfer time without visible quality loss.
Error handling: Connection failures, timeouts, and API error responses are caught with understandable messages, including hints about possible causes (server reachability, excessive steps, or oversized resolutions).

Image processing and output¶

Gallery view: All results of a run are shown together in a scrollable gallery.
Named files: Generated images are stored as PNG files with a timestamp and, where applicable, a sequential number, and can be downloaded individually.
Use as reference: A single click sets all results of a run as input for the next run.
Example prompts: The interface includes examples whose values are inserted directly into the input fields when clicked.