Skip to content

Architecture

The application follows a layered design with a clear separation between user interface, orchestration, agentic processing, code generation, and rendering. The central processing pipeline is built on the IPE pattern (Intent → Plan → Execute). The system runs in a container that bundles both the Python components and the Mermaid CLI rendering including a headless Chromium.

At a glance

  • Layered design: UI → event orchestration → IPE services → renderer → export.
  • IPE architecture: three independent services for intent recognition, planning, and execution.
  • Multi-agent LLM system: chat, diagram, and validation agents with their own LLM profiles.
  • Deterministic code generation: plan-to-Mermaid converter without LLM call.
  • Validation loop: pre-validation, LLM correction, and template fallback before rendering.
  • Session state with versioning, history, and persistence.
  • Container deployment with Mermaid CLI, Chromium, and optional subpath support.

Architecture description

Layers and components

The application is divided into several layers. The user interface is based on Gradio and contains chat, data panel, code editor, diagram gallery, and history. All UI events are handled through a central event handler that manages the session, passes the input data stream to the Smart Data Processor, and routes between a legacy processing path and the IPE pipeline depending on configuration.

At the heart of the IPE pipeline are three services. The Intent Service first analyzes the message in a rule-based manner via trigger words and domain-specific patterns, falling back to the LLM on low confidence. The result is a structured intent with diagram type, action (create, modify, extend, analyze), extracted structure (lanes, nodes, connections), and confidence value. The Plan Service transforms the intent into a complete diagram plan, fills missing lanes and activities from domain defaults for IT support, HR, and sales, and defines the connections between elements. The Execution Service deterministically converts the plan into code via the plan-to-Mermaid converter, validates it through the Validation Agent, and forwards it to the renderer factory.

Diagram

flowchart TD
    User([User]) --> UI[Gradio UI]
    UI --> EH[Event Handler]
    EH --> SM[State Manager]
    EH --> DP[Smart Data Processor]
    EH --> TL[Template Library]

    EH --> IS

    subgraph IPE [IPE pipeline]
        direction TB
        IS[Intent Service] --> PS[Plan Service]
        PS --> ES[Execution Service]
        ES --> P2M[Plan-to-Mermaid converter]
        P2M --> VA[Validation Agent]
    end

    subgraph LLMS [Multi-agent LLM system]
        direction LR
        CA[Chat profile]
        DA[Diagram profile]
        VP[Validation profile]
    end

    IS -.-> CA
    PS -.-> DA
    VA -.-> VP
    CA --> LLM[Internal LLM service]
    DA --> LLM
    VP --> LLM

    VA --> RF[Renderer Factory]
    RF --> MR[Mermaid renderer]
    RF --> DR[draw.io renderer]
    RF --> GR[Gantt renderer]
    MR --> MCLI[Mermaid CLI and Chromium]
    DR --> N2G[N2G library]
    GR --> PG[python-gantt]

    MR --> EM[Export Manager]
    DR --> EM
    GR --> EM
    EM --> Out[SVG, PNG, code, ZIP]

Workflow

A request travels through the application from top to bottom. First, the Gradio UI accepts the input and forwards it to the event handler, which manages the session via the State Manager and parses the data from the data panel through the Smart Data Processor into a canonical form. The event handler then routes the request into the IPE pipeline.

The Intent Service combines a rule-based quick analysis (trigger words, domain hints, connection types) with an optional LLM stage and returns an intent with a confidence value. The Plan Service decides the planning strategy based on the diagram type (lane-based, hierarchical, flowchart-specific, or generic), fills missing structures with domain defaults, and produces a complete plan with lanes, nodes, and connections.

The Execution Service converts the plan into Mermaid code via the plan-to-Mermaid converter. This conversion is rule-based without any LLM call and produces reproducible results. The generated code then passes through the Validation Agent, which first performs a pre-validation against known error patterns, triggers an LLM-based correction with a complete Mermaid syntax reference for any remaining errors, and falls back to a template if correction fails. The validated code is forwarded via the renderer factory to the appropriate renderer: Mermaid diagrams are rendered through the Mermaid CLI with a headless Chromium, draw.io diagrams via N2G, and Gantt diagrams via python-gantt.

Multi-agent LLM system

The application uses three independent LLM profiles, each with its own configuration. The Chat Agent handles the conversation, recognizes modification intents from the history, and steers the diagram type selection; it operates with a higher temperature for linguistic variety. The Diagram Agent is responsible for code generation in the legacy path and for JSON intermediate formats and operates with low temperature for deterministic output. The Validation Agent corrects syntactic errors and operates at temperature 0 for maximum reproducibility. All three profiles address the same LLM endpoint via an OpenAI-compatible API but are clearly separated in behavior and parameter profile.

Concurrency, robustness, and configuration

The State Manager maintains session-bound states including diagram versioning with parent relationships, so that modifications remain traceable. The validation loop limits the number of correction attempts and falls back to prepared templates on repeated failure. Configuration is loaded centrally from a YAML file; it controls server parameters (including an optional subpath for reverse-proxy deployments), the LLM profiles per agent, validation depth, the JSON intermediate format, the domain defaults, and renderer activation. The application can be operated fully containerized; the container image contains, in addition to the Python components, Node.js, Mermaid CLI, and a headless Chromium for rendering.

Technology overview

  • Language and runtime: Python 3.10
  • UI framework: Gradio
  • Data processing: pandas, python-dateutil
  • LLM connection: OpenAI-compatible HTTP API via requests and aiohttp; internal LLM service
  • Rendering: Mermaid CLI (mmdc) with headless Chromium, N2G for draw.io, python-gantt; plotly as alternative
  • Image processing: cairosvg (SVG-to-PNG), Pillow
  • Configuration: YAML
  • Containerization: Docker, Docker Compose
  • Protocols: HTTP/HTTPS for the LLM API, local subprocess calls for the Mermaid CLI