Features

Chart-Generator covers the full workflow from data ingestion through natural-language input to the production of interactive or static visualisations. Beyond chart generation itself, the feature set includes several mechanisms for ensuring the quality of LLM-produced results.

Use cases

Rapid data exploration. An uploaded CSV or Excel file can be explored immediately with requests such as "Show top 10 as a bar chart" or "Distribution as a histogram", without having to configure column names or aggregations manually.
Comparison across sheets. For Excel workbooks with multiple sheets, a single request ("Create charts for all sheets") produces a suitable chart per sheet, with the chart type derived from the data structure of each sheet.
Iterative refinement of a chart. An existing chart can be adjusted step by step in dialogue, for example "Colour the bars red", "Add percentage values", or "Change the title". The chart's underlying structure is preserved.
Correlation analysis. A request such as "Show a correlation heatmap" automatically computes a correlation matrix of the numeric columns and renders it as a heatmap.
Chart recommendations without prior knowledge. Immediately after file upload, three to five concrete suggestions are generated based on column types and sheet structure, each with a ready-to-use input phrase.
Preparation for reports. Finished charts can be exported as interactive HTML (for web publication) or as high-resolution PNG (for documents and presentations).

At a glance

Import formats: CSV, XLSX, XLS, with multi-sheet support for Excel.
Chart types: bar, line, area, scatter, pie, donut, histogram, box, violin, and heatmap.
Natural-language input, classified into create, modify, multi-create, or advisory.
Aggregations: sum, mean, count, minimum, maximum, median, and standard deviation.
Light and dark themes; six predefined colour schemes.
Export: interactive HTML, static PNG.
Chart history of the last ten charts per session.

Data sources and data preparation

Chart-Generator supports two kinds of input file: CSV and Excel (.xlsx, .xls). On loading, column types are detected automatically and the data are prepared for downstream steps.

CSV. Delimiters (comma, semicolon, tab, pipe) and character encoding (UTF-8, Latin-1) are detected automatically. Malformed lines are skipped before the file is converted into a DataFrame.
Excel (.xlsx, .xls). Multiple sheets are loaded in parallel. Per sheet, metadata are extracted (column types, statistics, sample values, missing values).
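
The CSV handling described above can be illustrated with a minimal sketch. It is not the project's loader: the function names, the "count delimiters in the header line" heuristic, and the rule that a malformed line is one whose field count differs from the header are all assumptions made for illustration. It uses only the Python standard library and stops short of building a DataFrame.

```python
import csv
import io

# Candidates named in the text above; the order decides ties (assumption).
CANDIDATE_DELIMITERS = (",", ";", "\t", "|")
CANDIDATE_ENCODINGS = ("utf-8", "latin-1")


def decode_bytes(raw: bytes) -> tuple[str, str]:
    """Return (text, encoding), trying UTF-8 first and Latin-1 as a fallback."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Not reachable in practice: Latin-1 can decode any byte sequence.
    raise ValueError("could not decode input")


def detect_delimiter(text: str) -> str:
    """Pick the candidate that occurs most often in the first line."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    counts = {d: first_line.count(d) for d in CANDIDATE_DELIMITERS}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else ","


def read_rows(raw: bytes) -> tuple[list[str], list[list[str]], int]:
    """Parse CSV bytes into (header, rows, skipped_count).

    Rows whose field count differs from the header are treated as malformed
    and skipped (a simplifying assumption).
    """
    text, _encoding = decode_bytes(raw)
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=detect_delimiter(text))
    header = next(reader)
    rows, skipped = [], 0
    for row in reader:
        if len(row) == len(header):
            rows.append(row)
        else:
            skipped += 1
    return header, rows, skipped
```

The returned header and rows could then be handed to a DataFrame constructor; the skipped count is useful for reporting how many lines were dropped.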

The loaded DataFrames are memory-optimised (numeric columns downcast, categorical columns detected, datetime columns parsed). Configurable upper limits for file size, row count, and column count prevent uncontrolled resource use.

Input handling and intent classification

Each input is classified before any code is generated. Four intents are distinguished: create a chart for a single sheet, create multiple charts (for several or all sheets), modify the current chart, and advisory only (advice without chart generation).

A trigger-word library and a decision tree support the classification.
Context information (available sheets, currently displayed chart) feeds into the recognition.
Each classification carries a reasoning string and a confidence score for traceability.
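
As a rough illustration of the classification output, the sketch below returns an intent together with a reasoning string and a confidence score. The trigger words, the fixed confidence values, and the order of the checks are invented for this example; the actual trigger-word library and decision tree are not shown in this document.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    CREATE = "create"
    MULTI_CREATE = "multi-create"
    MODIFY = "modify"
    ADVISORY = "advisory"


@dataclass
class Classification:
    intent: Intent
    confidence: float
    reasoning: str


# Hypothetical trigger words, for illustration only.
TRIGGERS = {
    Intent.MULTI_CREATE: ("all sheets", "every sheet", "each sheet"),
    Intent.ADVISORY: ("which chart", "recommend", "suggest"),
    Intent.MODIFY: ("colour", "color", "change", "add "),
}


def classify(text: str, has_current_chart: bool) -> Classification:
    """Tiny decision tree: check trigger words in a fixed order.

    The context flag mirrors the "currently displayed chart" information:
    a modify intent is only possible when a chart exists.
    """
    lowered = text.lower() + " "
    for intent in (Intent.MULTI_CREATE, Intent.ADVISORY, Intent.MODIFY):
        hits = [word for word in TRIGGERS[intent] if word in lowered]
        if not hits:
            continue
        if intent is Intent.MODIFY and not has_current_chart:
            continue  # nothing to modify, fall through to create
        return Classification(intent, 0.9, f"matched trigger words {hits}")
    return Classification(Intent.CREATE, 0.5,
                          "no trigger word matched; defaulting to create")
```

For example, `classify("Colour the bars red", has_current_chart=True)` yields a modify intent, while the same request without a displayed chart falls back to create with a lower confidence.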

Chart generation

From the classified intent, an execution plan is created that specifies chart type, axes, grouping, aggregation, and title. The plan is then translated into Plotly Express code and executed.

Ten chart types are supported, each with type-specific parameters (for example orientation and stacking mode for bar charts, hole ratio for donut charts, trend line for scatter plots).
Aggregations are applied to the DataFrame before plotting where required.
Standard interactivity (hover, zoom, pan, mode bar) is applied uniformly to all generated charts.
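
The step from execution plan to Plotly Express code might look roughly like the sketch below. The `ChartPlan` fields follow the plan contents named above, but the class itself, the `render_code` helper, and the restriction to six Cartesian chart types are assumptions for illustration. The generated snippet expects `df` (a pandas DataFrame) and `px` (`plotly.express`) to be provided by the caller, which fits an execution environment without imports.

```python
from dataclasses import dataclass
from typing import Optional

# Plotly Express functions that share the x / y / color / title signature.
PX_FUNCTIONS = {
    "bar": "px.bar",
    "line": "px.line",
    "area": "px.area",
    "scatter": "px.scatter",
    "box": "px.box",
    "violin": "px.violin",
}

# Aggregation names from the feature list mapped to pandas aggregation names.
AGGREGATIONS = {
    "sum": "sum",
    "mean": "mean",
    "count": "count",
    "minimum": "min",
    "maximum": "max",
    "median": "median",
    "standard deviation": "std",
}


@dataclass
class ChartPlan:
    """Hypothetical execution plan; field names are illustrative."""
    chart_type: str
    x: str
    y: str
    title: str
    group: Optional[str] = None
    aggregation: Optional[str] = None


def render_code(plan: ChartPlan) -> str:
    """Translate a plan into source code that uses `df` and `px`."""
    lines = []
    source = "df"
    if plan.aggregation:
        # Aggregate before plotting, grouped by the x axis and optional group.
        keys = [plan.x] + ([plan.group] if plan.group else [])
        agg = AGGREGATIONS[plan.aggregation]
        lines.append(
            f"data = df.groupby({keys!r}, as_index=False)[{plan.y!r}].agg({agg!r})"
        )
        source = "data"
    args = [source, f"x={plan.x!r}", f"y={plan.y!r}"]
    if plan.group:
        args.append(f"color={plan.group!r}")
    args.append(f"title={plan.title!r}")
    lines.append(f"fig = {PX_FUNCTIONS[plan.chart_type]}({', '.join(args)})")
    return "\n".join(lines)
```

Keeping the plan as data and rendering code from it in one place makes the generated code easy to log and to check before execution.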

Modification of existing charts

Requests that modify an existing chart continue from the existing code. The chart's underlying structure is preserved, and only the aspect named by the user is changed (colour, title, labels, size, style, data). For semantic colour requests ("red for negative"), an explicit category-to-colour mapping is produced.
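
A category-to-colour mapping for a request such as "red for negative" could be built as in the sketch below. The helper name, the default colour, and the example categories are assumptions; the resulting dictionary has the shape that Plotly Express accepts for its `color_discrete_map` parameter.

```python
def build_colour_map(categories, overrides, default="grey"):
    """Return an explicit colour for every category.

    `overrides` holds the colours named by the user. Referencing a category
    that does not occur in the data is treated as an error, so a typo in the
    request cannot silently produce an unchanged chart.
    """
    unknown = sorted(set(overrides) - set(categories))
    if unknown:
        raise ValueError(f"unknown categories: {unknown}")
    return {category: overrides.get(category, default)
            for category in categories}


# Example: a hypothetical "sentiment" column with three categories.
colour_map = build_colour_map(
    ["negative", "neutral", "positive"],
    {"negative": "red"},
)
```

The mapping would then be passed along with the grouping column, for example `px.bar(df, x="month", y="count", color="sentiment", color_discrete_map=colour_map)`, where the column names are again illustrative.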

Recommendation system

Immediately after file upload, three to five concrete visualisation suggestions are derived from the data structure, each with a sheet reference, chart type, and a ready-to-use input phrase. A short confirmation ("ok") triggers automatic execution of all suggestions.
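
How suggestions might be derived from column types is sketched below. The rules, the type labels ("numeric", "categorical", "datetime"), and the wording of the phrases are invented for illustration; the document does not specify the actual heuristics.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    sheet: str
    chart_type: str
    phrase: str  # ready-to-use input phrase


def suggest(sheet: str, column_types: dict, limit: int = 5) -> list:
    """Derive up to `limit` suggestions from a column-name -> type mapping."""
    numeric = [c for c, t in column_types.items() if t == "numeric"]
    categorical = [c for c, t in column_types.items() if t == "categorical"]
    dates = [c for c, t in column_types.items() if t == "datetime"]

    found = []
    if dates and numeric:
        found.append(Suggestion(
            sheet, "line", f"Show {numeric[0]} over {dates[0]} as a line chart"))
    if categorical and numeric:
        found.append(Suggestion(
            sheet, "bar", f"Show {numeric[0]} by {categorical[0]} as a bar chart"))
    if numeric:
        found.append(Suggestion(
            sheet, "histogram", f"Distribution of {numeric[0]} as a histogram"))
    if len(numeric) >= 2:
        found.append(Suggestion(
            sheet, "scatter", f"Plot {numeric[0]} against {numeric[1]} as a scatter plot"))
        found.append(Suggestion(
            sheet, "heatmap", "Show a correlation heatmap"))
    return found[:limit]
```

Because each suggestion carries a complete input phrase, confirming with "ok" can simply feed the phrases back through the normal request pipeline.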

Quality-assurance features

Several mechanisms reduce errors and stabilise the results of LLM-based processing.

Pipeline separation. Intent recognition, planning, and execution are implemented as separate services with separate prompts. This makes it easier to localise sources of error.
Pattern and anti-pattern library. Templates exist for common chart and modification scenarios; known failure modes are documented as anti-patterns and included in the prompts.
Code validator. Before execution, the generated code is checked for semantic correctness, in particular for colour mappings (existence of referenced categories, use of color_discrete_map for semantic requests).
Retry logic. If execution fails, the error message is fed back to the LLM and the code is revised in up to three iterations.
Seaborn fallback. If repeated correction attempts cannot produce executable Plotly code, a simplified Seaborn-based chart is created instead.
Sandbox execution. The generated code runs with restricted built-ins and without import capability.
Column and type checks. Column names referenced in the generated code are validated against the actual DataFrame columns.
Traceability. Per chart, the request, intent, generated code, and library used are recorded; logging documents each pipeline step.
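
Three of the mechanisms above (column checks, sandbox execution, and retry logic) can be combined in a small sketch. All names are hypothetical, the `generate` callable stands in for the LLM call, and three attempts in total is one reading of "up to three iterations". Restricting built-ins in `exec` reduces accidental damage but is not a full security boundary on its own.

```python
import ast

# A deliberately small allow-list of built-ins (assumption).
SAFE_BUILTINS = {"len": len, "range": range, "min": min, "max": max,
                 "sum": sum, "sorted": sorted, "list": list, "dict": dict}


def referenced_columns(code: str) -> set:
    """Collect string literals passed as x=, y=, or color= keyword arguments."""
    columns = set()
    for node in ast.walk(ast.parse(code)):
        if (isinstance(node, ast.keyword)
                and node.arg in {"x", "y", "color"}
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)):
            columns.add(node.value.value)
    return columns


def run_sandboxed(code: str, namespace: dict) -> dict:
    """Execute code with restricted built-ins; import statements are rejected."""
    tree = ast.parse(code)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            raise ValueError("imports are not allowed")
    scope = {"__builtins__": dict(SAFE_BUILTINS), **namespace}
    exec(compile(tree, "<generated>", "exec"), scope)
    return scope


def generate_with_retry(generate, columns, namespace, max_attempts=3):
    """Call generate(error) until the code validates and runs.

    `generate` receives None on the first call and the previous error
    message afterwards, so the model can revise its code.
    """
    error = None
    for attempt in range(1, max_attempts + 1):
        code = generate(error)
        try:
            missing = referenced_columns(code) - set(columns)
            if missing:
                raise ValueError(f"unknown columns: {sorted(missing)}")
            return run_sandboxed(code, namespace), attempt
        except Exception as exc:
            error = f"{type(exc).__name__}: {exc}"
    # A caller could switch to a simpler fallback renderer at this point.
    raise RuntimeError(f"no executable code after {max_attempts} attempts: {error}")
```

In a real setup, `namespace` would contain the DataFrame and the plotting module, and the returned scope would hold the resulting figure object.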

Export, sessions, and configuration

Export. Charts can be saved as interactive HTML (with the Plotly library either loaded from a CDN or embedded in the file) or as PNG (configurable resolution and scaling).
Sessions. Per user session, DataFrames, metadata, the currently displayed chart, and a chart history (up to ten entries) are managed.
Themes. Light and dark themes can be switched during a session; six predefined colour schemes are available.
Configuration. All thresholds and settings (file, row, and column limits, number of LLM calls, retry count, themes, export paths, server port, reverse-proxy path) can be set via environment variables.
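
Reading such settings from environment variables, together with a bounded chart history, could be sketched as follows. The variable names (`CHART_MAX_FILE_MB` and so on) and the default values are hypothetical, since the document does not list them; only the history length of ten is taken from the text.

```python
import os
from collections import deque
from dataclasses import dataclass

HISTORY_LIMIT = 10  # "up to ten entries" per session


@dataclass(frozen=True)
class Settings:
    max_file_mb: int
    max_rows: int
    max_columns: int
    max_retries: int
    port: int


def load_settings(env=None) -> Settings:
    """Build settings from environment variables, falling back to defaults.

    `env` can be any mapping, which keeps the function easy to test;
    by default the process environment is used.
    """
    env = os.environ if env is None else env

    def as_int(name: str, default: int) -> int:
        raw = env.get(name)
        if raw is None or raw == "":
            return default
        value = int(raw)
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
        return value

    return Settings(
        max_file_mb=as_int("CHART_MAX_FILE_MB", 50),
        max_rows=as_int("CHART_MAX_ROWS", 1_000_000),
        max_columns=as_int("CHART_MAX_COLUMNS", 200),
        max_retries=as_int("CHART_MAX_RETRIES", 3),
        port=as_int("CHART_PORT", 8000),
    )


def new_history() -> deque:
    """Per-session chart history; the oldest entry drops out automatically."""
    return deque(maxlen=HISTORY_LIMIT)
```

A `deque` with `maxlen` keeps the history bounded without any explicit trimming code.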