Umfrage-Analyse-System

Umfrage-Analyse-System is a web-based application for the structured evaluation of survey data at universities and research institutions. The application processes surveys with mixed question types — from single choice to free text — and turns raw data into reproducible reports. A defining feature is the multi-stage, LLM-supported pipeline that chains translation, item extraction, thematic clustering and textual summaries. Intermediate results are persisted, so every processing step remains traceable and can be corrected manually.

At a glance

Import and clean survey exports including multilingual responses and LimeSurvey structure files
Convert free-text responses into thematic clusters without manual coding, with the option to rework them
Segment evaluations by country, institution type and size — including for free-text clusters
Compute statistical significance and effect size automatically
Generate structured reports as Word documents with charts, detail tables and cluster examples
Provide data for an accompanying dashboard website in a machine-readable format
Generate evaluation texts (description, interpretation, segment comparison) per question automatically

Highlights

In contrast to a direct LLM prompt or an ad-hoc script, the application captures the entire evaluation process as a persisted, multi-stage pipeline. Every stage can be resumed, reviewed and corrected manually — essential for the reliability of the results.
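
The idea of a persisted, resumable pipeline can be illustrated with a minimal sketch. It is not the application's actual implementation: the table name `stage_results`, the function `run_pipeline` and the two placeholder stages are assumptions made for illustration, and real stages would call an LLM API instead of simple string operations.

```python
import json
import sqlite3
from typing import Callable


def translate(texts: list[str]) -> list[str]:
    # Hypothetical placeholder for the translation stage.
    return [t.strip().lower() for t in texts]


def extract_items(texts: list[str]) -> list[str]:
    # Hypothetical placeholder for item extraction: split responses on ";".
    return [part.strip() for t in texts for part in t.split(";") if part.strip()]


# Ordered list of (stage name, stage function); names are illustrative.
STAGES: list[tuple[str, Callable[[list], list]]] = [
    ("translate", translate),
    ("extract_items", extract_items),
]


def run_pipeline(conn, question_id, responses, stages=STAGES):
    """Run all stages in order, reusing any stage output already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stage_results ("
        "question_id TEXT, stage TEXT, output TEXT, "
        "PRIMARY KEY (question_id, stage))"
    )
    data = responses
    executed = []  # names of stages that were actually computed in this run
    for name, func in stages:
        row = conn.execute(
            "SELECT output FROM stage_results WHERE question_id = ? AND stage = ?",
            (question_id, name),
        ).fetchone()
        if row is not None:
            # Stage was completed earlier: load the persisted result.
            data = json.loads(row[0])
            continue
        data = func(data)
        conn.execute(
            "INSERT INTO stage_results (question_id, stage, output) VALUES (?, ?, ?)",
            (question_id, name, json.dumps(data)),
        )
        conn.commit()  # persist after every stage so an interruption loses little
        executed.append(name)
    return data, executed
```

Calling `run_pipeline` a second time for the same question returns the stored results without recomputing anything. In a design like this, a manual correction would amount to updating the stored `output` of one stage; the sketch does not invalidate later stages after such a correction, which a real system would have to handle.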

Seven question-type handlers in one tool. Single choice, multiple choice, yes/no matrices, Likert scales, rankings, free text and cooperation matrices are processed via a common registry, with question-type-specific aggregation and visualisation.
Workbench for cluster quality assurance. Cluster names, descriptions and assignments can be edited after the fact; items can be renamed, merged, split or reassigned. The data can subsequently be re-clustered against the corrected categories.
Multi-stage LLM pipeline instead of single prompts. Translation, item extraction (a hybrid of rule-based and LLM-supported methods), clustering with dynamic min/max thresholds, cluster example selection and per-question summaries are separate, parametrised stages with their own persistence.
Segmented evaluation also for free text. The breakdown by country, institution type and size is performed not only for closed questions but also for free-text clusters — including comparison tables per cluster.
Statistical grounding. Chi-square tests, Cramér's V as an effect-size measure and a correlation analysis between questions are integrated into the evaluation and reporting flow; sample-size warnings flag segments that are too small for reliable comparison.
Automatically generated evaluation texts. For each question, three text blocks — distribution description, interpretation of notable findings and a note on significant segment differences — are produced in three languages and cached in the database.
Multilingual output with a domain glossary. Reports and charts are produced in German, French and English; paragraph-wise translations use a configurable glossary for consistent terminology.
Accompanying website via dashboard export. All analysis results including segmented data, correlations and theme assignments are exported as JSON and can be read by a separate dashboard application.
Connection to three sources and services. The application reads tabular survey exports (CSV, Excel) and LimeSurvey structure files (.lss), and calls an OpenAI-compatible LLM API.
Resumable batch processing. Long pipeline runs can be continued after an interruption; previously computed analyses, translations and clusters are loaded from the database rather than recreated.
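
To make the statistical grounding more concrete, the following sketch computes the chi-square statistic and Cramér's V for a segment-by-answer contingency table in plain Python. The function name `chi_square_cramers_v` and the warning threshold of 5 for expected counts (a common rule of thumb) are assumptions for illustration; the application's own thresholds and implementation are not documented here. The sketch assumes that every row and column total is greater than zero.

```python
import math


def chi_square_cramers_v(table, min_expected=5.0):
    """Return (chi2, dof, cramers_v, low_counts) for a contingency table.

    `table` is a list of rows of observed counts, e.g. one row per segment
    and one column per answer option. `low_counts` is True when any expected
    cell count falls below `min_expected` (an assumed warning threshold).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    low_counts = False
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of segment and answer.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            if expected < min_expected:
                low_counts = True

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    # Cramér's V: sqrt(chi2 / (n * (min(rows, cols) - 1))), ranging from 0 to 1.
    k = min(len(row_totals), len(col_totals)) - 1
    cramers_v = math.sqrt(chi2 / (n * k))
    return chi2, dof, cramers_v, low_counts
```

For the table `[[10, 20], [20, 10]]` every expected count is 15, so the statistic is 20/3 with one degree of freedom and Cramér's V is 1/3. Turning the statistic into a p-value requires the chi-square distribution; `scipy.stats.chi2_contingency` is one library routine that provides it, although it applies a continuity correction to 2×2 tables by default, so its statistic can differ from this sketch.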