Features
The application provides a self-contained workflow, from ingesting source documents to a finished, exportable slide structure. It centres on dialogue-driven editing of the outline, strict binding of content to the uploaded sources, and functions for typical follow-up tasks: adjusting the slide count to the talk duration, switching languages, and reverting changes.
Use cases
- Deriving lecture slides from a script — From a multi-page lecture or seminar script, a slide structure is produced that follows the chapters of the script. By specifying the talk duration, the slide count is matched to a scope appropriate for the planned session.
- Preparing a conference talk from a paper — From a scientific paper, a compact slide outline is extracted that concentrates on the central statements. Speaker notes hold supplementary details that do not belong on the slides.
- Adjusting an existing presentation to a new duration — An existing PPTX file is uploaded; the application shortens or extends the outline to the target slide count calculated from the new duration, using structural importance as a selection criterion.
- Generating training slides from a manual — From a manual or guideline, a training structure is derived. Through the dialogue, focus areas can be set and individual slides expanded as needed.
- Translating slides into another language — An already created or imported slide outline is translated in full into another language, including slide titles, bullet points, and speaker notes.
- Combining multiple source documents into a single outline — Several documents are uploaded together; the application merges their content into a continuous slide outline.
At a glance
- Import from five document formats (PDF, DOCX, PPTX, Markdown, text), export to three target formats (Markdown, PowerPoint, Word)
- Dialogue-based adjustment of the slide outline, with instructions taken from the ongoing chat
- Automatic adjustment of the slide count to the stated talk duration
- Full translation of the slide outline into another target language
- Version history with undo for the most recent five edits
- Token usage display with a configurable limit
- Session-based operation without persistent storage of content
Input formats
The application processes five document formats. Ingestion is handled by a unified parsing component that reads the document segment by segment, identifies structural elements (headings, lists, body text), and maps them to an internal Markdown representation. Before further processing, file size and format are validated.
- PDF — Text and structural elements are extracted; scanned PDFs can be processed when text recognition is enabled.
- Word (DOCX) — Heading levels, paragraphs, and lists are taken into the intermediate Markdown representation.
- PowerPoint (PPTX) — Existing slide decks are read; slide texts and speaker notes feed into the processing. The application is therefore also suitable for revising existing presentations.
- Markdown (MD) — Markdown files are read directly; the existing structure is largely preserved.
- Text (TXT) — Plain text files are processed line by line and converted into a flat list structure.
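The validation step that precedes parsing can be pictured roughly as follows. This is a minimal sketch, not the application's actual code: the function name `validate_upload`, the extension allowlist, and the 10 MB default are assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical allowlist mirroring the five formats listed above.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".md", ".txt"}
# Hypothetical default; the real limit is configurable.
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024


def validate_upload(filename: str, size_bytes: int,
                    max_size: int = MAX_FILE_SIZE_BYTES) -> list[str]:
    """Return a list of validation problems; an empty list means the file is accepted."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > max_size:
        problems.append(f"file too large: {size_bytes} > {max_size} bytes")
    return problems
```

Returning a list of problems rather than raising on the first one lets the interface report every issue with an upload at once.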
Output formats
The generated slide outline is held internally as a Markdown document and converted on demand into one of three target formats.
- Markdown — Direct export of the internal artefact. Suitable for version control, processing in other tools, or as a plain-text record.
- PowerPoint (PPTX) — Generation of a presentation file with separate slides per Markdown heading, bullet points and sub-bullets as enumerations, and speaker notes in the PPTX notes field. Markdown markers for bold, italic, and inline code are carried into the slides.
- Word (DOCX) — Generation of a Word document in which each slide is rendered as a section with heading, enumeration, and notes block. Suitable as accompanying documentation or as a printable template.
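Before either the PPTX or the DOCX export can be written, the internal Markdown has to be split into slides. The sketch below shows one way this could work. It assumes that every heading opens a slide, that sub-bullets are indented by two spaces per level, and that speaker notes follow a line starting with `Notes:`. The notes marker and the names `Slide` and `split_into_slides` are hypothetical, since the actual notes syntax is not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class Slide:
    title: str
    bullets: list[tuple[int, str]] = field(default_factory=list)  # (indent level, text)
    notes: list[str] = field(default_factory=list)


def split_into_slides(markdown: str) -> list[Slide]:
    """Split a Markdown outline into slides, one per heading."""
    slides: list[Slide] = []
    in_notes = False
    for raw in markdown.splitlines():
        line = raw.rstrip()
        stripped = line.lstrip()
        if stripped.startswith("#"):
            slides.append(Slide(title=stripped.lstrip("#").strip()))
            in_notes = False
        elif not slides or not stripped:
            continue  # skip blank lines and text before the first heading
        elif stripped.lower().startswith("notes:"):  # assumed notes marker
            in_notes = True
            rest = stripped[len("notes:"):].strip()
            if rest:
                slides[-1].notes.append(rest)
        elif in_notes:
            slides[-1].notes.append(stripped)
        elif stripped.startswith("- "):
            level = (len(line) - len(stripped)) // 2
            slides[-1].bullets.append((level, stripped[2:].strip()))
        # any other body text is ignored in this sketch
    return slides
```

An exporter would then iterate over the resulting `Slide` objects and write the title, bullets, and notes into the target format.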
Quality assurance
Several mechanisms keep the generated outline bound to the uploaded sources and catch typical sources of error (hallucinations, an inappropriate slide count, lost previous states).
- Source binding of content slides — Bullet points on content slides are tagged during generation with an internal source reference. The system prompts exclude inventions on content slides; missing information is marked as a placeholder (format [INFO BENÖTIGT: ...], German for "info needed"). The internal references are removed before display.
- Two-phase processing of the artefact agent — A first phase generates or updates the outline; a second phase checks the slide count against the target size and shortens or expands accordingly. When a language switch is requested, an additional translation phase is triggered.
- Length control — A target slide count is calculated from the talk duration. The system prompts encode strategies per talk length (overview, balanced, detailed) that feed into generation.
- Prioritisation during chunking — Long documents are split into sections and each section is given a priority score. The score takes into account position within the document, heading level, length, and the occurrence of multilingual key terms. When the token budget is tight, high-priority sections are processed preferentially.
- Version history with undo — The most recent five artefact versions are retained. Each change carries a short diff description; an undo function restores the previous state.
- Token budget — The token usage of documents, chat history, and current artefact version is summed continuously and checked against a configurable limit. An overrun is detected and reported on upload.
- Markdown post-formatting — After every update, the Markdown structure is normalised: consistent bullet markers, correct indentation of sub-bullets, removal of duplicate blank lines.
- Retry logic on API errors — LLM calls are retried with exponential backoff up to a configurable count on transient failures; on permanent failure the previous state is preserved.
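The length control can be illustrated with a little arithmetic. The sketch below is only an assumption about how a target slide count might be derived: the pacing of two minutes per slide and the duration thresholds for the three strategies are invented for illustration and are not taken from the application.

```python
import math

# Hypothetical pacing: about two minutes of speaking time per slide.
MINUTES_PER_SLIDE = 2.0


def target_slide_count(duration_minutes: float,
                       minutes_per_slide: float = MINUTES_PER_SLIDE) -> int:
    """Derive a target slide count from the talk duration (at least one slide)."""
    if duration_minutes <= 0:
        raise ValueError("duration must be positive")
    return max(1, math.ceil(duration_minutes / minutes_per_slide))


def strategy_for(duration_minutes: float) -> str:
    """Pick one of the three strategies; the thresholds are assumed."""
    if duration_minutes <= 10:
        return "overview"
    if duration_minutes <= 30:
        return "balanced"
    return "detailed"


def adjustment(current_slides: int, duration_minutes: float) -> int:
    """Positive: slides to add; negative: slides to cut; zero: on target."""
    return target_slide_count(duration_minutes) - current_slides
```

Under these assumptions, a 20-minute talk yields a target of 10 slides, so an outline of 15 slides would have to be cut by 5 in the second processing phase.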
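A bounded version history with undo can be built on a fixed-size queue. The following is a minimal sketch of that idea, assuming five retained versions as stated above. The class and method names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    markdown: str
    summary: str  # short description of what changed


class VersionHistory:
    def __init__(self, max_versions: int = 5) -> None:
        # deque(maxlen=...) silently drops the oldest entry once it is full
        self._versions = deque(maxlen=max_versions)

    def commit(self, markdown: str, summary: str) -> None:
        self._versions.append(Version(markdown, summary))

    @property
    def current(self) -> Optional[Version]:
        return self._versions[-1] if self._versions else None

    def undo(self) -> Optional[Version]:
        """Discard the current version and return the previous one, if any."""
        if len(self._versions) < 2:
            return None  # nothing to fall back to; the current state is kept
        self._versions.pop()
        return self._versions[-1]
```

With this design, retaining five versions allows at most four consecutive undo steps, because the oldest retained version has no predecessor to return to.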
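The three normalisation rules of the Markdown post-formatting step can be sketched in a few lines. This is an illustrative approximation: the choice of `-` as the canonical marker, the two-space indent, and the function name `normalise_markdown` are assumptions.

```python
import re

# A bullet line: optional indent, one of - * +, whitespace, then the text.
BULLET = re.compile(r"^(\s*)[-*+]\s+(.*)$")


def normalise_markdown(text: str, indent_width: int = 2) -> str:
    out: list[str] = []
    indent_stack: list[int] = []  # raw indents of the currently open bullet levels
    for raw in text.splitlines():
        line = raw.rstrip()
        match = BULLET.match(line)
        if match:
            raw_indent = len(match.group(1).expandtabs(4))
            # Close levels that are indented at least as deep as this bullet
            while indent_stack and indent_stack[-1] >= raw_indent:
                indent_stack.pop()
            level = len(indent_stack)
            indent_stack.append(raw_indent)
            line = " " * (indent_width * level) + "- " + match.group(2)
        elif line:
            indent_stack.clear()  # any other text ends the current list
        if not line and out and not out[-1]:
            continue  # drop duplicate blank lines
        out.append(line)
    return "\n".join(out).strip("\n") + "\n"
```

Tracking nesting with a stack of raw indents, rather than dividing the indent by a fixed width, makes the result independent of whether the input used two, four, or mixed indentation.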
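Retrying with exponential backoff while preserving the previous state can be sketched as follows. The exception class `TransientError`, the function name, and the default values are hypothetical. The sketch also assumes that only failures flagged as transient are retried and that other exceptions propagate to the caller.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def call_with_retry(call: Callable[[], T], fallback: T, max_attempts: int = 3,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run call(); retry transient failures with delays of base_delay * 2**n.

    Returns fallback (for example the previous outline) if every attempt fails.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return fallback
```

Passing the `sleep` function as a parameter keeps the helper testable without real waiting. A production variant would typically add random jitter to the delays.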
Further functions
- Dialogue-based adjustment — Through the chat input, instructions concerning the outline can be expressed (add or remove a slide, set a focus, change the duration). The chat agent recognises, among other things, statements about duration, audience, focus topics, and target language from free-form input.
- Clarifying questions on missing information — When critical information (e.g. duration, audience) is contained neither in the document nor in the dialogue, the artefact agent formulates appropriate questions, which the chat agent embeds in the dialogue.
- Multilingual dialogue — Language requests in the chat are recognised in both German and English; translation additionally supports further target languages.
- Configurable thresholds — Maximum file size, maximum token budget, model name, and API endpoint are set via environment variables.
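Reading the configurable thresholds from environment variables might look like the sketch below. The variable names (`MAX_FILE_SIZE_MB`, `MAX_TOKEN_BUDGET`, `MODEL_NAME`, `API_ENDPOINT`) and all default values are assumptions for illustration. The names the application actually uses are not listed here.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    max_file_size_mb: int
    max_tokens: int
    model_name: str
    api_endpoint: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment; names and defaults are hypothetical."""
    return Settings(
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "10")),
        max_tokens=int(env.get("MAX_TOKEN_BUDGET", "100000")),
        model_name=env.get("MODEL_NAME", "example-model"),
        api_endpoint=env.get("API_ENDPOINT", "https://api.example.com/v1"),
    )
```

Accepting any mapping instead of reading `os.environ` directly makes it possible to load settings from a plain dictionary in tests.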