From spoken to written language: How we developed a text processing tool

We developed a tool that automatically converts transcripts into professional texts – with three consecutive processing steps. The key to success: focused individual tasks instead of complex all-in-one prompts.

What was the challenge?

We faced a practical problem: we had the transcript of an eight-hour training recording.

Transcribed spoken language is different from written text. It contains filler words, repetitions and broken sentences. Manually processing eight hours of material? Not practical.

At the same time, we wanted to understand: How do you build multi-stage workflows with language models? Which approaches work for text processing?

How does the tool work?

We developed a three-stage processing cascade:

Stage 1 – Cleaning: Correction of transcription errors, completion of broken sentences, removal of filler words (e.g. ‘um’, ‘so’, ‘so to speak’)
Stage 2 – Revision: Reformulation of colloquial expressions into factual language, improvement of text structure
Stage 3 – Formatting: Incorporation of headings, outline, Markdown structure (e.g. lists, paragraphs, highlights)

The result: ‘LLMs never say, I don’t know’ becomes ‘LLMs never say: I don’t know’ – identical in content, but linguistically more polished.
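The three stages above can be sketched as a simple loop in which each stage receives only the output of the previous one. This is a minimal sketch, not the tool's actual code: the prompt wording in `STAGE_PROMPTS`, the `run_cascade` function and the `fake_model` stand-in are all assumptions made for illustration, since the article does not publish its prompts or model calls.

```python
from typing import Callable

# Hypothetical prompts; the wording used in the real tool is not given in the article.
STAGE_PROMPTS = {
    "cleaning": "Fix transcription errors, complete broken sentences and remove filler words.",
    "revision": "Rewrite colloquial phrasing in factual language and improve the structure.",
    "formatting": "Add headings, an outline and Markdown structure.",
}

# A model is any callable that takes (instruction, text) and returns the edited text.
Model = Callable[[str, str], str]


def run_cascade(text: str, model: Model) -> str:
    """Apply the three stages in order; each stage sees only the previous output."""
    for stage in ("cleaning", "revision", "formatting"):
        text = model(STAGE_PROMPTS[stage], text)
    return text


def fake_model(instruction: str, text: str) -> str:
    """Stand-in for a real language model call, so the sketch runs offline."""
    if "filler words" in instruction:
        # Crude imitation of the cleaning stage: drop two filler words.
        kept = [w for w in text.split() if w.strip(",.").lower() not in {"um", "uh"}]
        return " ".join(kept)
    if "headings" in instruction:
        # Crude imitation of the formatting stage: prepend a Markdown heading.
        return "## Section\n\n" + text
    # The revision stage is left as a no-op in this stand-in.
    return text


if __name__ == "__main__":
    print(run_cascade("Um, the model uh never says it does not know.", fake_model))
```

Passing the model in as a callable keeps the cascade itself independent of any particular provider, which also makes it easy to try out with a stand-in like the one above.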

How did we develop it?

The technical basis:

Front end (Gradio): ~400 lines for the user interface
Backend (Python with asyncio): ~500 lines for asynchronous processing
Job management (Gearman): ~200 lines for background jobs

Development effort:

Total effort: Approximately 6-8 hours, spread over several weeks
Prompt optimisation: About 3 hours (roughly 50% of development time)
Iterations: Approximately 10 main runs until the final version
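To illustrate what an asynchronous backend of this kind might look like, here is a minimal asyncio sketch that sends several pieces of text to a model with a cap on parallel requests. The article only states that the backend uses asyncio; whether pieces are actually processed concurrently is an assumption, as are the names `process_sections`, `max_parallel` and `fake_model`.

```python
import asyncio
from typing import Awaitable, Callable, List

# An async model is any coroutine function that takes text and returns edited text.
AsyncModel = Callable[[str], Awaitable[str]]


async def process_sections(
    sections: List[str], model: AsyncModel, max_parallel: int = 4
) -> List[str]:
    """Process all sections concurrently, at most max_parallel at a time."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def worker(section: str) -> str:
        async with semaphore:
            return await model(section)

    # asyncio.gather returns results in the order of its arguments,
    # so the output order matches the input order.
    return await asyncio.gather(*(worker(s) for s in sections))


async def fake_model(section: str) -> str:
    """Stand-in for a real (network-bound) model call."""
    await asyncio.sleep(0)
    return section.strip().capitalize()


if __name__ == "__main__":
    print(asyncio.run(process_sections(["first part ", "second part"], fake_model)))
```

The semaphore is there because model APIs usually enforce rate limits; the right value for `max_parallel` depends on the provider.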

We split longer texts into sections by character count. This keeps processing efficient.
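A character-based split can be as short as the following sketch. The function name `chunk_by_characters`, the default of 4,000 characters and the choice to cut at the last space inside each window are assumptions for illustration; the article does not state the section size or the exact cutting rule.

```python
from typing import List


def chunk_by_characters(text: str, max_chars: int = 4000) -> List[str]:
    """Split text into chunks of at most max_chars characters.

    Prefers to cut at the last space inside the window so that words stay
    intact, and falls back to a hard cut if the window contains no space.
    Whitespace at chunk borders is dropped.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Look for a space to cut at, but only if it moves us forward.
            cut = text.rfind(" ", start, end)
            if cut > start:
                end = cut
        chunk = text[start:end].strip()
        if chunk:
            chunks.append(chunk)
        start = end
    return chunks


if __name__ == "__main__":
    print(chunk_by_characters("aaa bbb ccc", max_chars=5))
```

Because border whitespace is stripped, joining the chunks with a single space restores the text only up to whitespace differences, which is usually acceptable for transcripts.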

Why did this approach work?

Because we gave each processing stage a single, focused task.

Our initial attempts were different. We worked with a single, comprehensive prompt: ‘Clean up, revise and format the text.’ The result? Severely damaged and truncated texts. The language model attempted to perform too many transformations at once.

Then we changed our approach: three separate prompts, each with a clear goal. This cascade allowed for careful, precise text editing – with minimal loss of information.

Key insights

1. The development interface as an optimisation tool

We built two versions: one for development and one for users. The development version was crucial for optimisation.

It enabled us to:

Interactively adjust prompts during processing
View intermediate results at each processing stage
Iterate on experiments without deployment cycles

Only after optimisation in the development interface did we derive the final user interface. This allowed us to test and improve quickly.
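The core of such a development view can be approximated without any user interface: run the stages, keep every intermediate result, and allow individual prompts to be swapped per run. The sketch below is an assumption about how this could look, not the tool's code; `run_with_trace`, `DEFAULT_PROMPTS` and `prompt_overrides` are hypothetical names, and the placeholder prompts are not the ones used in the project.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A model is any callable that takes (instruction, text) and returns the edited text.
Model = Callable[[str, str], str]

# Placeholder prompts for illustration only.
DEFAULT_PROMPTS: Dict[str, str] = {
    "cleaning": "Remove filler words and fix transcription errors.",
    "revision": "Rewrite colloquial phrasing in factual language.",
    "formatting": "Add headings and Markdown structure.",
}


def run_with_trace(
    text: str,
    model: Model,
    prompt_overrides: Optional[Dict[str, str]] = None,
) -> List[Tuple[str, str]]:
    """Run all stages and return (stage name, text after that stage) pairs.

    prompt_overrides replaces individual stage prompts for this run only,
    which allows quick experiments without touching the defaults.
    """
    prompts = {**DEFAULT_PROMPTS, **(prompt_overrides or {})}
    trace: List[Tuple[str, str]] = [("input", text)]
    for stage in ("cleaning", "revision", "formatting"):
        text = model(prompts[stage], text)
        trace.append((stage, text))
    return trace


if __name__ == "__main__":
    # Stand-in model that only records which instruction it received.
    def echo_model(instruction: str, text: str) -> str:
        return f"{text} <{instruction}>"

    for stage, result in run_with_trace("raw text", echo_model, {"revision": "Be concise."}):
        print(f"{stage}: {result}")
```

A front end such as Gradio would then only need to display the returned pairs and feed edited prompts back in as overrides.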

2. The practical limit of AI-assisted development

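In our experience, approximately 1,000 to 1,500 lines of code proved to be the practical limit for AI-assisted development without detailed specifications; this tool, at roughly 1,100 lines, sits within that range. Larger projects require more structured approaches.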

It is important to be aware of this limit. Anyone planning larger tools should invest more time in advance planning.

3. Simple technical solutions are often sufficient

We used character-based chunking – a simple method. It was perfectly adequate for our purpose.

The lesson: you don’t always have to choose the most complex solution. Simple approaches often work better.

What can others learn from this?

Focused single tasks lead to better results than complex multi-task prompts
Prompt optimisation takes time – in this case, half of the total development time
A development interface significantly speeds up optimisation
Simple technical approaches (e.g. character-based chunking) are often sufficient
The limit of 1,000-1,500 lines is a good benchmark for rapid development without a detailed specification

Conclusion

✔ Cascading workflows with focused single tasks beat complex all-in-one approaches

✔ A development interface enables rapid optimisation and experimental work

✔ Simple technical solutions are often sufficient – complexity is not always necessary

This is part of a series on AI-assisted development. The focus is on what can be learned from such projects – not just on the results.