Specification as the key: An experiment in LLM-assisted development of a text style editor

Part of a series on methodological insights from LLM-assisted development projects


Starting point and motivation

As part of an ongoing exploration of LLM-assisted coding, an experiment was conducted to answer the question: How can text style transformation be implemented other than through complex style profiles? A previous project had pursued a profile-based approach in which style features were extracted from reference texts and applied to new texts. This approach worked, but it was complex and offered little direct control over individual style parameters.

The new experiment took a different approach: Instead of analysing and copying styles, users should be able to directly set desired style features using sliders. The primary goal was not to create a finished product, but to explore an alternative concept and further develop our own LLM coding methodology.

What the tool does

The text style editor is a web application that transforms texts into different language styles. The concept is based on two steps: First, an input text can optionally be neutralised to remove stylistic idiosyncrasies. Then, 34 sliders in seven categories are used to set the desired style characteristics – from formality and emotionality to clarity and creative elements such as irony or poetry.
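The two-step concept can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function name `transform_text`, the prompt wording and the `llm` callable are assumptions, and `fake_llm` is only a stand-in so the example runs without a model.

```python
from typing import Callable, Dict

# Hypothetical type: any function that sends a prompt to an LLM and returns text.
LLMCall = Callable[[str], str]


def transform_text(text: str, style: Dict[str, str], llm: LLMCall,
                   neutralise: bool = True) -> str:
    """Optionally neutralise the text, then apply the requested style."""
    if neutralise:
        # Step 1 (optional): remove stylistic idiosyncrasies, keep the content.
        text = llm(
            "Rewrite the following text in a neutral style, "
            "preserving its content:\n\n" + text
        )
    # Step 2: describe the slider settings and ask for the styled rewrite.
    instructions = "\n".join(f"- {name}: {value}" for name, value in style.items())
    return llm(
        "Rewrite the following text with these style characteristics:\n"
        f"{instructions}\n\n{text}"
    )


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: returns the last paragraph of the prompt."""
    return prompt.rsplit("\n\n", 1)[-1]


if __name__ == "__main__":
    print(transform_text("Hello there.", {"formality": "very formal"}, fake_llm))
```

Passing the model call in as a parameter keeps the pipeline testable without network access; a real implementation would replace `fake_llm` with a call to whichever model the application uses.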

The sliders work in three ways: polar sliders span a spectrum (e.g. formal to informal), intensity sliders control the degree of a feature (e.g. irony from 0 to 10), and level sliders offer discrete options (e.g. language level A1 to C2). For a quick start, 23 presets are available, each providing a predefined combination of slider settings for a typical use case.
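One way to model the three slider types and a preset is shown below. It is a sketch under assumptions: the class names, the polar range of -5 to 5, the slider identifiers and the `plain_language` preset are all hypothetical; only the irony range of 0 to 10 and the levels A1 to C2 come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class PolarSlider:
    """Spans a spectrum between two poles, e.g. formal <-> informal."""
    name: str
    left_pole: str
    right_pole: str
    minimum: int = -5  # assumed range; the article does not state one
    maximum: int = 5

    def validate(self, value: int) -> int:
        if not self.minimum <= value <= self.maximum:
            raise ValueError(f"{self.name}: {value} is out of range")
        return value


@dataclass(frozen=True)
class IntensitySlider:
    """Controls the degree of a single feature, from 0 (off) to a maximum."""
    name: str
    maximum: int = 10

    def validate(self, value: int) -> int:
        if not 0 <= value <= self.maximum:
            raise ValueError(f"{self.name}: {value} is out of range")
        return value


@dataclass(frozen=True)
class LevelSlider:
    """Offers a fixed set of discrete options."""
    name: str
    options: Tuple[str, ...]

    def validate(self, value: str) -> str:
        if value not in self.options:
            raise ValueError(f"{self.name}: {value!r} is not a valid option")
        return value


Slider = Union[PolarSlider, IntensitySlider, LevelSlider]

# Three illustrative sliders; the real tool has 34 in seven categories.
SLIDERS: Dict[str, Slider] = {s.name: s for s in (
    PolarSlider("formality", "formal", "informal"),
    IntensitySlider("irony"),
    LevelSlider("language_level", ("A1", "A2", "B1", "B2", "C1", "C2")),
)}

# Hypothetical preset: a named combination of slider values.
PRESETS = {"plain_language": {"formality": 2, "irony": 0, "language_level": "B1"}}


def apply_preset(name: str) -> dict:
    """Return the preset's settings after validating each value."""
    return {key: SLIDERS[key].validate(value) for key, value in PRESETS[name].items()}
```

Validating preset values against the slider definitions means a typo in a preset fails loudly instead of silently producing an odd prompt.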

The development process

Development followed a two-phase approach: first, a detailed specification phase in dialogue with various LLMs, followed by code generation in a single pass.

The specification comprised around 1,400 lines and 50,000 characters – almost as extensive as the resulting code itself. In this phase, goals, functional requirements, UI concept and technical details were developed in dialogue with the LLM. Only after this specification was completed was the code generated.

The entire development took about two hours: one hour for the specification, the rest for implementation, deployment and subsequent enhancements. The project was carried out on the side in one day and comprised three main iterations: the initial development, an adjustment of the intensity levels after initial testing, and the addition of a history function and settings export.

Methodological findings

The key finding of this experiment: the quality of the specification determines the quality of the generated code. A precise, comprehensive specification enables code generation in a single pass, while a vague ‘build me a tool for X’ leads to iterative correction loops.

The intensity levels of the sliders (light, moderate, significant, strong, extreme) were not part of the initial specification; they emerged from testing. The first generated prompts did not express the style characteristics clearly enough. The solution was once again developed in dialogue with the LLM: the understanding of the problem came from the developer, the technical solution from the collaboration.
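A mapping from a slider value to one of the five intensity levels might look like the sketch below. The five labels are taken from the text; the equal-width bands, the treatment of 0 as "feature off" and the prompt wording are assumptions made for illustration.

```python
from typing import Optional

INTENSITY_LEVELS = ("light", "moderate", "significant", "strong", "extreme")


def intensity_label(value: int, maximum: int = 10) -> Optional[str]:
    """Map a slider value in 0..maximum to an intensity label (None for 0)."""
    if not 0 <= value <= maximum:
        raise ValueError(f"value {value} is outside 0..{maximum}")
    if value == 0:
        return None  # assumed: 0 switches the feature off entirely
    # Split 1..maximum into equal bands, one per label (assumed banding).
    index = (value - 1) * len(INTENSITY_LEVELS) // maximum
    return INTENSITY_LEVELS[index]


def prompt_fragment(feature: str, value: int) -> str:
    """Turn a slider setting into an explicit instruction for the prompt."""
    label = intensity_label(value)
    return "" if label is None else f"Apply {label} {feature} to the text."


if __name__ == "__main__":
    for v in (0, 2, 5, 7, 10):
        print(v, intensity_label(v))
```

Naming the level in words, rather than passing a bare number to the model, is one plausible way to make a style characteristic come through more clearly in the generated prompt.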

Another aspect: the deliberately flat architecture of the new tool (six Python modules instead of a nested structure) was a direct consequence of experience with its more complex predecessor. Less abstraction meant faster development with comparable functionality.

Current status

The tool is currently being evaluated by several people with different perspectives. Initial findings show that it also works well for finer style adjustments, not just for drastic transformations. However, the UI would need to be simplified for productive use: 34 sliders are too many for average users.

Comparison with the previous project

The predecessor took a profile-based approach: style features were extracted from reference texts – such as texts by Goethe or Steve Jobs – and stored as JSON profiles with detailed linguistic metrics. Sentence length, passive voice usage, hedging level and connector frequency were analysed and quantified.
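To make the profile idea concrete, the following sketch extracts two of the named metrics and serialises them as JSON. It is deliberately crude and not the predecessor's implementation: the sentence splitting, the connector list and the key names are assumptions, and passive voice and hedging are left out because they need real linguistic analysis.

```python
import json
import re

# Assumed, illustrative connector list; a real profile would use a fuller one.
CONNECTORS = {"however", "therefore", "moreover", "because"}


def extract_profile(text: str) -> dict:
    """Compute a tiny style profile from a reference text."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"\w+", text.lower())
    connector_count = sum(1 for w in words if w in CONNECTORS)
    return {
        "avg_sentence_length": round(len(words) / len(sentences), 1) if sentences else 0.0,
        "connector_frequency": round(connector_count / len(words), 3) if words else 0.0,
    }


if __name__ == "__main__":
    sample = "It rained. However, we went out because we had to."
    print(json.dumps(extract_profile(sample), indent=2))
```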

This approach was technically interesting, but had practical limitations: the extracted profiles were complex and difficult for users to understand. Although it was clear that a profile specified ‘68% nominal style’ or ‘28.4 words average sentence length’, there was no direct control. The new tool addresses precisely this point: users decide for themselves how formal, how emotional, and how complex their text should be.

It is noteworthy that both tools have similar code sizes despite their different concepts (predecessor: ~2,250 lines, new tool: ~2,000 lines). The predecessor analyses and synthesises, while the new tool only synthesises. Even with this narrower scope, overengineering by the LLM had to be actively countered.

Conclusion

The experiment confirms an observation from previous projects: with increasing experience, specifications become longer and more precise. The ratio of specification effort to implementation effort shifts. The actual development work increasingly takes place in the design phase – code generation becomes the execution step of an already well-thought-out solution.

Another aspect deserves attention: dialogue with the LLM is useful not only for implementation, but also for developing the specification itself. The LLM acts as a discussion partner, questioning concepts, suggesting alternatives and helping to clarify them. The resulting specification is more comprehensive and consistent than it would probably have been if the developer had written it alone.

In practice, this means that the role of the developer shifts from implementation to design, from programming to precise requirements definition. Those who develop with LLM support should invest most of their time in the specification – not in correcting generated code.


Metrics: ~2,000 lines of code, ~500 lines of JSON configuration, ~1,400 lines of specification, ~2 hours of development time, 3 iterations