From specification to code: TalkToDocuments#

This article is part of an ongoing series documenting methodological findings from LLM-assisted coding projects. The latest experiment: a local tool for interacting directly with documents via large language models with a 256K-token context window.

The experimental goal#

After setting up an LLM with a larger context window locally, the question arose: to what extent can this context be used in practice? How consistent do answers remain as the token count grows? The tool was designed to answer these questions: process multiple documents up to the context limit, making optimal use of the available space, without classic chunking or vector databases.
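The "fill the context, no chunking" approach can be sketched as a simple budget check before submission. This is a hypothetical illustration, not the tool's actual code; it uses a rough heuristic of about four characters per token, which is only an approximation:

```python
# Sketch of a context-budget check (hypothetical names; the article
# does not show the tool's actual implementation).

CONTEXT_LIMIT = 256_000      # 256K-token context window
RESERVED_FOR_ANSWER = 8_000  # leave headroom for the model's response

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token."""
    return len(text) // 4

def fits_in_context(documents: list[str]) -> bool:
    """Return True if all documents together fit the context budget."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total <= CONTEXT_LIMIT - RESERVED_FOR_ANSWER

docs = ["First document text ...", "Second document text ..."]
print(fits_in_context(docs))
```

In a real tool the estimate would come from a proper tokeniser rather than a character heuristic, but the control flow stays the same: either everything fits, or the user is told to remove documents.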

What the tool does#

The application processes up to 10 documents simultaneously in various formats (PDF, Word, Excel, PowerPoint, Text, HTML, CSV). After intelligent content cleaning, deduplication and Markdown formatting, the documents are kept directly in the LLM context. A simple reference system with [P1], [P2] tags enables source references. Processing takes place locally, with response times in the range of a few seconds.
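A tag-based reference scheme like the one described can be sketched in a few lines. The function and variable names below are hypothetical; the article describes the idea, not this exact code:

```python
# Sketch of a [P1]/[P2]-style reference scheme: each document is
# prefixed with a tag the model can cite in its answers.

def build_context(documents: dict[str, str]) -> str:
    """Concatenate documents, each prefixed with a [P<n>] source tag."""
    parts = []
    for i, (name, text) in enumerate(documents.items(), start=1):
        parts.append(f"[P{i}] ({name})\n{text}")
    return "\n\n".join(parts)

docs = {"report.pdf": "Quarterly figures ...", "notes.txt": "Meeting notes ..."}
context = build_context(docs)
print(context.splitlines()[0])  # → [P1] (report.pdf)
```

The appeal of this design is that the model only has to echo a short literal tag in its answer; no metadata structures or lookup machinery are needed to map a citation back to its source document.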

Project details#

Approximately 90 minutes of development time was spent on the interactive creation of a detailed specification with the LLM, 60 minutes on code generation, and the rest on documentation and deployment. The result: 3,100 lines of code in 16 Python files, developed over a total of three days in three main iterations.

Most modules worked correctly on the first run. The reason: the investment in specification quality paid off in a largely error-free implementation.

Specification as an important success factor#

A functioning LLM specification does more than document functional requirements. It makes the connection between function, architecture and technical implementation explicit. This approach kept connections comprehensible and significantly improved implementability.

The quality of the specification developed continuously over several projects. With increasing clarity and depth of detail, the error rate in implementation decreased.

KISS (Keep It Small and Simple) as an active control task#

LLMs tend towards complex solutions, presumably because such patterns dominate their training data, yet they cannot always reproduce them cleanly. This required active countermeasures during specification creation.

The implementation avoids complicated retrieval machinery: a simple [P1], [P2] reference system and direct context transfer take the place of complex metadata structures and other retrieval mechanisms. The appropriateness of each proposed approach was continuously questioned.

Structuring for LLM maintainability#

Modularisation followed a clear rule: no file over 1000 lines. The largest file (main application) comprises 800 lines, followed by the Word exporter with 400 lines and the LLM interface with 300 lines. This limit was based on practical experience that LLMs cannot yet reliably handle greater complexity.
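A rule like "no file over 1000 lines" is easy to enforce mechanically. The following helper is a hypothetical sketch (not part of the tool itself) showing how such a check could run over a project directory:

```python
# Sketch of a size-limit check for a "no file over 1000 lines" rule
# (hypothetical helper; not part of the described tool).
from pathlib import Path

MAX_LINES = 1000

def oversized_files(root: str) -> list[tuple[str, int]]:
    """Return (path, line_count) for Python files exceeding MAX_LINES."""
    result = []
    for path in Path(root).rglob("*.py"):
        n = sum(1 for _ in path.open(encoding="utf-8"))
        if n > MAX_LINES:
            result.append((str(path), n))
    return result
```

Run at the repository root, this could serve as a pre-commit or CI gate, keeping each module small enough for an LLM to edit reliably in one pass.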

The division was based on clear areas of responsibility: extractors, optimisers, LLM integration and UI as separate modules. Each area remained editable individually.

Opportunities and surprising strengths#

Greater complexity had to be broken down into manageable parts in this project. LLMs are not yet capable of performing this decomposition themselves. The consistency of the responses proved to be very good, even with higher token counts. The tool not only works reliably, but also quickly. Initial tests by various users provided good feedback, and productive use is planned after the trial phase.

Potentially transferable patterns#

Several aspects could be relevant for other projects: the integration of Tiktoken for token counting and visualisation, modularisation according to task areas with clear size limits, the focus on specification quality prior to implementation, and active control against over-engineering.

One insight for future projects: investing time in specification did not prove to be a delay, but rather an acceleration of overall development.


This is part of a series of documentation on methodological insights from LLM-supported development projects. The focus is on potentially transferable patterns and limitations, not on the tools developed themselves.