Quick overview: A tool for asking direct questions to PDFs and other files#
A tool that loads multiple documents simultaneously and answers questions about them directly. It runs locally on your own computer, with no external servers.
The interesting thing about it: Not the tool itself, but what worked during development with AI support – and what didn’t.
What was the goal?#
To test how well AI models (large language models) can work with many documents at the same time.
We wanted to find out: Do the answers remain accurate when you enter a lot of information at once? How many documents can be processed simultaneously?
What can the tool do?#
- Process multiple file formats (PDF, Word, Excel, PowerPoint, text and more)
- Load up to 10 documents at once – without complex database technology
- Ask questions in natural language (e.g. ‘What do the contracts say about notice periods?’)
- Automatically create source references – with simple markings such as [P1], [P2] for the origin of the information
- Work locally – all data remains on your own computer
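To make the [P1], [P2] reference system concrete, here is a minimal sketch of how loaded documents could be combined into a single prompt. The names (`Document`, `build_prompt`) and the 10-document limit are illustrative assumptions, not the tool's actual API.

```python
# Hypothetical sketch: combine several loaded documents into one prompt,
# tagging each with a simple [P1], [P2], ... source marker.
# Names and the document limit are illustrative, not the tool's real API.

from dataclasses import dataclass

@dataclass
class Document:
    name: str   # file name, e.g. "contract.pdf"
    text: str   # extracted plain text

def build_prompt(docs: list[Document], question: str, limit: int = 10) -> str:
    """Concatenate up to `limit` documents, each prefixed with a [Pn]
    marker, and append the user's question."""
    if len(docs) > limit:
        raise ValueError(f"At most {limit} documents are supported")
    parts = [f"[P{i}] {doc.name}\n{doc.text}"
             for i, doc in enumerate(docs, start=1)]
    parts.append(f"Question: {question}\n"
                 "Answer with source markers like [P1] for each claim.")
    return "\n\n".join(parts)
```

Because the markers are plain text, the model can simply repeat them in its answer, which keeps the whole mechanism free of extra database machinery.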
How was it developed?#
The development process proceeded in clear steps:
1. Create the specification (90 minutes): a detailed description of all functions, developed in dialogue with the AI
2. Generate the code (60 minutes): the AI created the program code from that specification
3. Documentation and deployment (the remaining time): write instructions and make the tool usable
Total effort: 3 days spread over 3 iterations
Result: 3,100 lines of code in 16 files
Most modules worked on the first run. Initial tests by various users were positive.
Why did it go so well?#
Because we invested a lot of time in creating a clear specification.
A good specification not only describes ‘What should the tool be able to do?’, but also ‘How do the parts fit together?’ and ‘Which technical decisions make sense and why?’.
This clarity before programming prevented many errors later on.
Important insights#
1. Specification before code saves time#
Time spent on a good specification is not a delay – it speeds up development.
The clearer the requirements are formulated, the less needs to be corrected later. In this project, the quality of the specification directly resulted in fewer errors.
2. AI models tend to be complicated#
AI models often suggest complex solutions because those are common in their training data. That does not make them the best choice.
We had to actively counteract this: instead of complex metadata structures, we chose a simple reference system with [P1], [P2]. Instead of complex database technology, we work with direct processing.
The question ‘Do we really need this?’ was a constant companion.
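As an illustration of how little code the simple approach needs, here is a hypothetical sketch of the reverse direction: extracting the [Pn] markers from a model's answer and mapping them back to file names. The function name and signature are assumptions for this example.

```python
# Hypothetical sketch of the simple reference system: find [Pn] markers
# in a model's answer and map each one back to its source file.
import re

def extract_references(answer: str, filenames: list[str]) -> dict[str, str]:
    """Map each [Pn] marker found in the answer to its source file name."""
    refs = {}
    for match in re.finditer(r"\[P(\d+)\]", answer):
        n = int(match.group(1))
        if 1 <= n <= len(filenames):          # ignore out-of-range markers
            refs[f"P{n}"] = filenames[n - 1]
    return refs

# Example:
# extract_references("The notice period is 3 months [P2].",
#                    ["offer.pdf", "contract.pdf"])
# → {"P2": "contract.pdf"}
```

A metadata-heavy alternative would need IDs, chunk tables and lookups; a regular expression over plain-text markers does the same job here.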
3. Small modules keep code maintainable#
We followed a clear rule: no file should exceed 1,000 lines of code.
The largest file (the main program) has 800 lines, followed by the Word export with 400 lines and the AI connection with 300 lines.
Why this limit? AI models cannot yet reliably keep track of large amounts of code. Smaller modules can be edited and understood individually.
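A rule like this is easy to enforce automatically. The following is a minimal sketch, assuming a Python project; the function name and the `.py` file filter are illustrative.

```python
# Minimal sketch: enforce the 1,000-line module limit by scanning a
# project directory and reporting any Python file that exceeds it.
# The threshold and the .py filter are illustrative assumptions.
from pathlib import Path

def oversized_modules(root: str, max_lines: int = 1000) -> list[tuple[str, int]]:
    """Return (path, line_count) for every .py file longer than max_lines."""
    result = []
    for path in sorted(Path(root).rglob("*.py")):
        with path.open(encoding="utf-8") as f:
            count = sum(1 for _ in f)
        if count > max_lines:
            result.append((str(path), count))
    return result
```

Run as part of a pre-commit check, this turns the stylistic rule into a hard constraint rather than a good intention.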
What can others learn from this?#
- Invest time in a detailed specification before the code is written
- Actively question complicated solutions proposed by the AI (simple is often better)
- Divide code into modules of less than 1,000 lines (this helps with maintenance)
- Use token counting for visualisation (so you can see how much of the model's context window is left)
- Test early with real users (not just at the end)
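The token-counting idea from the list above can be sketched as follows. Exact counts depend on the model's tokenizer; this example uses the common rule of thumb of roughly four characters per token, which is only an estimate, and the context limit shown is an assumed value.

```python
# Rough sketch of "token counting for visualisation". The ~4 characters
# per token heuristic and the 128,000-token limit are assumptions; a real
# tool would use the model's own tokenizer and limit.

def context_bar(text: str, context_limit: int = 128_000, width: int = 20) -> str:
    """Render a simple text progress bar showing estimated context usage."""
    est_tokens = len(text) // 4                   # heuristic: ~4 chars/token
    used = min(est_tokens / context_limit, 1.0)   # fraction of window used
    filled = round(used * width)
    bar = "#" * filled + "-" * (width - filled)
    return f"[{bar}] ~{est_tokens:,} / {context_limit:,} tokens"
```

Even this crude estimate is enough to warn users before the loaded documents overflow the model's context window.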
Conclusion#
✔ Good preparation beats quick programming
✔ Simple solutions work better with AI support than complex ones
✔ Clear structure makes code maintainable and extensible
This is part of a series on experiences with AI-supported software development. The focus is on what can be learned from such projects – not just on the results.