LLM-supported rapid prototyping: Results from the development of a content migration tool
Introduction
This article documents observations and results from an experimental project to develop a web-based tool for content migration. The project arose in the context of an upcoming migration from Plone to TYPO3 and served primarily to explore the feasibility of an idea: Can a functional demonstrator be developed that processes JSON exports from Plone, improves them through LLM analysis, and exports them in various formats – and how long does it take?
The project is part of a series of LLM-supported development experiments designed to provide methodological insights for similar projects.
Experimental context
Initial situation and motivation
The development of the tool arose from a discussion about an upcoming content migration in the course of switching from Plone to TYPO3 as a content management system. This raised the question of whether and how this process could be supported by tools. Instead of directly developing a productive solution, an exploratory approach was deliberately chosen: a rapid prototype was to show whether the basic idea would work.
The primary learning objective was to test a specific development approach: Can LLM support be used to quickly develop a functional tool that fulfils a clear task? How does this approach work in practice? What challenges arise?
Expectations and intention
From the outset, it was clear that this was a demonstrator. The attitude was: ‘We wanted to see how it works and test it. Then we’ll see.’ This open-ended approach, with no fixed intention of production use, proved advantageous: it enabled focused exploration without the pressure of immediately covering every production scenario.
Technical implementation
Architectural decisions
The chosen technical architecture was deliberately based on familiar technologies:
Core components:
- Python as the base language
- Gradio for the web interface
- OpenAI-compatible API for LLM integration
- Docker for deployment
This choice was motivated by pragmatism: previous experience with these technologies enabled faster implementation. An important early decision concerned the import format. Consideration of which format was most suitable quickly led to the choice of JSON. This had a direct influence on the tool architecture: instead of processing individual pages, the tool processes entire sub-websites as structured JSON files.
Modular structure
From the outset, the application was structured into three main modules:
main.py (~800 lines): Contains the Gradio-based user interface and orchestrates the overall process. The module handles file uploads, JSON parsing, hierarchy recognition of Plone content and coordinates the export functions.
llm_analyzer.py (~280 lines): Encapsulates the LLM integration. The module implements OpenAI-compatible API communication, retry logic with exponential backoff, timeout management, and two-stage prompt processing.
word_exporter.py (~850 lines): Implements Word export with extensive Markdown-to-Word conversion. The module was added as an extension and implements graceful degradation if the necessary libraries are not available.
This clear structure was planned from the outset and proved to be practical. The separation allows for independent further development of the individual components and facilitates understanding of the overall system.
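The retry logic with exponential backoff mentioned for llm_analyzer.py can be sketched roughly as follows; the function name and parameters here are illustrative, not the tool's actual API:

```python
import random
import time

def call_with_backoff(fn, max_retries=4, base_delay=1.0, retryable=(TimeoutError,)):
    """Call fn(), retrying on transient errors with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except retryable:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            # Wait 1s, 2s, 4s, ... plus a little jitter to spread out retries.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```

Wrapping each API call like this lets timeouts and rate-limit hiccups resolve themselves without aborting a long-running analysis.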
JSON handling as a learning field
A central technical learning field was the handling of large, structured JSON imports. The sub-websites vary considerably in size: some are only 200 KB, while others reach 10 MB or more. A concrete example is a sub-website on the topic of Wi-Fi with the following characteristics:
- 85 individual pages
- 7 different content types
- Breakdown: 20 documents, 4 FAQ pages, 27 FAQ items, 9 files, 18 folders, 6 images, 1 link
- File size: approx. 12 MB as JSON
An LLM with a context length of 256,000 tokens was used to process such large amounts of data. This large context size was necessary in order to be able to analyse entire sub-websites in a single pass.
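How such an export might be traversed to produce a content-type breakdown like the one above can be sketched as follows; the `@type` and `items` keys are assumptions modelled on plone.restapi-style exports, not a confirmed schema of the tool's input:

```python
import json
from collections import Counter

def count_content_types(node, counter=None):
    """Recursively tally content types in a nested Plone-style JSON export.

    Assumes each node carries an "@type" field and child pages under "items";
    adjust the keys for other export schemas.
    """
    if counter is None:
        counter = Counter()
    if "@type" in node:
        counter[node["@type"]] += 1
    for child in node.get("items", []):
        count_content_types(child, counter)
    return counter

# Tiny stand-in for a sub-website export:
export = json.loads("""
{"@type": "Folder", "items": [
    {"@type": "Document", "items": []},
    {"@type": "Document", "items": []},
    {"@type": "Image", "items": []}
]}
""")
print(count_content_types(export))
```

A pass like this is also a cheap sanity check before handing a 12 MB file to the LLM.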
Two-stage LLM processing
The implementation uses a two-stage approach for content analysis:
Phase 1 – Structural analysis: The LLM analyses the entire sub-website and develops a concept for an improved structure. This analysis takes into account hierarchies, content types, redundancies and structural weaknesses.
Phase 2 – Content transformation: Based on the concept developed in phase 1, the existing content is restructured and improved.
This separation proved useful: instead of transforming content directly, an explicit concept is first created and then applied. All processing takes place exclusively in the application's session memory.
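A minimal sketch of this two-stage flow, assuming the LLM is reachable through an injected `call_llm` callable (the prompts and names are illustrative, not the tool's actual prompts):

```python
def analyze_and_transform(pages, call_llm):
    """Two-stage processing: derive a structural concept first, then apply it.

    `call_llm` is any callable prompt -> str (e.g. a thin wrapper around an
    OpenAI-compatible chat endpoint); injecting it keeps the pipeline
    testable without network access.
    """
    # Phase 1: structural analysis over the whole sub-website.
    concept = call_llm(
        "Analyse the following pages and propose an improved structure "
        "(hierarchy, content types, redundancies):\n" + "\n".join(pages)
    )
    # Phase 2: transform each page against the explicit concept.
    return [
        call_llm(f"Concept:\n{concept}\n\nRewrite this page accordingly:\n{page}")
        for page in pages
    ]
```

Making the concept an explicit intermediate value also means it can be shown to users before any content is rewritten.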
Development process with LLM
Specification phase
Development began with a detailed specification, which took about 20 minutes. This specification covered both functional and technical requirements and was formulated and handed over in one piece – not developed iteratively.
Key content of the specification:
- Clear description of the data structures to be processed (Plone JSON schema)
- Functional requirements (JSON import, LLM analysis, export in various formats)
- Technical framework conditions (Gradio-based UI, API integration, Docker deployment)
- Non-functional requirements (error handling, performance considerations for large files)
An important aspect was explicit consideration of the KISS principle as early as the specification phase. A clear focus on core functionality kept the LLM from generating overly complex or over-engineered solutions.
Iterative implementation
The actual implementation took about an hour and comprised three iterations:
Iteration 1 – Initial code implementation: Based on the specification, the LLM generated the basic structure of the three modules. This included JSON parsing, Gradio interface, LLM integration and basic export functions in Markdown.
Iteration 2 – Export extensions: In this phase, additional export formats were added (ZIP archives with individual files, Word export). This presented several challenges, particularly in handling ZIP archives and managing the volume of files: a sub-website with 85 pages generates a correspondingly large number of individual files, which had to be accounted for both in path management and with regard to Windows path-length restrictions.
Iteration 3 – UI refinements: The final iteration focused on minor adjustments to the user interface – primarily layout optimisations and improvements to the progress indicators for longer-running LLM operations.
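The graceful degradation described for the Word export typically follows a try/except import pattern; a minimal sketch, assuming python-docx as the optional dependency (the function here is illustrative, and the actual word_exporter.py is far more extensive):

```python
# Optional dependency: Word export degrades gracefully when python-docx
# (imported as "docx") is missing, instead of crashing the whole tool.
try:
    import docx  # provided by the python-docx package
    DOCX_AVAILABLE = True
except ImportError:
    DOCX_AVAILABLE = False

def export_word(text, path):
    """Write a Word file if possible; report the limitation otherwise."""
    if not DOCX_AVAILABLE:
        return "Word export unavailable: install python-docx to enable it."
    doc = docx.Document()
    doc.add_paragraph(text)
    doc.save(path)
    return f"Written to {path}"
```

The tool stays usable for Markdown and ZIP export even on installations where the Word libraries are absent.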
LLM used
Content analysis: Qwen/Qwen3-30B-A3B-Instruct is used for LLM-supported content analysis in the finished tool. This model was chosen because it works well with German texts and offers sufficient context length.
Documentation
The README documentation was not written manually but generated automatically from the specification and the generated code, within the one-hour implementation time stated above. This approach ensured consistency between code, specification and documentation.
Methodological findings
Confirmation of an established workflow
A key observation from this experiment was the confirmation of a workflow already developed in previous projects. The combination of detailed, single-piece specifications and close coordination of the technical implementation led to a fast and reliable result. This approach worked well in this specific case – further projects will show whether it can be transferred to other contexts.
Importance of detailed specifications
The specifications were deliberately formulated in a comprehensive and detailed manner. Both dimensions were explicitly elaborated:
Functional level: What should the tool do? What data does it process? What outputs does it generate?
Technical level: What technologies are used? How is the architecture structured? Which libraries are used?
In a subsequent project, this approach was further developed by placing even greater emphasis on technical aspects and outsourcing them to a separate technical specification.
Value of previous experience
The use of familiar technologies significantly accelerated development. The specification could be formulated more precisely because concrete assumptions could be made about the capabilities and limitations of the frameworks involved. This reduced the need for corrections during implementation.
This observation suggests that LLM-supported rapid prototyping may be particularly efficient when it builds on an existing technological foundation. Whether the approach works equally well when exploring completely new technology stacks would be an interesting question for further experimentation.
Session-based processing
The decision to keep all data only in the application session, without persistent storage, proved advantageous. From a data protection perspective, this greatly simplifies the handling of sensitive content. As soon as the session ends, all processed data is automatically discarded: no database, no temporary files on the server, no remnants remain.
This aspect could be relevant for similar projects that process sensitive research data, survey results or unpublished materials. The technical implementation with Gradio supports this approach well, as the framework already provides suitable mechanisms for session management and temporary file handling.
Dealing with challenges
Not all aspects of the implementation went smoothly. Three areas in particular required attention:
ZIP archives: Creating ZIP archives with a large number of files (85 pages result in at least as many individual Markdown files) presented challenges. In particular, Windows path length restrictions had to be taken into account.
Performance with large files: Sub-websites with 10+ MB of JSON data result in longer processing times by the LLM. This had to be communicated through appropriate progress indicators in the UI to provide feedback to users.
Export formats: Word export (alongside Markdown and ZIP) was added later, in iteration 2. Implementing a robust Markdown-to-Word converter proved more extensive than initially assumed, which is reflected in the 850 lines of word_exporter.py.
These challenges led to the additional iterations mentioned above, but also show that the LLM was able to deal well with specific problems once they had been identified and communicated.
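The ZIP path-length concern above can be handled with a simple truncation guard when writing archive entries; a minimal sketch, not the tool's actual path-management logic (the 200-character limit is a conservative assumption below the classic 260-character Windows cap):

```python
import io
import zipfile

MAX_PATH = 200  # conservative margin below the classic 260-character Windows limit

def shorten(path, limit=MAX_PATH):
    """Truncate the file stem so the archive entry stays under `limit` chars."""
    if len(path) <= limit:
        return path
    head, dot, ext = path.rpartition(".")
    stem, suffix = (head, dot + ext) if dot else (path, "")
    return stem[: limit - len(suffix)] + suffix

def build_zip(pages):
    """Pack {path: markdown} pages into an in-memory ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path, text in pages.items():
            zf.writestr(shorten(path), text)
    return buf.getvalue()
```

Building the archive in memory also fits the session-only processing model: nothing is written to the server's file system.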
Validation and use
Current evaluation status
The tool is currently in an active evaluation phase. Several teams are testing it with real Plone exports from different areas of the university. The following aspects are being evaluated:
- How well does the LLM analysis work with real, mature content structures?
- Are the generated structure suggestions helpful for the actual migration?
- What additional features would increase usability?
- What are the limitations of the current approach?
This evaluation serves to prepare for the actual migration and helps to address the question of how it can be specifically supported. The feedback collected will be incorporated into considerations for a possible next stage of development.
From prototype to evaluation
The fact that a one-hour rapid prototyping experiment resulted in a tool that is now being evaluated by various groups illustrates one of the potential strengths of this development approach: ideas can be quickly transformed into functional demonstrators, which can then serve as a basis for informed decisions.
Whether the tool will ultimately be used productively in any form is still open at this stage and depends on the evaluation results. The exploratory nature of the project allows for both scenarios – productive use or the realisation that a different approach is more suitable.
Transferability and limitations
Potentially transferable principles
Based on the observations from this project, some principles appear to be potentially transferable:
Detailed specification prior to implementation: The approach of first creating a comprehensive specification and then handing it over in one piece worked well in this case. This could be relevant for similarly structured projects.
Building on the familiar: The use of already familiar technologies enabled more precise specifications and faster implementation. This could be a relevant factor for projects with time constraints.
Session-based processing for sensitive data: Where data protection is an issue, the approach of only keeping data in the session could be useful.
Rapid prototyping as a basis for decision-making: Developing a functional demonstrator as a basis for further decisions proved to be feasible.
Limitations and open questions
It is important to emphasise that these observations come from a specific context. Various factors could limit transferability:
Project size: The tool comprises around 1900 lines of code. Whether the approach would also work for significantly larger projects is an open question.
Previous experience: The people involved already had experience with the technologies used. It is unclear whether people without this previous experience would achieve similar results.
Problem complexity: The task to be solved – JSON processing, LLM integration, file export – is structured and clearly defined. The approach might work differently for less structured or more exploratory problems.
LLM capabilities: The observations are based on current LLMs. Future developments could change both the possibilities and the limitations.
This article is part of a series documenting methodological findings from various LLM-based development projects. The focus is on reproducible observations and transferable principles for similar projects.