Workshop report#
The workshop#
At the end of February 2025, the ‘AI in Practice’ workshop took place at TU Darmstadt post on LinkedIn.
As part of this event organised by the ZKI Working Group on Strategy and Organisation, there was also a discussion on whether the existing brainstorming formats for collecting ideas and opinions in the form of analogue or digital note collections could be further developed. In addition, there was a clear need to compile and map existing AI activities, in particular the services already offered in the field of generative AI.
On the way back, a clickable prototype was implemented using an LLM in about two hours. Start prototype in browser
The question, how much time would be needed to complete the implementation using AI felt interesting as well as the question, whether this complexity of application could already be addressed by LLM coding.
This workshop report aims to shed light on the answer to this question.
First things first: you are visiting this website and the project was successfully implemented. The application ultimately works as intended. However, this required more work than initially hoped for. Without LLMs, implementation would not have been possible under any circumstances.
Figures on the results:#
| File extension | Number of files | Number of lines |
|---|---|---|
| Python | 29 | 8,542 |
| HTML | 15 | 6,249 |
| CSS | 4 | 1,660 |
| Javascript | 4 | 716 |
| Configuration files | 9 | 365 |
| TOTAL | 61 | 17,532 |
Information on time spent:#
| Task | Time spent [h] | |—————————————— —————|—————–| | Coordination of requirements | 1 | | Discussion of technical architecture | 1 | | Test implementation of overall technical approach | 1 | | Database schemas | 1 | | OpenAPI - CRUD | 2 | | Crawler and website interpretation | 2 | | Template system and static site generator | 2 | | LLM client | 2 | | Development of design concept and UI | 1 | | Implementation and troubleshooting of UI | 2 | | Prompts and profile derivation | 2 | | Admin editing area, workflows | 2 | | Overview page and keyword cloud for projects | 2 | | Discussion on security aspects of deployment | 1 | | CORS, CSRF | 1 | | Deployment 2 nginx configurations, database, Uvicorn | 3 | | Docker, Docker compose, API routing and troubleshooting | 2 | | Certbot integration | 2 | | Network configuration, ports, domain filters, rate limits | 1 | | Planning and implementation of documentation solution | 1 | | TOTAL | 32 |
The LLMs estimated that a traditional agency would need 180-200 hours to implement the solution.
What were the biggest hurdles?#
The project is too extensive for today’s large language models (LLMs), as only a small portion of the entire code can be processed at a time. The developer must keep the content context, the file-level context, the interaction of the technologies used, and the basic structure of the application completely in mind. This also means that such developments are not possible without at least a basic knowledge of current technologies and their interrelationships.
Not only do LLMs often generate faulty code, but, to put it in human terms, they cheat, try to avoid difficult tasks, take shortcuts that can never work, or simply fake individual parts of the code so that it appears to be the correct solution at first glance. These shortcuts are not always immediately apparent when the code becomes more complex, so every output from LLMs must always be checked carefully.
LLMs often get sidetracked by false leads or solutions that are either too complex for them or cannot be completed by them into a workable overall solution. In these cases, the entire partial approach must be unwound in order to implement a different approach.
LLMs tend to implement trick solutions for tasks that are too difficult, e.g. with mock data, which can make the overall system appear to be executable in the short term.
In some cases, LLMs choose technologies to solve problems that are unnecessarily complex, unsuitable for the specific problem, or simply outdated. If this unfavourable choice of technology for problem solving is not recognised, the programming project will not be successful.
Deploying the application on servers represents a significant additional effort, for which LLMs can only provide limited support.
Which LLMs were used for the project:#
Anthropic Claude Sonnet 3.7 and 4.0 OpenAi GPT-4o and GPT-4o mini Mistral Large2 GLM-4-0414 Model Series Cohere Command-R