CodeDocumentation

CodeDocumentation generates a complete set of Markdown documents from source code repositories. The output is a first draft that is committed to the project repository and refined manually from there. The application processes Python and PHP projects supplied as a local directory, as a ZIP file, or through the GitLab API. It combines deterministic code analysis with two separate language model stages: a fast model that analyses files in parallel, and a thinking model that generates the documents as one coherent set.

At a glance

Generate a complete documentation set in a single run, including README, architecture, API reference, configuration and installation guide, and highlights
Provide the source code as a local directory, an uploaded ZIP archive, or directly from GitLab, without checking out the code locally
Cover multi-language projects (Python and PHP, individually or combined) in a single coherent document set
Capture endpoints and routes from FastAPI, Flask, Laravel, and Symfony codebases automatically, with method, path, parameters, and the associated handler
Connect to two separately configurable language model endpoints for analysis and generation, including endpoints with thinking mode
Preview generated documents in the interface as rendered Markdown and download them as a single ZIP archive
Trace each run via an accompanying generation report with runtime, model calls, and token usage
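
To make the endpoint capture above more concrete, here is a minimal sketch of how a route with method, path, path parameters, and handler could be read from Flask-style source text. The `Endpoint` record, the regular expressions, and the function name `extract_flask_routes` are illustrative assumptions, not the application's actual extractor code. The pattern only covers the simple case in which the route decorator sits directly above the handler definition.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical record for one captured route."""
    method: str
    path: str
    path_params: tuple[str, ...]
    handler: str


# Assumed pattern: a Flask-style @<name>.route(...) decorator that is
# immediately followed by the handler's "def" line.
ROUTE_RE = re.compile(
    r"@\w+\.route\(\s*['\"](?P<path>[^'\"]+)['\"]"
    r"(?:\s*,\s*methods\s*=\s*\[(?P<methods>[^\]]*)\])?\s*\)\s*"
    r"def\s+(?P<handler>\w+)"
)
# Path parameters look like <name> or <converter:name>.
PARAM_RE = re.compile(r"<(?:\w+:)?(\w+)>")


def extract_flask_routes(source: str) -> list[Endpoint]:
    """Return one Endpoint per (method, path) pair found in the source text."""
    endpoints = []
    for match in ROUTE_RE.finditer(source):
        raw_methods = match.group("methods")
        # Flask routes answer GET when no methods argument is given.
        methods = re.findall(r"\w+", raw_methods) if raw_methods else ["GET"]
        path = match.group("path")
        params = tuple(PARAM_RE.findall(path))
        for method in methods:
            endpoints.append(
                Endpoint(method.upper(), path, params, match.group("handler"))
            )
    return endpoints
```

A regex of this kind is a reasonable fallback for simple files, but it misses routes with stacked decorators or computed paths. A framework-specific extractor would need to handle those cases.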

Highlights

In contrast to a direct LLM prompt over a codebase, CodeDocumentation separates deterministic structural analysis, parallelised per-file evaluation, and narrative text generation into a multi-stage pipeline. This separation improves the accuracy of the captured structures, keeps runtime manageable for medium-sized projects, and allows different models to be selected for different tasks.
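
The separation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the application's code: the names `inspect_sources` and `run_pipeline`, the suffix table, and the two callables standing in for the fast and the thinking model are all hypothetical. The sources are passed as an in-memory mapping to keep the example self-contained.

```python
from __future__ import annotations

from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from pathlib import PurePosixPath
from typing import Callable

# Assumed suffix table; the real detection logic is not shown in this document.
LANGUAGES = {".py": "Python", ".php": "PHP"}


def inspect_sources(sources: dict[str, str]) -> Counter[str]:
    """Stage 1: deterministic inspection, no model involved."""
    suffixes = (PurePosixPath(name).suffix for name in sources)
    return Counter(LANGUAGES[s] for s in suffixes if s in LANGUAGES)


def run_pipeline(
    sources: dict[str, str],
    analyse_file: Callable[[str, str], str],    # stand-in for the fast model
    generate: Callable[[str, list[str]], str],  # stand-in for the thinking model
    max_workers: int = 4,
) -> str:
    languages = inspect_sources(sources)
    relevant = [n for n in sources if PurePosixPath(n).suffix in LANGUAGES]

    # Stage 2: per-file analysis in parallel; pool.map keeps the input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        summaries = list(
            pool.map(lambda name: analyse_file(name, sources[name]), relevant)
        )

    # Stage 3: a single generation call that receives the deterministic facts
    # together with the per-file summaries.
    facts = ", ".join(
        f"{lang}: {count} file(s)" for lang, count in sorted(languages.items())
    )
    return generate(facts, summaries)
```

Because the facts string is built before any model call, statements such as the detected languages stay the same regardless of what either model returns.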

Three-stage pipeline instead of a monolithic prompt — A deterministic inspection phase captures languages, frameworks, dependencies, and configurations without invoking a model. Only the subsequent phases call language models. As a result, basic factual statements do not depend on a model output.
Two separately configurable language models — A fast model processes file analyses in parallel; a slower model with thinking mode generates the prose sections of the documents. Both endpoints are configured independently. Splitting the work this way reduces overall runtime compared with running every step on the thinking model.
Framework-specific extractors — Dedicated extractors for FastAPI, Flask, Laravel, and Symfony produce structured endpoint lists including method, path, path parameters, and handler. Other Python and PHP projects are covered by a generic regex-based extractor.
Multi-language projects as the standard case — Codebases with mixed stacks such as Flask plus Laravel are handled in a single coherent document set. Languages and frameworks are weighted, and endpoints from multiple sources are merged.
Three input sources — Local directory, ZIP upload, and GitLab repository via REST API, including the choice of branch or tag.
Session-bound token handling — GitLab personal access tokens remain exclusively in the interface session state. They are not written to disk, not included in logs, and not embedded in the generated documents.
Endpoint deduplication and merging — Endpoints originating from deterministic extraction and from model analysis are deduplicated by method and path and grouped in a single API catalog.
Mermaid diagrams without manual rework — Architecture and component overviews are embedded directly as Mermaid code in the documents and rendered in the preview.
Extensibility via a documented interface — New frameworks are integrated by adding a new extractor class and a detection entry, without changes to the rest of the pipeline.
Traceable runs — Each generation run additionally produces a report with runtime, number of model calls per model role, and tokens consumed.
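
As an illustration of the endpoint deduplication and merging listed above, the following sketch merges two endpoint lists into one catalog keyed by method and path. The `Endpoint` record and the function `merge_endpoints` are hypothetical names. Two behaviours in the sketch are assumptions that this document does not confirm: the deterministic entry is kept when both sources report the same endpoint, and method case and trailing slashes are normalised before comparison.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical catalog entry."""
    method: str
    path: str
    handler: str
    source: str  # "extractor" or "model"


def merge_endpoints(
    deterministic: list[Endpoint], from_model: list[Endpoint]
) -> list[Endpoint]:
    """Deduplicate by (method, path) and return a single sorted catalog."""
    catalog: dict[tuple[str, str], Endpoint] = {}
    # Deterministic results come first, so setdefault keeps them on a clash
    # (an assumption about precedence, not documented behaviour).
    for endpoint in [*deterministic, *from_model]:
        # Assumed normalisation: upper-case method, no trailing slash.
        key = (endpoint.method.upper(), endpoint.path.rstrip("/") or "/")
        catalog.setdefault(key, endpoint)
    return sorted(catalog.values(), key=lambda e: (e.path, e.method))
```

For example, a `GET /users` entry from an extractor and a `get /users/` entry from the model collapse into one catalog row, while `POST /users` remains a separate row.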