CodeDocumentation

CodeDocumentation generates a complete set of Markdown documents from source code repositories. The output is a first draft that is committed to the project repository and refined manually from there. The application processes Python and PHP projects supplied as a local directory, as a ZIP file, or through the GitLab API. It combines deterministic code analysis with two separate language model stages: a fast model analyses files in parallel, and a thinking model then generates the documents as a coherent whole.

At a glance

  • Generate a complete documentation set in a single run, including README, architecture, API reference, configuration and installation guide, and highlights
  • Provide the source code as a local directory, an uploaded ZIP archive, or directly from GitLab, without checking out the code locally
  • Cover multi-language projects (Python and PHP, individually or combined) in a single coherent document set
  • Capture endpoints and routes from FastAPI, Flask, Laravel, and Symfony codebases automatically, with method, path, parameters, and the associated handler
  • Connect to two separately configurable language model endpoints for analysis and generation, including endpoints with thinking mode
  • Preview generated documents in the interface as rendered Markdown and download them as a single ZIP archive
  • Trace each run via an accompanying generation report with runtime, model calls, and token usage
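To make the endpoint capture mentioned above more concrete, the following is a minimal sketch of how method, path, path parameters, and handler could be read from Flask-style route decorators with a regular expression. The `Endpoint` dataclass, the `extract_flask_routes` function, and both patterns are hypothetical names chosen for illustration; they are not taken from CodeDocumentation, whose actual extractors may work differently.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical record for one captured route."""
    method: str
    path: str
    handler: str
    path_params: tuple[str, ...] = ()


# Assumption: a route is declared as `@<name>.route("<path>", methods=[...])`
# directly followed by the handler's `def`. Stacked decorators are not handled.
ROUTE_RE = re.compile(
    r"""@\w+\.route\(\s*["'](?P<path>[^"']+)["']"""
    r"""(?:\s*,\s*methods\s*=\s*\[(?P<methods>[^\]]*)\])?\s*\)\s*"""
    r"""(?:async\s+)?def\s+(?P<handler>\w+)\s*\("""
)

# Flask path parameters look like <name> or <converter:name>.
PARAM_RE = re.compile(r"<(?:\w+:)?(\w+)>")


def extract_flask_routes(source: str) -> list[Endpoint]:
    """Return one Endpoint per (method, path) pair found in the source text."""
    endpoints = []
    for match in ROUTE_RE.finditer(source):
        # Flask defaults to GET when no methods list is given.
        methods = re.findall(r"""["'](\w+)["']""", match.group("methods") or "") or ["GET"]
        params = tuple(PARAM_RE.findall(match.group("path")))
        for method in methods:
            endpoints.append(
                Endpoint(method.upper(), match.group("path"), match.group("handler"), params)
            )
    return endpoints
```

A regex-only approach like this is fast and needs no import of the analysed code, but it misses routes registered programmatically or behind additional decorators, which is one reason a tool might combine it with framework-specific extractors.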

Highlights

In contrast to a direct LLM prompt over a codebase, CodeDocumentation separates deterministic structural analysis, parallelised per-file evaluation, and narrative text generation into a multi-stage pipeline. This separation improves the accuracy of the captured structures, keeps runtime manageable for medium-sized projects, and allows different models to be selected for different tasks.
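The separation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the application's implementation: the function names (`inspect_project`, `analyse_files`, `generate`, `run_pipeline`), the `ModelFn` type, and the prompt wording are all hypothetical, and each model is represented by a plain callable that maps a prompt to text.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable

# Hypothetical stand-in for a model client: any callable from prompt to text.
ModelFn = Callable[[str], str]

LANGUAGE_BY_SUFFIX = {".py": "Python", ".php": "PHP"}


def inspect_project(files: dict[str, str]) -> dict[str, int]:
    """Stage 1 (deterministic): count files per language, no model call."""
    counts: dict[str, int] = {}
    for name in files:
        language = LANGUAGE_BY_SUFFIX.get(Path(name).suffix)
        if language:
            counts[language] = counts.get(language, 0) + 1
    return counts


def analyse_files(files: dict[str, str], fast_model: ModelFn,
                  max_workers: int = 4) -> dict[str, str]:
    """Stage 2: one fast-model call per file, executed in parallel."""
    names = list(files)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        summaries = pool.map(
            lambda name: fast_model(f"Summarise {name}:\n{files[name]}"), names
        )
        return dict(zip(names, summaries))


def generate(facts: dict[str, int], summaries: dict[str, str],
             thinking_model: ModelFn) -> str:
    """Stage 3: a single call to the slower model writes the prose."""
    prompt = f"Facts: {facts}\nFile summaries: {summaries}\nWrite the README."
    return thinking_model(prompt)


def run_pipeline(files: dict[str, str], fast_model: ModelFn,
                 thinking_model: ModelFn) -> str:
    facts = inspect_project(files)
    # Only files in a recognised language are sent to the fast model.
    sources = {n: t for n, t in files.items() if Path(n).suffix in LANGUAGE_BY_SUFFIX}
    summaries = analyse_files(sources, fast_model)
    return generate(facts, summaries, thinking_model)
```

Because the first stage never touches a model, the facts it produces stay reproducible across runs, and the two callables can point at entirely different endpoints.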

  • Three-stage pipeline instead of a monolithic prompt — A deterministic inspection phase captures languages, frameworks, dependencies, and configurations without invoking a model. Only the subsequent phases call language models. As a result, basic factual statements do not depend on a model output.
  • Two separately configurable language models — A fast model processes file analyses in parallel; a slower model with thinking mode generates the prose sections of the documents. Both endpoints are configured independently; their interaction reduces overall runtime considerably compared to a thinking-only run.
  • Framework-specific extractors — Dedicated extractors for FastAPI, Flask, Laravel, and Symfony produce structured endpoint lists including method, path, path parameters, and handler. Other Python and PHP projects are covered by a generic regex-based extractor.
  • Multi-language projects as the standard case — Codebases with mixed stacks such as Flask plus Laravel are handled in a single coherent document set. Languages and frameworks are weighted, and endpoints from multiple sources are merged.
  • Three input sources — Local directory, ZIP upload, and GitLab repository via REST API, including the choice of branch or tag.
  • Session-bound token handling — GitLab personal access tokens remain exclusively in the interface session state. They are not written to disk, not included in logs, and not embedded in the generated documents.
  • Endpoint deduplication and merging — Endpoints originating from deterministic extraction and from model analysis are deduplicated by method and path and grouped in a single API catalog.
  • Mermaid diagrams without manual rework — Architecture and component overviews are embedded directly as Mermaid code in the documents and rendered in the preview.
  • Extensibility via a documented interface — New frameworks are integrated by adding a new extractor class and a detection entry, without changes to the rest of the pipeline.
  • Traceable runs — Each generation run additionally produces a report with runtime, number of model calls per model role, and tokens consumed.
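The deduplication by method and path mentioned in the list above could look roughly like the sketch below. The `CatalogEndpoint` dataclass and `merge_endpoints` function are hypothetical. Two details are assumptions made for this example rather than documented behaviour: entries from the deterministic extraction win over model-derived entries when both describe the same route, and keys are normalised by upper-casing the method and stripping a trailing slash from the path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEndpoint:
    """Hypothetical catalog entry; `source` records where it came from."""
    method: str
    path: str
    handler: str
    source: str  # e.g. "extractor" or "model"


def merge_endpoints(extracted: list[CatalogEndpoint],
                    from_model: list[CatalogEndpoint]) -> list[CatalogEndpoint]:
    """Merge both lists into one catalog, keeping one entry per (method, path)."""
    catalog: dict[tuple[str, str], CatalogEndpoint] = {}
    # Extracted entries come first, so setdefault keeps them on a key clash
    # (assumption: deterministic results take precedence).
    for endpoint in [*extracted, *from_model]:
        key = (endpoint.method.upper(), endpoint.path.rstrip("/") or "/")
        catalog.setdefault(key, endpoint)
    # Sort for a stable, readable API catalog.
    return sorted(catalog.values(), key=lambda e: (e.path, e.method))
```

Keying on the normalised `(method, path)` pair means that `GET /users` from an extractor and `get /users/` from model analysis collapse into a single catalog entry, while `POST /users` stays separate.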