Features¶

Code Analyzer covers the path from source code capture through structured analysis to report generation in a single coherent tool. The functional scope is divided into two applications with clearly separated responsibilities: the LLM-based file analysis (scanner) and the aggregating evaluation dashboard with report generator (analyzer). Both are operated through a web interface.

Use cases¶

Documentation of an existing software project — An existing Java, PHP, or Python project is to be documented in full. The application captures all source files, describes the business purpose and capabilities of each class, and provides a consolidated overview by domain.
Architectural review — For a project that has not yet been documented systematically, the architectural style is determined, the layer distribution is presented, and the module or package structure is assessed. Anomalies such as redundant packages or missing layers are flagged.
Security and quality triage — Security vulnerabilities, outdated practices, code smells, and performance topics are detected per file, assigned a severity, and aggregated by category. Findings can be prioritised with concrete recommendations per issue.
API and interface inventory — REST endpoints and SOAP services are detected automatically, catalogued by path, method, and handler, and visualised as a tree structure. Inconsistencies in the API design become visible as a result.
Modernisation assessment — From the detected frameworks, patterns, and issues, a prioritised modernisation roadmap is derived that addresses updates, cloud migration, and API modernisation.
Report for the management level — A complete Markdown report is generated from the aggregated data via LLM, covering management summary, project metrics, architectural assessment, business functions, API landscape, quality risks, technology stack, prioritised recommendations, and a conclusion.

At a glance¶

Source code capture and LLM analysis for Java, PHP, and Python in one interface.
Four specialised LLM prompts per file (business, technical, interfaces, issues).
Seven-step deep analysis plus a stand-alone report generator at the system level.
Evaluation dashboard with thematic tabs for overview, business, interfaces, issues, architecture, and LLM analysis.
Connection to OpenAI-compatible LLM endpoints (Ollama, OpenAI API, vLLM, LM Studio).
Validation and re-analysis tools to safeguard completeness.
CSV export of capabilities and issues; Markdown export of the report.

Code capture and file analysis¶

The scanner traverses the project directory recursively and filters source files by language-specific patterns. Test files are excluded by named patterns. Each file is processed by four independent LLM calls, each with its own role and a structured JSON response format. The per-file analysis covers:

Business purpose and business capabilities, mapped to concrete methods.
Detected frameworks, design patterns, and internal as well as external dependencies.
REST endpoints and SOAP services based on language-specific annotations and decorators.
Issues with category (security, outdated, code smell, performance, best practices), severity, and concrete recommendation.

A static structural analysis precedes the LLM analysis: package and namespace trees are built, build modules are detected (Maven, Gradle, Composer, pip, Poetry, setup.py), and layers are assigned by named keywords (presentation, business, persistence, util, model, config). The architectural style is derived from this assignment, for example "Layered/MVC", "Hexagonal/Ports-and-Adapters", or "Simple MVC".

Evaluation and aggregation¶

The evaluation dashboard reads the YAML inventory and presents the results in the following areas:

Overview with a functional summary by domain.
Business with a capability table and a domain visualisation including critical functions.
Interfaces with an API catalogue and a hierarchical tree representation of the endpoints.
Issues with a searchable browser and an aggregation by category.
Architecture with a package overview, a module view, detection of similar package names (consolidation candidates), and rule-based observations.
LLM analysis with endpoint configuration, seven-step deep analysis, UI pattern assessment, and functionality gap analysis.
Report with a complete analysis report in nine sections (from management summary to conclusion), saveable as Markdown.

Connectors and data sources¶

Code Analyzer uses two external connections.

Source code on the file system — Java, PHP, and Python projects are read directly from a local directory. Language-specific include and exclude patterns control which files are taken into account. Test files are skipped via named patterns.
OpenAI-compatible LLM endpoint — Both file analysis and system-wide evaluation rely on an LLM endpoint accessed via HTTP. The OpenAI API itself is supported, as are compatible servers such as Ollama, vLLM, and LM Studio. Endpoint, model name, and API key are configured via environment variables or through the UI.

Import and export formats¶

Source code import — Direct file access to .java, .php, and .py. Test files are skipped via named patterns.
YAML as the intermediate format — One YAML file per source file is stored under analysis/files/<package>/<ClassName>.yaml, complemented by project_structure.yaml and summary.yaml with project metrics, layer distribution, modules, and issue statistics.
Markdown export — The complete analysis report is saved under analysis_exports/analyse_report_<timestamp>.md.
CSV export — Capability and issue lists can be exported as CSV from the dashboard.

Quality assurance features¶

Separation of structural analysis and LLM analysis — Layer and module detection is rule-based and runs before any LLM evaluation. This provides an objective reference alongside the interpretative LLM results.
Enforced JSON response format — Each LLM call uses a low temperature and a strictly structured JSON schema, which safeguards reproducibility and machine readability.
Multi-stage processing — File analysis (four prompts), system-wide deep analysis (seven prompts), and report generation are separated. Intermediate results can be inspected at each stage.
Validation tab — Checks the YAML inventory for missing or incomplete files and reports discrepancies between summary and file inventory.
Re-analysis of missing files — A CLI tool identifies missing YAMLs based on summary.yaml and re-runs the analysis in a targeted manner.
Diagnostic tool — A second CLI tool analyses the causes of missing or faulty YAMLs.
Persistent intermediate storage — All LLM responses are persisted as YAML and can therefore be tracked and reviewed via version control.

Further features¶

Connection test — The "LLM analysis > Connection" tab verifies the reachability of the LLM endpoint and lists the available models.
Consolidation hints — The architecture tab detects similar package names that are candidates for consolidation.
API hierarchy visualisation — REST and SOAP endpoints are presented as a tree structure.
UI pattern analysis and functionality gap analysis — Two additional LLM-based evaluations identify dominant API patterns as well as functional gaps and redundancies.