Features¶

Plone-Migration covers the full workflow from importing a Plone export to providing editorially usable results. The feature set is organised into data ingestion and structuring, export to several target formats, and an LLM-supported analysis and revision path.

Use cases¶

Preparing a CMS migration. Plone content is read in, ordered hierarchically, and provided as format-neutral Markdown so it can be reused in downstream import processes.
Structural analysis of large web sites. Page hierarchy, content types, and volume figures are surfaced from the export to identify consolidation and restructuring needs.
Restructuring of website content. Based on the LLM analysis, suggestions for an alternative, more ergonomic page structure are generated — including a rationale for the structural decisions.
Textual revision of selected pages. Existing content (such as help or service pages) is revised on the basis of the preceding structural analysis; existing links are preserved.
Preparation for stakeholder reviews. Migration and analysis results are exported as Word or Markdown documents and can be processed further in editorial or quality-assurance workflows without additional tooling.

At a glance¶

Input: Plone JSON export up to 100 MB
Output: single Markdown file, ZIP archive with per-page files, Word document, Word ZIP
External connection: one OpenAI-compatible LLM interface for analysis and text revision
Two-stage LLM workflow: content analysis followed by analysis-driven revision suggestions
Sample data loadable directly from the interface for onboarding and testing
Live preview of all LLM results as rendered Markdown
Statistical overview of content types and hierarchy levels after each import

Data ingestion and structuring¶

The application processes Plone exports in JSON format as the central input. During import, every page is captured with its metadata (UID, title, description, content type, language, creation and modification dates) and mapped to a uniform internal representation. The full parent-child hierarchy is reconstructed on this basis.

JSON import: Reads Plone JSON exports and validates the structure. Embedded HTML content from the text and answer fields is processed; Base64-encoded images are replaced by placeholders to keep Markdown output readable.
Hierarchy reconstruction: Restores the page hierarchy via Plone UIDs from parent references and detects root pages and references to missing parent nodes.
Content type detection: Identifies standard Plone types such as Document and Folder as well as further types contained in the export, without imposing semantic restrictions on them.
Statistical overview: Lists the number of pages per content type, the number of root pages, and the languages present in the export.

Connectors¶

The application currently provides one external connector and a file-based input/output channel:

OpenAI-compatible LLM interface (external). All analysis and revision steps run against an OpenAI-compatible Chat Completions API. Endpoint, model name, timeout, and retry count are configurable via environment variables; switching providers does not require code changes.
File import (input). Plone exports are uploaded as JSON files via the web interface; a direct connection to a running Plone system is not provided.
File export (output). All results — structure overview, prepared content, and LLM outputs — are made available as downloadable files.

Import and export formats¶

Input — Plone JSON. The Plone-generated JSON format is expected, with fields such as UID, @type, title, description, text.data, parent.UID, created, modified, and language. Files up to 100 MB are accepted without preprocessing.
Output — single Markdown file. A complete export of all pages in one context-rich Markdown file with structure overview, metadata, and HTML-to-Markdown converted content.
Output — ZIP archive (Markdown). One Markdown file per page in a directory structure mirroring the Plone hierarchy. A _structure.md file holds overviews and statistics.
Output — Word document. Optional export of the overall content or individual LLM results as a .docx file with metadata (title, author, category).
Output — ZIP archive (Word). Word variant of the hierarchical ZIP export.
Output — LLM result files. Content analysis and revision suggestions can be downloaded separately as time-stamped Markdown or Word files.

LLM-supported analysis and revision¶

The AI component operates in two stages. In the first step, an LLM analyses the prepared content, identifies positive and negative properties of the existing structure, and produces a justified proposal for an alternative outline. In the second step, the LLM uses this analysis as context and generates concrete revision suggestions for the individual pages, with existing links preserved.

Content analysis: Structural overview, qualitative assessment of existing content, and a justified proposal for an alternative page structure.
Text revision: Editorially usable Markdown suggestions generated on the basis of the analysis, with recognisable per-page mapping.
Configurable prompts: The system and user prompts used for analysis and revision are stored centrally in the code and can be adjusted.

Quality assurance¶

Multi-stage processing. Structural analysis and text revision are separated; the analysis result can be inspected before it feeds into the revision as context.
Traceability. Each stage — structure overview, statistics, analysis, revision — can be exported individually and is time-stamped.
Robust API connection. Retries with exponential backoff for transient errors (rate limit, timeout, connection failure); configurable timeouts.
Content length control. Very long inputs are truncated with an explicit notice rather than being cut off silently.
Path and filename sanitisation. Umlauts, special characters, and overlong paths are normalised for cross-platform processing; duplicate paths are disambiguated automatically.
Graceful degradation. Optional features (Word export) remain disabled when their dependencies are missing, without disrupting the main path.

Operation and preview¶

Web-based interface. Operated through a browser-based UI; no local installation on end-user devices required.
Live Markdown preview. LLM outputs are rendered directly.
Progress indicator. Longer processing steps are accompanied by a status indicator.
Sample data. A bundled Plone sample export can be loaded without providing own data, in order to walk through the workflow.