Recherche-Tool

Recherche-Tool is an application for autonomous, multi-stage research and analysis backed by large language models. A query is not handed to a language model in a single step but runs through an agentic pipeline of planning, search across multiple connectors, extraction, gap analysis, and synthesis. Beyond open web research, dedicated modes cover institution-internal queries, bibliographic work, and structured analyses.

At a glance

  • Produce research reports for open questions: all stages, from the query to a structured report with embedded source links, run automatically.
  • Run institution-internal research: people, organisational units, and web content of the university are evaluated jointly and cross-checked against the verified directory.
  • Validate bibliographies: existing reference lists are checked entry by entry against multiple academic databases, missing DOIs are added, and deviations are flagged.
  • Find literature for a research question: results are weighted by citations and screened for relevance.
  • Produce structured analyses: concept explainer, virtual peer review, decision analysis, research design, and literature review are available as dedicated modes.
  • Include local documents: uploaded PDF, Word, PowerPoint, and Excel files feed into the research as context.
  • Export reports: as Markdown directly in the browser, or as Word documents with clickable hyperlinks and an appendix.

Highlights

In contrast to a direct LLM prompt or a plain search-engine query, Recherche-Tool delivers a reproducible, multi-stage run with traceable sources. The following properties distinguish the application from simpler alternatives and have a direct effect on the quality of the resulting reports:

  • Agentic pipeline with separated phases: Format agent, research plan, iterative search-and-fetch loop, fact extraction, gap analysis, and synthesis are kept apart. Each phase has its own prompt schema, so errors in one stage do not propagate into the next.
  • Dual-LLM architecture: Complex tasks such as planning and synthesis run on a primary model; parallelisable fact extraction runs on a separately configurable harvest model. Both can be assigned independent endpoints, context sizes, and concurrency limits.
  • Coverage of many sources: SearXNG metasearch, generic web scraper, GitHub, GitLab, a Solr index, an Elasticsearch index, the institution-internal directory of people and units, WebDAV, local files, and academic literature APIs (arXiv, CrossRef, OpenAlex, Semantic Scholar, DBLP, OpenLibrary).
  • Multilingual search: Search terms are produced per question and per language; the pipeline picks topic-relevant languages and avoids cross-language duplicates.
  • Iterative gap analysis: After each research round, a dedicated LLM step checks which plan questions remain unanswered and schedules targeted follow-up queries or additional URLs for the next round.
  • Hybrid people search with embedder and reranker: For directory queries, the application combines full-text search, semantic vector similarity, and a cross-encoder reranker. A hard name filter complements these and prevents fuzzy-match errors.
  • Cross-checking of person-related claims: Web findings that assign a person to a unit of the university are cross-checked against the verified directory; deviations are marked as unverified in the report.
  • Contradiction detection: Inconsistent statements across the extracted facts are flagged in the report so they do not disappear into a smoothed-out synthesis.
  • Plan confirmation before research: The generated plan, including the search strategy, can optionally be inspected and approved before the costly search and extraction phase begins.
  • Reproducible research runs: Every run is persisted on disk together with the report, sources, extracts, plan, search strategy, and metadata, and can be revisited later.
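
The interplay of the search-and-fetch loop and the iterative gap analysis described above can be sketched roughly as follows. This is a minimal illustration, not the application's actual code: the names RunState, find_gaps, research, and fake_connector are hypothetical, and the gap check is reduced to "no facts collected yet", whereas the real step is performed by an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RunState:
    """State accumulated during one run (hypothetical structure)."""
    questions: list[str]                                        # plan questions
    facts: dict[str, list[str]] = field(default_factory=dict)   # question -> extracted facts
    rounds: int = 0                                              # completed research rounds


def find_gaps(state: RunState) -> list[str]:
    """Stand-in for the LLM gap analysis: questions that have no facts yet."""
    return [q for q in state.questions if not state.facts.get(q)]


def research(
    questions: list[str],
    search_and_extract: Callable[[str, int], list[str]],
    max_rounds: int = 3,
) -> RunState:
    """Run research rounds until all questions are covered or the budget is spent."""
    state = RunState(questions=list(questions))
    open_questions = list(state.questions)
    while open_questions and state.rounds < max_rounds:
        state.rounds += 1
        # Only the still-open questions are queried again in follow-up rounds.
        for question in open_questions:
            new_facts = search_and_extract(question, state.rounds)
            state.facts.setdefault(question, []).extend(new_facts)
        open_questions = find_gaps(state)
    return state


def fake_connector(question: str, round_no: int) -> list[str]:
    """Toy connector: "A" is answered at once, "B" only from round 2, "C" never."""
    if question == "A":
        return ["fact about A"]
    if question == "B" and round_no >= 2:
        return ["fact about B"]
    return []


state = research(["A", "B", "C"], fake_connector, max_rounds=3)
print(state.rounds)      # 3: the round budget is used up because "C" stays open
print(find_gaps(state))  # ['C']: the question that remains unanswered
```

In this sketch a question that stays unanswered does not block the run: the loop stops at max_rounds and the remaining gap stays visible in the state, so a later synthesis step could report it instead of hiding it.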