p101

2026-05-12 10:56:50 +02:00
parent feaec9bbaf
commit 6dced1c4df
7 changed files with 1409 additions and 5 deletions
--- a/patch_history/RETRIEX_PATCH_100D_ADMIN_EVAL_PROMPT_CONTEXT_README.md
+++ b/patch_history/RETRIEX_PATCH_100D_ADMIN_EVAL_PROMPT_CONTEXT_README.md
@@ -0,0 +1,44 @@
+# RetrieX Patch p100d – Admin Eval Prompt Context
+
+Status: patch-only follow-up for p100 Admin Eval UX.
+
+## Goal
+
+Make eval results easier to understand in the Admin UI by showing the actual case prompt directly next to the case id. For follow-up and shopquery cases, show a compact history/context preview as well.
+
+## Changes
+
+- Admin eval result table now displays the case prompt below the case id.
+- Follow-up/shopquery eval details now include a compact history preview.
+- Admin eval result table shows history/context in a collapsible section when available.
+
+## Files changed
+
+- `src/Eval/ShopQueryEvalRunner.php`
+- `templates/admin/evals/index.html.twig`
+
+## Non-goals
+
+No production answer logic is changed:
+
+- no retrieval logic changes
+- no shopquery logic changes
+- no follow-up logic changes
+- no answer-guard logic changes
+- no eval assertion changes
+- no YAML or parameter changes
+- no database migration
+
+## Validation
+
+Recommended after applying:
+
+```bash
+php bin/console mto:agent:config:validate
+php bin/console mto:agent:eval:run retrieval
+php bin/console mto:agent:eval:run shop_query
+php bin/console mto:agent:eval:run followup
+php bin/console mto:agent:eval:run answer_guard
+```
+
+Then open `/admin/evals/` and verify that each result row shows the case prompt and that follow-up/shopquery rows can reveal context/history.
--- a/patch_history/RETRIEX_PATCH_101_ADMIN_EVAL_CASE_CREATOR_README.md
+++ b/patch_history/RETRIEX_PATCH_101_ADMIN_EVAL_CASE_CREATOR_README.md
@@ -0,0 +1,66 @@
+# RetrieX Patch p101 - Admin Eval Case Creator
+
+## Goal
+
+p101 adds a small case creator to the existing Admin Eval suite so that new regression cases can be written to the matching NDJSON files directly from the Admin UI.
+
+The patch builds on the passing (green) p100/p100a/p100b/p100c/p100d state and does not change any production RAG, shopquery, follow-up, or answer logic.
+
+## Changes
+
+- New POST route in the Admin UI:
+  - `/admin/evals/case/create`
+  - Route name: `admin_evals_case_create`
+- `EvalAdminService::createCase()` for validated writing of new eval cases.
+- New form on `/admin/evals/`:
+  - Eval type
+  - Case ID
+  - Prompt
+  - Assert JSON
+  - optional history JSON
+  - optional request context hint
+- Button per report result:
+  - `Als neuen Case vorbereiten` ("Prepare as new case")
+  - copies the prompt, type, history preview, and query or document ID into the creator as a template.
+- JSON and ID validation before writing.
+- Duplicate guard across all eval types.
+
+## Written files
+
+New cases are appended to the following files:
+
+- `tests/evals/cases/retrieval.ndjson`
+- `tests/evals/cases/shop_query.ndjson`
+- `tests/evals/cases/followup.ndjson`
+- `tests/evals/cases/answer_guard.ndjson`
+
+## Safety / Scope
+
+Not changed:
+
+- no retrieval weights
+- no shopquery logic
+- no follow-up logic
+- no answer-guard logic
+- no prompt, YAML, or parameter changes
+- no migration
+
+## Manual verification
+
+```bash
+php bin/console mto:agent:config:validate
+php bin/console mto:agent:eval:run retrieval
+php bin/console mto:agent:eval:run shop_query
+php bin/console mto:agent:eval:run followup
+php bin/console mto:agent:eval:run answer_guard
+```
+
+Additionally, in the Admin UI:
+
+1. Open `/admin/evals/`.
+2. Run an eval.
+3. On a result, click `Als neuen Case vorbereiten`.
+4. Adjust or verify the case ID.
+5. Check the assert JSON.
+6. Save.
+7. Re-run the affected eval type.
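
Reviewer note: the p101 README describes JSON/ID validation, a duplicate guard across all eval types, and appending to per-type NDJSON files, but the diff excerpt above does not show `EvalAdminService::createCase()` itself. The following is a minimal, language-neutral sketch in Python of how such a flow could look. It is not the patch's PHP implementation. The record field names (`id`, `prompt`, `assert`, `history`), the case-ID pattern, and the function names are assumptions made for illustration only; the four eval types and file names are taken from the README.

```python
import json
import re
from pathlib import Path

# Eval types listed in the README; each maps to "<type>.ndjson".
EVAL_TYPES = ("retrieval", "shop_query", "followup", "answer_guard")

# Assumed ID format (hypothetical): lowercase letters, digits, "_" and "-".
CASE_ID_PATTERN = re.compile(r"^[a-z0-9][a-z0-9_-]{2,63}$")


def existing_case_ids(cases_dir: Path) -> set:
    """Collect case IDs from every eval type's NDJSON file."""
    ids = set()
    for eval_type in EVAL_TYPES:
        path = cases_dir / f"{eval_type}.ndjson"
        if not path.exists():
            continue
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.strip():
                # Assumes each line is a JSON object with an "id" field.
                ids.add(json.loads(line).get("id"))
    return ids


def create_case(cases_dir, eval_type, case_id, prompt, assert_json, history_json=None):
    """Validate the input, reject duplicates, then append one NDJSON line."""
    cases_dir = Path(cases_dir)
    if eval_type not in EVAL_TYPES:
        raise ValueError(f"unknown eval type: {eval_type}")
    if not CASE_ID_PATTERN.fullmatch(case_id):
        raise ValueError(f"invalid case id: {case_id}")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")

    # json.loads raises json.JSONDecodeError (a ValueError) on bad input.
    assertions = json.loads(assert_json)
    if not isinstance(assertions, dict):
        raise ValueError("assert JSON must be an object")

    record = {"id": case_id, "prompt": prompt, "assert": assertions}
    if history_json:
        history = json.loads(history_json)
        if not isinstance(history, list):
            raise ValueError("history JSON must be a list")
        record["history"] = history

    # Duplicate guard spans all eval types, not only the target file.
    if case_id in existing_case_ids(cases_dir):
        raise ValueError(f"duplicate case id: {case_id}")

    path = cases_dir / f"{eval_type}.ndjson"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumes the existing file, if any, already ends with a newline.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

Running all checks before opening the file means a rejected case never leaves a partial line behind, which matters for NDJSON because every line has to parse on its own.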