patch 17d

team 1
2026-05-01 20:46:04 +02:00
parent f98de3c785
commit 7d8b1c9aa2
3 changed files with 71 additions and 3 deletions


@@ -0,0 +1,61 @@
# RetrieX Patch 17d - CAL Exact Document Grounding Fix
## Goal
Patch 17d fixes the accuracy error that was still observed after Patch 17c for questions such as:
```text
welche grenzwerte kann der testomat cal messen
```
The system was still allowed to apply the generic, Testomat 808-related range `0,02 °dH to 5 °dH` to `Testomat 2000 CAL`. That transfer is technically incorrect unless the CAL-specific source explicitly documents the assignment.
In addition, the chlorine/swimming-pool grounding rules that were started earlier are reapplied to the current state.
## Changed files
- `src/Knowledge/Retrieval/NdjsonChunkLookup.php`
- `config/retriex/prompt.yaml`
## Details
### Exact document lookup
`NdjsonChunkLookup` gains a conservative fallback for title matches in which the prompt omits the numeric product family:
- `Testomat CAL` may match `Testomat 2000 CAL` when all non-numeric title anchors match.
- `Testomat 808 CAL` must not jump to `Testomat 2000 CAL` on the strength of `CAL` alone when a conflicting numeric anchor is present.
This focuses CAL questions more tightly on the specific CAL document, instead of answering them from broad semantic hits or Testomat 808-related indicator tables.
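The fallback rule can be illustrated with a minimal sketch. This is not the code from `NdjsonChunkLookup`: the function name `sketchTitleMatchAllowingMissingNumeric` is hypothetical, and the sketch assumes that both strings are already lowercased and whitespace-separated, that a "numeric token" is a token consisting only of digits, and that every numeric token in the prompt must also occur in the title.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of the "missing numeric family" title match.
 * Assumption: inputs are pre-normalized (lowercase, single-space separated).
 */
function sketchTitleMatchAllowingMissingNumeric(string $normalizedPrompt, string $normalizedTitle): bool
{
    $promptTokens = preg_split('/\s+/', trim($normalizedPrompt), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $titleTokens = preg_split('/\s+/', trim($normalizedTitle), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    // Assumption: a numeric token is made of digits only (e.g. "2000", "808").
    $isNumeric = static fn (string $token): bool => preg_match('/^\d+$/', $token) === 1;
    $isAnchor = static fn (string $token): bool => preg_match('/^\d+$/', $token) !== 1;

    $titleNumeric = array_values(array_filter($titleTokens, $isNumeric));
    $titleAnchors = array_values(array_filter($titleTokens, $isAnchor));
    $promptNumeric = array_values(array_filter($promptTokens, $isNumeric));

    // The fallback only applies to titles with at least two non-numeric
    // anchors and at least one numeric token.
    if (count($titleAnchors) < 2 || count($titleNumeric) < 1) {
        return false;
    }

    // Every non-numeric title anchor must be present in the prompt.
    foreach ($titleAnchors as $anchor) {
        if (!in_array($anchor, $promptTokens, true)) {
            return false;
        }
    }

    // A numeric token in the prompt must agree with the title; otherwise
    // "testomat 808 cal" would resolve to "testomat 2000 cal".
    foreach ($promptNumeric as $number) {
        if (!in_array($number, $titleNumeric, true)) {
            return false;
        }
    }

    return true;
}
```

Under these assumptions, `sketchTitleMatchAllowingMissingNumeric('testomat cal', 'testomat 2000 cal')` returns `true`, while the same call with `'testomat 808 cal'` returns `false` because `808` conflicts with the title's `2000`.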
### CAL grounding
The prompt rules explicitly forbid:
- Transferring Testomat 808 indicator tables to Testomat CAL.
- Outputting the generic range `0,02 °dH to 5 °dH` unless it is explicitly documented in a CAL source record.
- Using 808 indicator types such as `300`, `300 S`, `301`, `302`, `303`, `305`, `310`, `320`, `330`, `350` as CAL indicators.
If no CAL-specific numeric thresholds are documented, the system must say so instead of generalizing.
### Chlorine / swimming pool
The rules continue to distinguish between a documented chlorine measurement and an explicitly documented swimming-pool application. The ability to measure chlorine is not, by itself, evidence of swimming-pool suitability.
## Expected checks
Run after applying the patch:
```bash
bin/console mto:agent:config:validate
bin/console mto:agent:regression:test
bin/console mto:agent:config:audit-source --details
bin/console mto:agent:config:audit-patterns --details
```
## Manual regression
Retest:
- `welche grenzwerte kann der testomat cal messen`
- `ich würde gern chlor im schwinnbad messen`


@@ -480,7 +480,7 @@ parameters:
- '- Use retrieved knowledge as highest priority for technical matching, thresholds, measurement principles, and technical explanation when it contains a matching product or fact.'
- '- If retrieved knowledge is silent or only contains unrelated products, but live shop results explicitly match the requested parameter/application, use the shop results and do not answer with a negative RAG-only conclusion.'
- '- If the user asks for Schwimmbad, Schwimmbecken, Pool, or typo-like pool wording, a product may only be recommended for that application when the same RAG or SHOP PRODUCT RECORD explicitly names that application. Chlor measurement alone is not proof of swimming-pool suitability.'
- '- If a product record proves Chlor measurement but not Schwimmbad, Schwimmbecken or Pool use, say exactly that distinction and avoid recommendation wording such as empfiehlt sich, geeignet für Schwimmbad, or Anwendung im Schwimmbad.'
- '- If a product record proves Chlor measurement but not Schwimmbad, Schwimmbecken or Pool use, say exactly that distinction and avoid recommendation wording such as empfiehlt sich, geeignet für Schwimmbad, Anwendung im Schwimmbad, or für Schwimmbäder.'
- '- For product-selection questions, a shop result proves technical suitability only when the same SHOP PRODUCT RECORD explicitly states the requested measurement parameter, application, or compatibility. Search ranking, generated query terms, generic category matches, and similar wording are not proof.'
- '- If the requested parameter appears only in the generated shop query, metadata, unrelated highlights, or another product record, treat suitability as unverified and say that the shop hit requires technical verification.'
- '- Do not convert p-Wert, m-Wert, minus m-Wert, alkalinity, acid capacity, or other water-treatment parameters into pH or pH-Wert unless the same source explicitly says pH or pH-Wert.'
@@ -542,8 +542,9 @@ parameters:
entry explicitly connects them.'
- '- If several devices or indicators are present, keep each device-indicator-range assignment separate and do not transfer an indicator from one product to
another.'
- '- For Testomat CAL or Testomat 2000 CAL threshold/range questions, do not answer with Testomat 808 indicator ranges or the generic 0,02 °dH to 5 °dH range unless a CAL source record explicitly contains that exact assignment.'
- '- Do not use phrases such as typical monitoring range, typical range, or common range for a named product when the provided source only proves another product variant or does not explicitly state the named product range.'
- '- For Testomat CAL or Testomat 2000 CAL threshold/range questions, use only source entries that explicitly name CAL or Testomat 2000 CAL in the same product record. Do not answer with Testomat 808 indicator ranges or the generic 0,02 °dH to 5 °dH range unless a CAL source record explicitly contains that exact assignment.'
- '- If the retrieved CAL-specific source records do not explicitly state numeric CAL threshold values, say that the exact CAL Grenzwerte are not belegbar from the provided sources instead of giving 0,02 °dH to 5 °dH as a typical range.'
- '- For Testomat CAL, do not transfer plain numeric 808 indicator types such as 300, 300 S, 301, 302, 303, 305, 310, 320, 330, or 350. CAL indicator statements must remain CAL-specific, for example TH-prefixed indicator codes when those are explicitly present in the CAL source.'
- '- If the source states only a threshold function, do not expand it into broader control logic.'
- '- If a detail is not explicitly stated in the provided sources, say so plainly.'
- '- Prefer short, source-close sentences over explanatory expansion.'


@@ -272,6 +272,12 @@ final readonly class NdjsonChunkLookup
/**
* Allows prompts such as "Testomat CAL" to resolve a document titled
* "Testomat 2000 CAL" without also allowing conflicting model numbers.
*
* This is deliberately limited to titles with at least two non-numeric
* anchors and at least one numeric title token. A prompt with its own
* numeric token must match one of the title's numeric tokens; otherwise a
* user asking for e.g. "Testomat 808 CAL" could incorrectly resolve to
* "Testomat 2000 CAL".
*/
private function isConfidentTitleTokenMatchAllowingMissingNumeric(string $normalizedPrompt, string $normalizedTitle): bool
{