patch 17d
61
RETRIEX_PATCH_17D_CAL_EXACT_DOCUMENT_GROUNDING_FIX_README.md
Normal file
@@ -0,0 +1,61 @@
# RetrieX Patch 17d - CAL Exact Document Grounding Fix

## Goal

Patch 17d fixes the accuracy error that was still observed after Patch 17c for questions such as:

```text
welche grenzwerte kann der testomat cal messen
```

The system was still allowed to transfer the generic, Testomat-808-like range `0,02 °dH bis 5 °dH` to `Testomat 2000 CAL`. That transfer is technically wrong unless the CAL-specific source explicitly documents the assignment.

In addition, the chlorine and swimming-pool grounding rules that had already been started are carried over again into the current state.

## Changed files

- `src/Knowledge/Retrieval/NdjsonChunkLookup.php`
- `config/retriex/prompt.yaml`

## Details

### Exact document lookup

`NdjsonChunkLookup` gains a conservative fallback for title matches in which the numeric product family is missing from the prompt:

- `Testomat CAL` may match `Testomat 2000 CAL` if all non-numeric title anchors match.
- `Testomat 808 CAL` must not jump to `Testomat 2000 CAL` on the strength of `CAL` alone when a conflicting numeric anchor is present.

This focuses CAL questions more strongly on the specific CAL document, instead of answering them from broad semantic hits or Testomat-808-like indicator tables.
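
The fallback rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in `NdjsonChunkLookup`: the function name `titleMatchesAllowingMissingNumeric`, the whitespace tokenization, and the use of `ctype_digit` to recognize numeric tokens are assumptions made for this example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch of a title match that tolerates a missing model number.
 * Both arguments are assumed to be lowercased, whitespace-separated strings.
 */
function titleMatchesAllowingMissingNumeric(string $normalizedPrompt, string $normalizedTitle): bool
{
    // Split a string into [numeric tokens, non-numeric tokens].
    $split = static function (string $text): array {
        $tokens = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];
        $numeric = array_values(array_filter($tokens, static fn (string $t): bool => ctype_digit($t)));
        $anchors = array_values(array_filter($tokens, static fn (string $t): bool => !ctype_digit($t)));

        return [$numeric, $anchors];
    };

    [$titleNumeric, $titleAnchors] = $split($normalizedTitle);
    [$promptNumeric, $promptWords] = $split($normalizedPrompt);

    // Only titles with at least two non-numeric anchors and one numeric token qualify.
    if (count($titleAnchors) < 2 || $titleNumeric === []) {
        return false;
    }

    // Every non-numeric title anchor must occur in the prompt.
    if (array_diff($titleAnchors, $promptWords) !== []) {
        return false;
    }

    // A prompt that names its own number must agree with one of the title's numbers.
    if ($promptNumeric !== [] && array_intersect($promptNumeric, $titleNumeric) === []) {
        return false;
    }

    return true;
}
```

Under these assumptions, the prompt `welche grenzwerte kann der testomat cal messen` matches the title `testomat 2000 cal`, while a prompt containing `testomat 808 cal` is rejected because `808` conflicts with `2000`.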

### CAL grounding

The prompt rules explicitly forbid:

- Transferring Testomat 808 indicator tables to Testomat CAL.
- Outputting the generic range `0,02 °dH bis 5 °dH` unless it is explicitly documented in a CAL source record.
- Using 808 indicator types such as `300`, `300 S`, `301`, `302`, `303`, `305`, `310`, `320`, `330`, `350` as CAL indicators.

If no CAL-specific numeric limit values are documented, the system must say so instead of generalizing.
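
For manual regression, the forbidden patterns above could be checked against a generated answer with a small helper. The sketch below is an assumption-laden illustration and not part of the patch: the function name `findUngroundedCalClaims`, the violation labels, and both regular expressions are invented for this example, and the indicator check only recognizes types that directly follow the word `Indikator` or `Indikatortyp`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical regression helper: lists CAL grounding violations in an answer.
 *
 * @param list<string> $calRecords Text of the CAL-specific source records (assumed input).
 * @return list<string> Violation labels (names invented for this sketch).
 */
function findUngroundedCalClaims(string $answer, array $calRecords): array
{
    $violations = [];

    // Generic range "0,02 °dH bis 5 °dH" (also accepts "to" and "0.02").
    $rangePattern = '/0[,.]02\s*°\s*dH\s*(?:bis|to)\s*5\s*°\s*dH/iu';
    $calText = implode("\n", $calRecords);

    // The range is only acceptable when a CAL record itself contains it.
    if (preg_match($rangePattern, $answer) === 1 && preg_match($rangePattern, $calText) !== 1) {
        $violations[] = 'generic_range';
    }

    // Plain numeric 808 indicator types named in the README.
    $indicatorPattern = '/\bIndikator(?:typ)?\s+(?:300 S|30[0-35]|310|320|330|350)\b/iu';
    if (preg_match($indicatorPattern, $answer) === 1) {
        $violations[] = '808_indicator_type';
    }

    return $violations;
}
```

An answer that states the generic range and names `Indikator 300` while the supplied CAL records contain neither would yield both labels. An answer that says the CAL limit values are not documented yields an empty list.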

### Chlorine / swimming pool

The rules continue to distinguish between documented chlorine measurement and an explicitly documented swimming-pool application. The ability to measure chlorine is not, on its own, evidence of swimming-pool suitability.
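
The distinction can be illustrated with a naive keyword check on a single product record. This is only a sketch under stated assumptions: the function name `classifyPoolGrounding`, the return labels, and the keyword patterns are hypothetical, and compound words such as `Chlormessung` are deliberately not covered.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: decides what a single product record proves.
 * Returns 'chlorine_and_pool', 'chlorine_only', or 'no_chlorine'.
 */
function classifyPoolGrounding(string $productRecord): string
{
    // "Chlor" as a standalone word; "Chlorid" (chloride) is intentionally not matched.
    $measuresChlorine = preg_match('/\bchlor\b/iu', $productRecord) === 1;

    // The pool application must be named in the same record.
    $namesPool = preg_match('/schwimmb(?:ad|äder|ecken)|\bpool\b/iu', $productRecord) === 1;

    return match (true) {
        $measuresChlorine && $namesPool => 'chlorine_and_pool',
        $measuresChlorine => 'chlorine_only',
        default => 'no_chlorine',
    };
}
```

Only a record classified as `chlorine_and_pool` would justify recommendation wording for pool use. For `chlorine_only`, the answer should state the distinction instead.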

## Expected checks

Run after applying the patch:

```bash
bin/console mto:agent:config:validate
bin/console mto:agent:regression:test
bin/console mto:agent:config:audit-source --details
bin/console mto:agent:config:audit-patterns --details
```

## Manual regression

Test again:

- `welche grenzwerte kann der testomat cal messen` (German for "which limit values can the Testomat CAL measure")
- `ich würde gern chlor im schwinnbad messen` (German for "I would like to measure chlorine in the swimming pool", with `schwinnbad` misspelled)
@@ -480,7 +480,7 @@ parameters:
 - '- Use retrieved knowledge as highest priority for technical matching, thresholds, measurement principles, and technical explanation when it contains a matching product or fact.'
 - '- If retrieved knowledge is silent or only contains unrelated products, but live shop results explicitly match the requested parameter/application, use the shop results and do not answer with a negative RAG-only conclusion.'
 - '- If the user asks for Schwimmbad, Schwimmbecken, Pool, or typo-like pool wording, a product may only be recommended for that application when the same RAG or SHOP PRODUCT RECORD explicitly names that application. Chlor measurement alone is not proof of swimming-pool suitability.'
-- '- If a product record proves Chlor measurement but not Schwimmbad, Schwimmbecken or Pool use, say exactly that distinction and avoid recommendation wording such as empfiehlt sich, geeignet für Schwimmbad, or Anwendung im Schwimmbad.'
+- '- If a product record proves Chlor measurement but not Schwimmbad, Schwimmbecken or Pool use, say exactly that distinction and avoid recommendation wording such as empfiehlt sich, geeignet für Schwimmbad, Anwendung im Schwimmbad, or für Schwimmbäder.'
 - '- For product-selection questions, a shop result proves technical suitability only when the same SHOP PRODUCT RECORD explicitly states the requested measurement parameter, application, or compatibility. Search ranking, generated query terms, generic category matches, and similar wording are not proof.'
 - '- If the requested parameter appears only in the generated shop query, metadata, unrelated highlights, or another product record, treat suitability as unverified and say that the shop hit requires technical verification.'
 - '- Do not convert p-Wert, m-Wert, minus m-Wert, alkalinity, acid capacity, or other water-treatment parameters into pH or pH-Wert unless the same source explicitly says pH or pH-Wert.'
@@ -542,8 +542,9 @@ parameters:
 entry explicitly connects them.'
 - '- If several devices or indicators are present, keep each device-indicator-range assignment separate and do not transfer an indicator from one product to
 another.'
-- '- For Testomat CAL or Testomat 2000 CAL threshold/range questions, do not answer with Testomat 808 indicator ranges or the generic 0,02 °dH to 5 °dH range unless a CAL source record explicitly contains that exact assignment.'
-- '- Do not use phrases such as typical monitoring range, typical range, or common range for a named product when the provided source only proves another product variant or does not explicitly state the named product range.'
+- '- For Testomat CAL or Testomat 2000 CAL threshold/range questions, use only source entries that explicitly name CAL or Testomat 2000 CAL in the same product record. Do not answer with Testomat 808 indicator ranges or the generic 0,02 °dH to 5 °dH range unless a CAL source record explicitly contains that exact assignment.'
+- '- If the retrieved CAL-specific source records do not explicitly state numeric CAL threshold values, say that the exact CAL Grenzwerte are not belegbar from the provided sources instead of giving 0,02 °dH to 5 °dH as a typical range.'
+- '- For Testomat CAL, do not transfer plain numeric 808 indicator types such as 300, 300 S, 301, 302, 303, 305, 310, 320, 330, or 350. CAL indicator statements must remain CAL-specific, for example TH-prefixed indicator codes when those are explicitly present in the CAL source.'
 - '- If the source states only a threshold function, do not expand it into broader control logic.'
 - '- If a detail is not explicitly stated in the provided sources, say so plainly.'
 - '- Prefer short, source-close sentences over explanatory expansion.'
@@ -272,6 +272,12 @@ final readonly class NdjsonChunkLookup
     /**
      * Allows prompts such as "Testomat CAL" to resolve a document titled
      * "Testomat 2000 CAL" without also allowing conflicting model numbers.
+     *
+     * This is deliberately limited to titles with at least two non-numeric
+     * anchors and at least one numeric title token. A prompt with its own
+     * numeric token must match one of the title's numeric tokens; otherwise a
+     * user asking for e.g. "Testomat 808 CAL" could incorrectly resolve to
+     * "Testomat 2000 CAL".
      */
     private function isConfidentTitleTokenMatchAllowingMissingNumeric(string $normalizedPrompt, string $normalizedTitle): bool
     {