RetrieX Numeric Extreme Retrieval Fix
Purpose
This patch sharpens retrieval for questions that ask directly for a numeric extreme, such as the lowest hardness threshold.
The concrete regression was:
- User asks for the lowest water-hardness threshold monitored by a Testomat.
- The correct answer is 0,02 °dH / Testomat 808.
- Retrieval still allowed neighbouring runner-up product context such as Testomat 2000 / 0,05 °dH into the prompt.
That made the model add unnecessary comparison details although the user asked only for the lowest value.
Change
src/Knowledge/Retrieval/NdjsonHybridRetriever.php now adds a conservative numeric-extreme document selection step between focused-product selection and normal dominant/spread selection.
The new mode:
- detects minimum/maximum-style technical measurement questions,
- extracts dH measurement values from the top retrieval window,
- identifies the document containing the actual extreme value,
- selects chunks from that document only,
- avoids filling the remaining prompt slots with runner-up product chunks.
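The extraction and selection steps above can be sketched roughly as follows. This is a minimal illustration, not the code in NdjsonHybridRetriever.php: the function name selectExtremeDocument, the chunk shape (an array with 'doc' and 'text' keys), and the regular expression are all assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: keep only the chunks of the document that
 * contains the extreme dH value in the retrieval window.
 *
 * @param array  $chunks Assumed shape: list of ['doc' => string, 'text' => string]
 * @param string $mode   'min' or 'max'
 * @return array|null    Chunks of the winning document, or null if no dH value was found
 */
function selectExtremeDocument(array $chunks, string $mode): ?array
{
    $best = null; // ['doc' => string, 'value' => float]

    foreach ($chunks as $chunk) {
        // Matches values such as "0,02 °dH", "0.05°dH" or "5 dH".
        if (!preg_match_all('/(\d+(?:[.,]\d+)?)\s*°?\s*dH\b/u', $chunk['text'], $m)) {
            continue;
        }
        foreach ($m[1] as $raw) {
            // German decimal comma -> float
            $value = (float) str_replace(',', '.', $raw);
            $isBetter = $best === null
                || ($mode === 'min' ? $value < $best['value'] : $value > $best['value']);
            if ($isBetter) {
                $best = ['doc' => $chunk['doc'], 'value' => $value];
            }
        }
    }

    if ($best === null) {
        return null; // caller falls back to the normal selection
    }

    // Runner-up documents are dropped instead of filling the remaining slots.
    return array_values(array_filter(
        $chunks,
        static fn (array $c): bool => $c['doc'] === $best['doc']
    ));
}
```

Returning null when no value is found lets the caller fall through to the normal dominant/spread selection. Note that this sketch only reads the number directly in front of the unit, so a range written as "0,02 - 0,5 °dH" would need additional handling.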
New debug selection mode:
sales_numeric_extreme_document
Safety
The fix is intentionally narrow:
- no PromptBuilder changes,
- no prompt wording changes,
- no Shopware logic changes,
- no vector-service changes,
- no scoring config changes,
- no vocabulary changes.
It only affects technical numeric extreme questions containing measurement/context signals such as Grenzwert, Messbereich, Wasserhärte, Resthärte, dH, threshold, or range.
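A detector for this kind of question might look like the sketch below. The context signals are the ones listed above; the function name detectNumericExtreme and the minimum/maximum word lists are assumptions for illustration and may differ from what the retriever actually checks.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: classify a question as a numeric-extreme question.
 * Returns 'min', 'max', or null when the narrow mode should not apply.
 */
function detectNumericExtreme(string $question): ?string
{
    // Context signals taken from the patch note; without one the mode stays off.
    $context = '/grenzwert|messbereich|wasserhärte|resthärte|\bdh\b|threshold|\brange\b/iu';
    if (preg_match($context, $question) !== 1) {
        return null;
    }

    // Extreme wording: illustrative word lists, not the retriever's real vocabulary.
    $min = '/\b(niedrigste\w*|kleinste\w*|geringste\w*|lowest|smallest|minimum|minimal\w*)\b/iu';
    $max = '/\b(höchste\w*|größte\w*|highest|largest|maximum|maximal\w*)\b/iu';

    if (preg_match($min, $question) === 1) {
        return 'min';
    }
    if (preg_match($max, $question) === 1) {
        return 'max';
    }

    return null;
}
```

Requiring both a context signal and an extreme word keeps the mode conservative: a question about the lowest price, or a plain question about a measuring range, is left to the normal selection.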
Expected regression result
Question:
Was ist der niedrigste Grenzwert für die Wasserhärte, welcher mit einem Testomaten überwacht werden kann?
Expected answer should stay focused on:
0,02 °dH / Testomat 808
It should not add the runner-up product/value such as:
Testomat 2000 / 0,05 °dH
unless the user explicitly asks for comparison, alternatives, or all available values.
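For the manual retest, the expectation above can be written down as a small check. This is a hypothetical helper for this one prompt; it is not part of the patch and says nothing about how mto:agent:regression:test evaluates answers.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: true when an answer names the expected extreme
 * (0,02 °dH / Testomat 808) and leaves out the runner-up product and value.
 */
function answerStaysFocused(string $answer): bool
{
    $hasExpected = preg_match('/0[.,]02\s*°?\s*dH/u', $answer) === 1
        && stripos($answer, 'Testomat 808') !== false;

    $hasRunnerUp = stripos($answer, 'Testomat 2000') !== false
        || preg_match('/0[.,]05\s*°?\s*dH/u', $answer) === 1;

    return $hasExpected && !$hasRunnerUp;
}
```

The check only applies to the plain lowest-value question; an answer to an explicit comparison request is expected to mention the runner-up and would correctly fail it.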
After applying
Run:
php bin/console cache:clear
php bin/console mto:agent:config:validate
php bin/console mto:agent:regression:test
Then manually retest the known 1.4.2 baseline and the lowest-threshold prompt above.