harden token config

team 1
2026-04-26 15:20:06 +02:00
parent f34bc6b988
commit e3fd4541e4
6 changed files with 171 additions and 4 deletions


@@ -0,0 +1,78 @@
# RetrieX Numeric Extreme Retrieval Fix
## Purpose
This patch sharpens retrieval for questions that ask directly for a numeric extreme, such as the lowest water-hardness threshold.
The concrete regression was:
- User asks for the lowest water-hardness threshold monitored by a Testomat.
- The correct answer is `0,02 °dH` / `Testomat 808`.
- Retrieval still allowed neighbouring runner-up product context such as `Testomat 2000` / `0,05 °dH` into the prompt.

As a result, the model added unnecessary comparison details even though the user asked only for the lowest value.
## Change
`src/Knowledge/Retrieval/NdjsonHybridRetriever.php` now adds a conservative numeric-extreme document selection step between focused-product selection and normal dominant/spread selection.
The new mode:
- detects minimum/maximum-style technical measurement questions,
- extracts dH measurement values from the top retrieval window,
- identifies the document containing the actual extreme value,
- selects chunks from that document only,
- avoids filling the remaining prompt slots with runner-up product chunks.

New debug selection mode:
```text
sales_numeric_extreme_document
```
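The selection steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `NdjsonHybridRetriever.php`: the function names `extractDhValues`, `findExtremeDocument`, and `selectExtremeDocumentChunks`, the chunk array shape (`document_id`, `text`), and the regular expression are all assumptions made for this example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: extract all dH values from a text.
 * Accepts German decimal commas ("0,02 °dH") as well as decimal points.
 *
 * @return float[]
 */
function extractDhValues(string $text): array
{
    // The degree sign is optional; the "u" flag makes the pattern UTF-8 aware.
    preg_match_all('/(\d+(?:[.,]\d+)?)\s*°?\s*dH\b/u', $text, $matches);

    return array_map(
        static fn (string $raw): float => (float) str_replace(',', '.', $raw),
        $matches[1]
    );
}

/**
 * Hypothetical helper: find the document that holds the extreme value.
 *
 * @param array<int, array{document_id: string, text: string}> $chunks assumed shape of the top retrieval window
 * @return string|null document id, or null when no dH value was found
 */
function findExtremeDocument(array $chunks, string $direction = 'min'): ?string
{
    $bestDocument = null;
    $bestValue = null;

    foreach ($chunks as $chunk) {
        foreach (extractDhValues($chunk['text']) as $value) {
            $isBetter = $bestValue === null
                || ($direction === 'min' ? $value < $bestValue : $value > $bestValue);

            if ($isBetter) {
                $bestValue = $value;
                $bestDocument = $chunk['document_id'];
            }
        }
    }

    return $bestDocument;
}

/**
 * Keep only the chunks of the document holding the extreme value.
 * When no value is found, the window is returned unchanged (an assumed fallback).
 *
 * @param array<int, array{document_id: string, text: string}> $chunks
 * @return array<int, array{document_id: string, text: string}>
 */
function selectExtremeDocumentChunks(array $chunks, string $direction = 'min'): array
{
    $documentId = findExtremeDocument($chunks, $direction);
    if ($documentId === null) {
        return $chunks;
    }

    return array_values(array_filter(
        $chunks,
        static fn (array $chunk): bool => $chunk['document_id'] === $documentId
    ));
}
```

Given a window with one `Testomat 2000` chunk (`0,05 °dH`) and two `Testomat 808` chunks (one containing `0,02 °dH`), `selectExtremeDocumentChunks($chunks, 'min')` would keep only the two `Testomat 808` chunks, so no runner-up chunk fills the remaining prompt slots.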
## Safety
The fix is intentionally narrow:
- no PromptBuilder changes,
- no prompt wording changes,
- no Shopware logic changes,
- no vector-service changes,
- no scoring config changes,
- no vocabulary changes.

It only affects technical numeric extreme questions containing measurement/context signals such as `Grenzwert`, `Messbereich`, `Wasserhärte`, `Resthärte`, `dH`, `threshold`, or `range`.
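A guard of this kind could look like the sketch below. The function name `isNumericExtremeQuestion` and the list of extreme markers are assumptions for illustration; only the measurement signals are taken from the sentence above. Plain substring matching is used for brevity, and it is deliberately loose (for example, `range` also matches inside `orange`), so a real implementation would likely need stricter matching.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical guard: true only when the question contains both an
 * extreme marker and a measurement/context signal.
 */
function isNumericExtremeQuestion(string $question): bool
{
    // Fall back to ASCII-only lowercasing when mbstring is not available.
    $normalized = function_exists('mb_strtolower')
        ? mb_strtolower($question, 'UTF-8')
        : strtolower($question);

    // Assumed extreme markers; the actual patch may use a different list.
    $extremeMarkers = [
        'niedrigst', 'kleinst', 'höchst', 'größt',
        'lowest', 'highest', 'minimum', 'maximum',
    ];

    // Measurement/context signals named in the Safety section.
    $measurementSignals = [
        'grenzwert', 'messbereich', 'wasserhärte', 'resthärte',
        'dh', 'threshold', 'range',
    ];

    $containsAny = static function (array $needles) use ($normalized): bool {
        foreach ($needles as $needle) {
            if (str_contains($normalized, $needle)) {
                return true;
            }
        }

        return false;
    };

    return $containsAny($extremeMarkers) && $containsAny($measurementSignals);
}
```

Requiring both signal groups keeps the mode narrow: a question such as `What is the lowest price?` contains an extreme marker but no measurement signal, so it would not enter the numeric-extreme selection path.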
## Expected regression result
Question:
```text
Was ist der niedrigste Grenzwert für die Wasserhärte, welcher mit einem Testomaten überwacht werden kann?
```
Expected answer should stay focused on:
```text
0,02 °dH / Testomat 808
```
It should not add the runner-up product/value such as:
```text
Testomat 2000 / 0,05 °dH
```
unless the user explicitly asks for comparison, alternatives, or all available values.
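The expectation above can be written as a small automated check. This is a hedged sketch only: `checkLowestThresholdAnswer` is a hypothetical helper and is not part of `mto:agent:regression:test`; the required and forbidden fragments are the ones named in this section.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical regression check for the lowest-threshold prompt.
 *
 * @return string[] list of violations; an empty list means the answer passes
 */
function checkLowestThresholdAnswer(string $answer): array
{
    $violations = [];

    // Fragments the focused answer is expected to contain.
    foreach (['0,02 °dH', 'Testomat 808'] as $required) {
        if (!str_contains($answer, $required)) {
            $violations[] = 'Missing expected fragment: ' . $required;
        }
    }

    // Runner-up fragments that should not appear unless comparison was requested.
    foreach (['Testomat 2000', '0,05 °dH'] as $forbidden) {
        if (str_contains($answer, $forbidden)) {
            $violations[] = 'Unexpected runner-up fragment: ' . $forbidden;
        }
    }

    return $violations;
}
```

The check is an exact substring match, so it would need to be relaxed if the model formats the value differently (for example `0.02 °dH`).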
## After applying
Run:
```bash
php bin/console cache:clear
php bin/console mto:agent:config:validate
php bin/console mto:agent:regression:test
```
Then manually retest the known 1.4.2 baseline and the lowest-threshold prompt above.


@@ -0,0 +1,47 @@
# RetrieX Prompt Precision Priority Fix
This patch intentionally avoids retrieval changes.
It does not add hard-coded keyword lists to the retriever and does not introduce a new core retrieval special case.
## Goal
For technical questions where the retrieved chunk already contains the exact answer plus nearby comparison values, the model should answer the requested fact first and avoid adding runner-up products or adjacent values unless the user explicitly asks for comparison or alternatives.
Example:
- Question: `Was ist der niedrigste Grenzwert fuer die Wasserhaerte, welcher mit einem Testomaten ueberwacht werden kann?`
- Retrieved chunk may contain both `Testomat 808: 0,02 °dH` and `Testomat 2000: 0,05 °dH`.
- The answer should focus on `0,02 °dH / Testomat 808` and omit the runner-up value unless the user requests it.
## Changed files
- `config/retriex/prompt.yaml`
- `src/Agent/PromptBuilder.php`
- `src/Config/PromptBuilderConfig.php`
- `src/Config/RetriexEffectiveConfigProvider.php`
## What changed
- Added configurable `output_priority.technical_rules` in `prompt.yaml`.
- `PromptBuilder` now emits the `OUTPUT PRIORITY` block for technical questions even when no shop results are present.
- The technical priority rules are loaded through `PromptBuilderConfig` with PHP fallback defaults.
- `RetriexEffectiveConfigProvider` now exposes the technical output-priority rules in the effective config and checks that they are present and still guard against runner-up/comparison expansion.
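The way the two rule sets combine can be illustrated with a standalone sketch. `collectOutputPriorityRules` is a hypothetical free function that mirrors the merge order used in `PromptBuilder::buildOutputPriorityBlock()`; the rule strings in the usage note are placeholders, not the configured rules.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical standalone version of the rule merge: technical rules come
 * first, shop rules follow, and an empty result means no block is emitted.
 *
 * @param string[] $technicalRules
 * @param string[] $shopRules
 * @return string[]
 */
function collectOutputPriorityRules(
    bool $hasShopResults,
    bool $isTechnicalProductQuestion,
    array $technicalRules,
    array $shopRules
): array {
    $rules = [];

    if ($isTechnicalProductQuestion) {
        $rules = array_merge($rules, $technicalRules);
    }

    if ($hasShopResults) {
        $rules = array_merge($rules, $shopRules);
    }

    return $rules;
}
```

With `$technicalRules = ['- technical rule']` and `$shopRules = ['- shop rule']`, a technical question without shop results yields only the technical rule, a technical question with shop results yields both in that order, and a non-technical question without shop results yields an empty list, in which case the `OUTPUT PRIORITY` block is omitted.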
## Not changed
- No retrieval logic changed.
- No vector search logic changed.
- No shop logic changed.
- No core hard-coded domain keyword list added.
- No scoring behavior changed.
## After applying
Run:
```bash
php bin/console cache:clear
php bin/console mto:agent:config:validate
php bin/console mto:agent:regression:test
```


@@ -65,6 +65,11 @@ parameters:
         - '- Use retrieved knowledge first to determine the technically matching product or answer.'
         - '- If shop results are present, use them afterwards to add current price, availability, and the actual URL.'
         - '- Do not let bundles, accessories, or service items override a better technical match unless the user explicitly asks for them.'
+      technical_rules:
+        - '- For technical questions, answer the exact requested fact first and keep it as the main answer.'
+        - '- If one source chunk contains both the best matching value and nearby comparison values, use the nearby values only as context and do not include them unless the user asks for comparison or alternatives.'
+        - '- For lowest/highest/minimum/maximum questions, answer only the requested extreme value and the product/device explicitly connected to it.'
+        - '- Do not add runner-up products, second-lowest values, adjacent ranges, broader tables, or explanatory comparisons unless explicitly requested.'
     response_format:
       base_rules:
         - '- Keep normal spacing between all words. Never fuse words together.'


@@ -52,7 +52,10 @@ final readonly class PromptBuilder
         $systemBlock = $this->buildSystemBlock();
         $shopBlock = $this->buildShopBlock($shopResults, $swagFullOutPut);
-        $outputPriorityBlock = $this->buildOutputPriorityBlock($hasShopResults);
+        $outputPriorityBlock = $this->buildOutputPriorityBlock(
+            hasShopResults: $hasShopResults,
+            isTechnicalProductQuestion: $isTechnicalProductQuestion
+        );
         $responseFormatBlock = $this->buildResponseFormatBlock(
             hasShopResults: $hasShopResults,
             isTechnicalProductQuestion: $isTechnicalProductQuestion,
@@ -214,15 +217,25 @@ final readonly class PromptBuilder
     /**
      * Build a small priority block that tells the model what to surface first.
      */
-    private function buildOutputPriorityBlock(bool $hasShopResults): string
+    private function buildOutputPriorityBlock(bool $hasShopResults, bool $isTechnicalProductQuestion): string
     {
-        if (!$hasShopResults) {
+        $rules = [];
+        if ($isTechnicalProductQuestion) {
+            $rules = array_merge($rules, $this->config->getOutputPriorityTechnicalRules());
+        }
+        if ($hasShopResults) {
+            $rules = array_merge($rules, $this->config->getOutputPriorityRules());
+        }
+        if ($rules === []) {
             return '';
         }
         return $this->buildRuleBlock(
             $this->config->getOutputPrioritySectionLabel(),
-            $this->config->getOutputPriorityRules()
+            $rules
         );
     }


@@ -295,6 +295,18 @@ final class PromptBuilderConfig
             '- Do not let bundles, accessories, or service items override a better technical match unless the user explicitly asks for them.',
         ]);
     }
+    /**
+     * @return string[]
+     */
+    public function getOutputPriorityTechnicalRules(): array
+    {
+        return $this->getStringList('output_priority.technical_rules', [
+            '- For technical questions, answer the exact requested fact first and keep it as the main answer.',
+            '- If one source chunk contains both the best matching value and nearby comparison values, use the nearby values only as context and do not include them unless the user asks for comparison or alternatives.',
+            '- For lowest/highest/minimum/maximum questions, answer only the requested extreme value and the product/device explicitly connected to it.',
+            '- Do not add runner-up products, second-lowest values, adjacent ranges, broader tables, or explanatory comparisons unless explicitly requested.',
+        ]);
+    }
     public function getResponseFormatSectionLabel(): string
     {


@@ -140,6 +140,17 @@ final readonly class RetriexEffectiveConfigProvider
                 $errors[] = 'Missing technical prompt keyword: ' . $term;
             }
         }
+        $technicalPriorityRules = implode("\n", $this->promptConfig->getOutputPriorityTechnicalRules());
+        $checks['technical_priority_rules_present'] = trim($technicalPriorityRules) !== '';
+        $checks['technical_priority_prevents_runner_up'] = str_contains($technicalPriorityRules, 'runner-up')
+            || str_contains($technicalPriorityRules, 'second-lowest')
+            || str_contains($technicalPriorityRules, 'comparison');
+        if (!$checks['technical_priority_rules_present']) {
+            $errors[] = 'Missing technical output priority rules.';
+        }
+        if (!$checks['technical_priority_prevents_runner_up']) {
+            $errors[] = 'Technical output priority no longer guards against runner-up/comparison expansion.';
+        }
         $accessoryKeywords = $this->promptConfig->getAccessoryRequestKeywords();
         foreach (['indikator', 'reagenz'] as $term) {
@@ -304,6 +315,7 @@ final readonly class RetriexEffectiveConfigProvider
             'conversation_context_intro_lines' => $this->promptConfig->getConversationContextIntroLines(),
             'live_shop_results_header_lines' => $this->promptConfig->getLiveShopResultsHeaderLines(),
             'output_priority' => $this->promptConfig->getOutputPriorityRules(),
+            'output_priority_technical' => $this->promptConfig->getOutputPriorityTechnicalRules(),
             'response_format_base' => $this->promptConfig->getResponseFormatBaseRules(),
             'response_format_with_shop' => $this->promptConfig->getResponseFormatWithShopRules(),
             'response_format_without_shop' => $this->promptConfig->getResponseFormatWithoutShopRules(),