patch 20f

team 1
2026-05-02 20:52:38 +02:00
parent 446df191c0
commit dbfc079bde
7 changed files with 534 additions and 41 deletions


@@ -0,0 +1,91 @@
# RetrieX Patch 20d - Commercial Table Follow-up Routing Fix
## Goal
p20c still handled referential table/price follow-ups too late or too fragilely. The case
```text
welche grenzwerte kann der testomat 808 messen
die tabelle mit preisen
```
could still fall into the RAG-only path when the short commerce history context did not yield a matching anchor.
p20d ensures that commercial table follow-ups are first reliably routed as a Shop intent. The concrete Shop query is then derived more robustly from the extended or full history.
## Changes
Changed:
- `src/Agent/AgentRunner.php`
## Technical changes
- `detectCommerceIntentForRouting()` now classifies commercial table follow-ups as `product_search` even when no anchor was found in the short history context.
- The anchor check remains visible in the log (`hasHistoryAnchor`) but no longer blocks Shop routing.
- `resolveShopSearchQuery()` uses a dedicated resolver for commercial table follow-ups.
- The resolver first checks the existing commerce history context, then the extended history context, then optionally the full history context.
- The history is searched newest-first, so that the earlier domain-specific table context can still be found after a failed intermediate follow-up.
- If a turn contains indicator/reagent context but no model, the generic `indikator` query is not used immediately. The search first continues through the other turns for one with a model anchor, and only then falls back to the generic query.
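The newest-first scan with a deferred generic fallback can be illustrated with a minimal stand-alone sketch. It is not the `AgentRunner` implementation: the function name `resolveTableFollowUpQuery`, the blank-line turn separator, and the hard-coded query strings are assumptions for illustration, while the two regular expressions are modelled on the patterns in `config/retriex/agent.yaml`.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: derives a shop query from a conversation history.
 * Assumption: turns are separated by one or more blank lines.
 */
function resolveTableFollowUpQuery(string $history): string
{
    // Patterns modelled on indicator_marker_patterns / history_anchor_patterns.
    $indicatorPattern = '/\b(?:Indikatortyp|Indikator(?:en)?|Reagenz(?:ien)?)\b/iu';
    $modelPattern = '/\bTestomat(?:®)?\s+\d{3,4}\b/iu';

    $turns = preg_split('/\R{2,}/u', trim($history)) ?: [];
    $hasIndicatorContext = false;

    // Walk the turns newest-first.
    foreach (array_reverse($turns) as $turn) {
        if (preg_match($indicatorPattern, $turn) !== 1) {
            continue; // no indicator/reagent context in this turn
        }
        $hasIndicatorContext = true;

        if (preg_match($modelPattern, $turn, $matches) === 1) {
            // Assumed template: "{model} indikator".
            return $matches[0] . ' indikator';
        }
        // Indicator context without a model: keep looking at older turns.
    }

    // Generic fallback only after every turn has been inspected.
    return $hasIndicatorContext ? 'indikator' : '';
}
```

With a history in which the newest turn mentions only an indicator and an older turn mentions both an indicator table and `testomat 808`, the sketch returns the model-specific query rather than the generic one.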
## Deliberately not changed
- No new typo list.
- No scoring change.
- No retrieval/vector change.
- No change to the LLM input normalization from p20/p20b.
- No new YAML paths.
## Expected effect
The flow
```text
welche grenzwerte kann der testomat 808 messen
die tabelle mit preisen
```
should trigger a Shop search. Expected Shop query, roughly:
```text
testomat 808 indikator
```
Likewise,
```text
die tabelle mit shop preisen
```
should go into the Shop path.
## Mandatory checks after applying
```bash
bin/console mto:agent:config:validate
bin/console mto:agent:regression:test
bin/console mto:agent:config:audit-source --details
bin/console mto:agent:config:audit-patterns --details
```
## Manual regression tests
```text
was kpstet der indikator
```
```text
ich suche eine preiswerte Loesung zur messung von pH & Chlor fuer mein schwimmbad
```
```text
welche grenzwerte kann der testomat 808 messen
die tabelle mit preisen
```
```text
welche grenzwerte kann der testomat 808 messen
die tabelle mit shop preisen
```


@@ -0,0 +1,87 @@
# RetrieX Patch 20e - Force Commercial Table Follow-up Routing
## Goal
Patch 20e fixes the still-failing referential Shop follow-up after a previously generated table, e.g.:
```text
welche grenzwerte kann der testomat 808 messen
```
followed by:
```text
die tabelle mit preisen
```
or:
```text
die tabelle mit shop preisen
```
These follow-ups must no longer fall into the RAG-only path.
## Cause
The previous p20b/p20c/p20d approaches were still too fragile because the Shop promotion either took effect too late or ran indirectly through the normal commerce-intent and history-anchor path. When that route did not activate, the status remained `Shop-Treffer: nicht angefragt`.
## Changes
- Introduces an explicit `forcedShopSearchQuery` in `AgentRunner`.
- Commercial table follow-ups are detected before the normal commerce intent decision.
- When a matching table/price follow-up is detected, the Shop path is forced.
- The concrete Shop query is derived from the conversation history.
- For Testomat 808 limit-value/indicator tables, a query such as `Testomat 808 indikator` is preferred.
- If no stable context can be derived, the request still stays in the Shop path and uses the configured fallback `indikator` instead of RAG-only.
- The LLM input normalization from p20 is retained.
- No concrete typo lists are introduced.
- No scoring/vector/retrieval change.
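A minimal sketch of the forced path, assuming a simple token-based detector and a list of resolver callbacks that are tried in order. The names `isCommercialTableFollowUp` and `forcedShopQuery`, the tokenization, and the shortened term lists are illustrative assumptions, not the `AgentRunner` code. The point is that a detected table/price follow-up always yields a non-empty query, so it cannot end up RAG-only.

```php
<?php
declare(strict_types=1);

// Shortened term lists modelled on table_terms / commercial_terms in agent.yaml.
const TABLE_TERMS = ['tabelle', 'tabellarisch', 'uebersicht', 'liste', 'auflistung'];
const COMMERCIAL_TERMS = ['preis', 'preise', 'preisen', 'kosten', 'kostet', 'shop', 'shoppreise'];

/** Hypothetical detector: needs one table term AND one commercial term. */
function isCommercialTableFollowUp(string $prompt): bool
{
    // Assumed tokenization: lowercase, split on anything that is not a letter or digit.
    $tokens = preg_split(
        '/[^\p{L}\p{N}]+/u',
        mb_strtolower($prompt, 'UTF-8'),
        -1,
        PREG_SPLIT_NO_EMPTY
    ) ?: [];

    return array_intersect($tokens, TABLE_TERMS) !== []
        && array_intersect($tokens, COMMERCIAL_TERMS) !== [];
}

/**
 * Hypothetical resolver chain: short context, extended history, full history.
 *
 * @param list<callable(): string> $resolvers tried in order
 */
function forcedShopQuery(string $prompt, array $resolvers, string $fallback = 'indikator'): string
{
    if (!isCommercialTableFollowUp($prompt)) {
        return ''; // not forced; the normal commerce intent decision applies
    }

    foreach ($resolvers as $resolve) {
        $query = trim($resolve());
        if ($query !== '') {
            return $query;
        }
    }

    // Last resort: stay in the Shop path with the configured generic query.
    return $fallback;
}
```

An empty return value means "not forced", which matches how `$forcedShopSearchQuery !== ''` is used as the switch in the diff further down.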
## Files changed
- `src/Agent/AgentRunner.php`
- `src/Config/AgentRunnerConfig.php`
- `src/Config/RetriexEffectiveConfigProvider.php`
- `config/retriex/agent.yaml`
## Mandatory checks after applying
```bash
bin/console mto:agent:config:validate
bin/console mto:agent:regression:test
bin/console mto:agent:config:audit-source --details
bin/console mto:agent:config:audit-patterns --details
```
## Manual regression tests
```text
was kpstet der indikator
```
Expected: Shop search is triggered.
```text
ich suche eine preiswerte Lösung zur messung von pH & Chlor für mein schwimmbad
```
Expected: advisory Shop/product search is triggered.
```text
welche grenzwerte kann der testomat 808 messen
```
followed by:
```text
die tabelle mit preisen
```
Expected: Shop search is triggered; query roughly `Testomat 808 indikator`, or at least `indikator`.
```text
die tabelle mit shop preisen
```
Expected: Shop search is likewise triggered, not RAG-only.


@@ -0,0 +1,75 @@
# RetrieX Patch 20f - Explicit Shop Routing Guard
## Purpose
Patch 20f fixes a critical routing regression observed after the p20 series:
```text
shop testomat 808
```
This must always enter the Shop path. After p20e it could still fall back to RAG-only with `Shop-Treffer: nicht angefragt`.
## Root cause
The p20 series added LLM/fuzzy input normalization and several referential follow-up routing helpers. The follow-up fixes were too focused on table/price references and did not provide a hard safety net for explicit user routing terms such as `shop`, `preis`, `kostet`, `suche`, `kaufen`, etc.
If the normal commerce detector or a normalized routing prompt failed to preserve that signal, an explicit Shop request could still fall into the non-commerce/RAG-only branch.
## Change
`AgentRunner` now applies an explicit Shop routing guard before the normal Commerce intent decision:
- Checks the original user prompt and the routing prompt for configured commercial signal terms from YAML.
- If the original prompt explicitly says `shop`/price/search/etc., the routing prompt is restored to the original prompt unless this is a commercial table follow-up.
- Forces `CommerceIntentLite::PRODUCT_SEARCH` for explicit commercial routing signals.
- Keeps the existing p20 LLM input normalization and p20e commercial table follow-up handling.
- Adds logging for forced explicit Shop routing.
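The guard described in the list above can be sketched in isolation as follows. This is a simplified illustration under stated assumptions: the function names, the lowercase-only normalization, and the boolean `$isTableFollowUp` parameter are hypothetical. Only the letter/digit boundary pattern mirrors `containsExplicitShopRoutingSignal()` in the diff.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-alone version of the explicit signal check.
 *
 * @param string[] $signalTerms e.g. values of explicit_commercial_signal_terms
 */
function containsExplicitShopSignal(string $prompt, array $signalTerms): bool
{
    // Assumed normalization: trim and lowercase only.
    $normalized = mb_strtolower(trim($prompt), 'UTF-8');
    if ($normalized === '') {
        return false;
    }

    foreach ($signalTerms as $term) {
        $term = mb_strtolower(trim($term), 'UTF-8');
        if ($term === '') {
            continue;
        }

        // Match the term only when it is not embedded in a longer word or number.
        $pattern = '/(?<![\p{L}\p{N}])' . preg_quote($term, '/') . '(?![\p{L}\p{N}])/u';
        if (preg_match($pattern, $normalized) === 1) {
            return true;
        }
    }

    return false;
}

/**
 * Restores the original prompt when it carries an explicit signal,
 * unless the prompt is a commercial table follow-up.
 *
 * @param string[] $signalTerms
 */
function chooseRoutingPrompt(
    string $original,
    string $normalized,
    array $signalTerms,
    bool $isTableFollowUp
): string {
    return containsExplicitShopSignal($original, $signalTerms) && !$isTableFollowUp
        ? $original
        : $normalized;
}
```

Using letter/digit lookarounds rather than a plain substring test keeps a word such as `Workshop` from being treated as an explicit `shop` signal.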
## Files changed
- `src/Agent/AgentRunner.php`
## Important design note
This is not a new typo list and not a scoring change. It is a routing invariant:
> Explicit Shop/commercial wording must never result in `Shop-Treffer: nicht angefragt`.
The signal terms are still read from existing YAML config (`follow_up_context.explicit_commercial_signal_terms`).
## Recommended checks
```bash
bin/console mto:agent:config:validate
bin/console mto:agent:regression:test
bin/console mto:agent:config:audit-source --details
bin/console mto:agent:config:audit-patterns --details
```
## Manual regression tests
```text
shop testomat 808
```
Expected: Shop search is requested.
```text
was kpstet der indikator
```
Expected: typo normalization/fuzzy routing still turns this into a price/shop follow-up.
```text
welche grenzwerte kann der testomat 808 messen
die tabelle mit preisen
```
Expected: table-price follow-up is routed to Shop using the previous Testomat 808 / indicator context.
```text
ich suche eine preiswerte Lösung zur messung von pH & Chlor für mein schwimmbad
```
Expected: advisory product/shop routing is requested.


@@ -164,6 +164,23 @@ parameters:
history_anchor_patterns:
- '/\bTestomat(?:®)?\s+\d{3,4}\b/iu'
- '/\b(?:Indikatortyp|Indikator|Indikatoren|Reagenz|Reagenzien|Zubehör|Zubehoer)\b/iu'
table_terms:
- tabelle
- tabellarisch
- übersicht
- uebersicht
- liste
- auflistung
commercial_terms:
- preis
- preise
- preisen
- kosten
- kostet
- shop
- shoppreis
- shoppreise
- shopdaten
indicator_marker_patterns:
- '/\b(?:Indikatortyp|Indikator(?:en)?|indicator(?:\s+type)?|Reagenz(?:ien)?)\b/iu'
query_template_with_model: '{model} indikator'


@@ -58,11 +58,13 @@ final readonly class AgentRunner
$sources = [];
$optimizedShopQuery = '';
$shopSearchQuery = '';
$forcedShopSearchQuery = '';
$commerceHistoryContext = '';
$attemptedShopRepair = false;
$usedShopRepair = false;
$shopRepairQueries = [];
$shopSearchAttempted = false;
$explicitShopRoutingForced = false;
$primaryShopSearchHadSystemFailure = false;
$historyNotices = [];
@@ -107,12 +109,56 @@ final readonly class AgentRunner
$this->addSource($sources, $this->agentRunnerConfig->getExternalUrlSourceLabel());
}
$commerceIntent = $this->detectCommerceIntentForRouting(
$originalPromptHasExplicitShopSignal = $this->containsExplicitShopRoutingSignal($originalPrompt);
$routingPromptHasExplicitShopSignal = $this->containsExplicitShopRoutingSignal($routingPrompt);
if (
$originalPromptHasExplicitShopSignal
&& !$this->isCommercialTableFollowUpPrompt($routingPrompt)
) {
// Explicit user routing terms such as "shop" must never be lost
// through LLM normalization before the commerce gate is evaluated.
$routingPrompt = $originalPrompt;
$routingPromptHasExplicitShopSignal = true;
}
$forcedShopSearchQuery = $this->resolveForcedCommercialFollowUpShopQuery(
$routingPrompt,
$userId,
$requestContextHint
);
$explicitShopRoutingForced = $forcedShopSearchQuery === ''
&& ($originalPromptHasExplicitShopSignal || $routingPromptHasExplicitShopSignal);
$commerceIntent = ($forcedShopSearchQuery !== '' || $explicitShopRoutingForced)
? CommerceIntentLite::PRODUCT_SEARCH
: $this->detectCommerceIntentForRouting(
$routingPrompt,
$userId,
$requestContextHint
);
if ($forcedShopSearchQuery !== '') {
$this->agentLogger->info('Forced commercial follow-up into shop routing', [
'userId' => $userId,
'prompt' => $prompt,
'routingPrompt' => $routingPrompt,
'forcedShopSearchQuery' => $forcedShopSearchQuery,
'hasRequestContextHint' => trim($requestContextHint) !== '',
]);
}
if ($explicitShopRoutingForced) {
$this->agentLogger->info('Forced explicit shop signal into commerce routing', [
'userId' => $userId,
'prompt' => $prompt,
'routingPrompt' => $routingPrompt,
'originalPromptHasExplicitShopSignal' => $originalPromptHasExplicitShopSignal,
'routingPromptHasExplicitShopSignal' => $routingPromptHasExplicitShopSignal,
]);
}
yield $this->systemMsg($this->agentRunnerConfig->getRetrieveKnowledgeMessage(), 'think');
$knowledgeRetrievalPrompt = $this->buildKnowledgeRetrievalPrompt(
@@ -171,6 +217,10 @@ final readonly class AgentRunner
$this->addSource($sources, $this->agentRunnerConfig->getConversationHistorySourceLabel());
}
if ($forcedShopSearchQuery !== '') {
$optimizedShopQuery = '';
$shopSearchQuery = $forcedShopSearchQuery;
} else {
$optimizedShopQuery = yield from $this->buildOptimizedShopQuery(
$routingPrompt,
$userId,
@@ -183,6 +233,7 @@ final readonly class AgentRunner
commerceHistoryContext: $commerceHistoryContext,
userId: $userId
);
}
if ($shopSearchQuery === '') {
$this->agentLogger->info('Commerce search skipped because no concrete shop query could be resolved', [
@@ -937,6 +988,76 @@ final readonly class AgentRunner
return (string) ($commerceMeta['intent'] ?? CommerceIntentLite::NONE);
}
private function containsExplicitShopRoutingSignal(string $prompt): bool
{
$normalized = $this->normalizeFollowUpText($prompt);
if ($normalized === '') {
return false;
}
foreach ($this->agentRunnerConfig->getFollowUpExplicitCommercialSignalTerms() as $signal) {
$signal = $this->normalizeFollowUpText($signal);
if ($signal === '') {
continue;
}
$pattern = '/(?<![\p{L}\p{N}])' . preg_quote($signal, '/') . '(?![\p{L}\p{N}])/u';
if (preg_match($pattern, $normalized) === 1) {
return true;
}
}
return false;
}
private function resolveForcedCommercialFollowUpShopQuery(
string $prompt,
string $userId,
string $requestContextHint
): string {
if (!$this->isCommercialTableFollowUpPrompt($prompt)) {
return '';
}
$commerceHistoryContext = $this->buildCommerceHistoryContext($userId, $requestContextHint);
$query = $this->resolveCommercialTableFollowUpShopQuery($commerceHistoryContext, $userId);
if ($query !== '' && !$this->isMetaOnlyShopQuery($query)) {
return $query;
}
$contextQuery = $this->extractContextualShopSearchQuery($commerceHistoryContext);
if ($contextQuery !== '' && !$this->isMetaOnlyShopQuery($contextQuery)) {
return $contextQuery;
}
$extendedHistoryBudget = $this->agentRunnerConfig->getShopQueryContextFallbackHistoryBudgetChars();
if ($extendedHistoryBudget > mb_strlen($commerceHistoryContext, 'UTF-8')) {
$extendedHistory = $this->contextService->buildUserContextWithinBudget($userId, $extendedHistoryBudget);
$contextQuery = $this->extractContextualShopSearchQuery($extendedHistory);
if ($contextQuery !== '' && !$this->isMetaOnlyShopQuery($contextQuery)) {
return $contextQuery;
}
}
if ($this->agentRunnerConfig->shouldUseFullHistoryForShopQueryContextFallback()) {
$fullHistory = $this->contextService->buildUserContext($userId, true);
$contextQuery = $this->extractContextualShopSearchQuery($fullHistory);
if ($contextQuery !== '' && !$this->isMetaOnlyShopQuery($contextQuery)) {
return $contextQuery;
}
}
// Last-resort fallback for explicit commercial table follow-ups.
// This keeps the request in the shop path instead of falling back to RAG-only.
return trim($this->agentRunnerConfig->getCommercialTableFollowUpQueryTemplateWithoutModel());
}
private function detectCommerceIntentForRouting(
string $prompt,
string $userId,
@@ -954,13 +1075,10 @@ final readonly class AgentRunner
$commerceHistoryContext = $this->buildCommerceHistoryContext($userId, $requestContextHint);
if (!$this->commercialTableFollowUpHistoryHasAnchor($commerceHistoryContext)) {
return $commerceIntent;
}
$this->agentLogger->info('Promoted commercial table follow-up to shop intent', [
'userId' => $userId,
'prompt' => $prompt,
'hasHistoryAnchor' => $this->commercialTableFollowUpHistoryHasAnchor($commerceHistoryContext),
'hasRequestContextHint' => trim($requestContextHint) !== '',
]);
@@ -1149,6 +1267,31 @@ final readonly class AgentRunner
return (string) end($turns);
}
/**
* @return string[]
*/
private function extractHistoryTurnsNewestFirst(string $history): array
{
$history = trim($history);
if ($history === '') {
return [];
}
$parts = preg_split($this->agentRunnerConfig->getFollowUpHistoryTurnSplitPattern(), $history);
if ($parts === false || $parts === []) {
return [$history];
}
$turns = array_values(array_filter(
array_map(static fn(string $part): string => trim($part), $parts),
static fn(string $part): bool => $part !== ''
));
return array_reverse($turns);
}
private function extractFirstTestomatModelAnchor(string $text): string
{
if (preg_match($this->agentRunnerConfig->getFollowUpReferenceAnchorTestomatModelPattern(), $text, $matches) !== 1) {
@@ -1300,7 +1443,10 @@ final readonly class AgentRunner
string $userId
): string {
if ($this->isCommercialTableFollowUpPrompt($prompt)) {
$commercialTableContextQuery = $this->extractCommercialTableFollowUpShopQuery($commerceHistoryContext);
$commercialTableContextQuery = $this->resolveCommercialTableFollowUpShopQuery(
$commerceHistoryContext,
$userId
);
if ($commercialTableContextQuery !== '' && !$this->isMetaOnlyShopQuery($commercialTableContextQuery)) {
return $commercialTableContextQuery;
@@ -1344,25 +1490,51 @@ final readonly class AgentRunner
return '';
}
private function resolveCommercialTableFollowUpShopQuery(string $commerceHistoryContext, string $userId): string
{
$query = $this->extractCommercialTableFollowUpShopQuery($commerceHistoryContext);
if ($query !== '') {
return $query;
}
$extendedHistoryBudget = $this->agentRunnerConfig->getShopQueryContextFallbackHistoryBudgetChars();
if ($extendedHistoryBudget > mb_strlen($commerceHistoryContext, 'UTF-8')) {
$extendedHistory = $this->contextService->buildUserContextWithinBudget($userId, $extendedHistoryBudget);
$query = $this->extractCommercialTableFollowUpShopQuery($extendedHistory);
if ($query !== '') {
return $query;
}
}
if ($this->agentRunnerConfig->shouldUseFullHistoryForShopQueryContextFallback()) {
return $this->extractCommercialTableFollowUpShopQuery(
$this->contextService->buildUserContext($userId, true)
);
}
return '';
}
private function extractCommercialTableFollowUpShopQuery(string $commerceHistoryContext): string
{
if (!$this->agentRunnerConfig->isCommercialTableFollowUpEnabled()) {
return '';
}
$turn = $this->extractLatestHistoryTurn($commerceHistoryContext);
if ($turn === '') {
return '';
}
$hasIndicatorContext = false;
foreach ($this->extractHistoryTurnsNewestFirst($commerceHistoryContext) as $turn) {
if (!$this->matchesAnyConfiguredPattern(
$turn,
$this->agentRunnerConfig->getCommercialTableFollowUpIndicatorMarkerPatterns()
)) {
return '';
continue;
}
$hasIndicatorContext = true;
$model = $this->extractFirstTestomatModelAnchor($turn);
if ($model !== '') {
@@ -1374,20 +1546,51 @@ final readonly class AgentRunner
return trim((string) preg_replace('/\s+/u', ' ', $query));
}
}
if ($hasIndicatorContext) {
return trim($this->agentRunnerConfig->getCommercialTableFollowUpQueryTemplateWithoutModel());
}
return '';
}
private function isCommercialTableFollowUpPrompt(string $prompt): bool
{
if (!$this->agentRunnerConfig->isCommercialTableFollowUpEnabled()) {
return false;
}
return $this->matchesAnyConfiguredPattern(
$this->normalizeFollowUpText($prompt),
$normalized = $this->normalizeFollowUpText($prompt);
if ($this->matchesAnyConfiguredPattern(
$normalized,
$this->agentRunnerConfig->getCommercialTableFollowUpPromptPatterns()
);
)) {
return true;
}
$tokens = $this->tokenizeMetaGuardText($normalized);
if ($tokens === []) {
return false;
}
$hasTableReference = count(array_intersect(
$tokens,
$this->agentRunnerConfig->getCommercialTableFollowUpTableTerms()
)) > 0;
if (!$hasTableReference) {
return false;
}
foreach ($this->agentRunnerConfig->getCommercialTableFollowUpCommercialTerms() as $term) {
if (in_array($this->normalizeFollowUpText($term), $tokens, true)) {
return true;
}
}
return false;
}
private function commercialTableFollowUpHistoryHasAnchor(string $commerceHistoryContext): bool


@@ -76,6 +76,22 @@ final class AgentRunnerConfig
return $this->getRequiredStringList('follow_up_context.commercial_table_follow_up.history_anchor_patterns');
}
/**
* @return string[]
*/
public function getCommercialTableFollowUpTableTerms(): array
{
return $this->getRequiredStringList('follow_up_context.commercial_table_follow_up.table_terms');
}
/**
* @return string[]
*/
public function getCommercialTableFollowUpCommercialTerms(): array
{
return $this->getRequiredStringList('follow_up_context.commercial_table_follow_up.commercial_terms');
}
/**
* @return string[]
*/


@@ -446,6 +446,8 @@ final readonly class RetriexEffectiveConfigProvider
'enabled' => $this->agentRunnerConfig->isCommercialTableFollowUpEnabled(),
'prompt_patterns' => $this->agentRunnerConfig->getCommercialTableFollowUpPromptPatterns(),
'history_anchor_patterns' => $this->agentRunnerConfig->getCommercialTableFollowUpHistoryAnchorPatterns(),
'table_terms' => $this->agentRunnerConfig->getCommercialTableFollowUpTableTerms(),
'commercial_terms' => $this->agentRunnerConfig->getCommercialTableFollowUpCommercialTerms(),
'indicator_marker_patterns' => $this->agentRunnerConfig->getCommercialTableFollowUpIndicatorMarkerPatterns(),
'query_template_with_model' => $this->agentRunnerConfig->getCommercialTableFollowUpQueryTemplateWithModel(),
'query_template_without_model' => $this->agentRunnerConfig->getCommercialTableFollowUpQueryTemplateWithoutModel(),
@@ -1058,6 +1060,8 @@ final readonly class RetriexEffectiveConfigProvider
$commercialTableFollowUp = is_array($followUpContext['commercial_table_follow_up'] ?? null) ? $followUpContext['commercial_table_follow_up'] : [];
$this->validateRegexPatternList($commercialTableFollowUp['prompt_patterns'] ?? [], 'agent.follow_up_context.commercial_table_follow_up.prompt_patterns', $errors);
$this->validateRegexPatternList($commercialTableFollowUp['history_anchor_patterns'] ?? [], 'agent.follow_up_context.commercial_table_follow_up.history_anchor_patterns', $errors);
$this->validateStringList($this->toList($commercialTableFollowUp['table_terms'] ?? []), 'agent.follow_up_context.commercial_table_follow_up.table_terms', $errors, $warnings);
$this->validateStringList($this->toList($commercialTableFollowUp['commercial_terms'] ?? []), 'agent.follow_up_context.commercial_table_follow_up.commercial_terms', $errors, $warnings);
$this->validateRegexPatternList($commercialTableFollowUp['indicator_marker_patterns'] ?? [], 'agent.follow_up_context.commercial_table_follow_up.indicator_marker_patterns', $errors);
if (trim((string) ($commercialTableFollowUp['query_template_with_model'] ?? '')) === '') {
$errors[] = 'agent.follow_up_context.commercial_table_follow_up.query_template_with_model must not be empty.';