p59 + p60

team 1
2026-05-07 07:52:52 +02:00
parent 56646a0c3b
commit 87c2134e6c
20 changed files with 808 additions and 1256 deletions

@@ -0,0 +1,134 @@
# RetrieX Patch p59 - Complete Genre Wiring
## Goal
This patch continues to complete the genre wiring: domain parameters that were not yet wired, or only indirectly wired, now read preferentially from `config/retriex/genre.yaml`.
Important: p59 does not delete any values from the legacy YAML files. The old paths remain active as fallbacks. The patch is therefore deliberately a pure wiring step that prepares p60.
## Why this step
p55 through p58 built the central single-genre maintenance surface, connected the first runtime flows to `genre.yaml`, and reduced already-wired legacy values in a controlled manner.
p59 extends this genre-first read logic to the remaining domain surfaces, in particular:
- `vocabulary.yaml` include/view/map chains
- `prompt.yaml` domain-specific answer, format, and grounding rules
- `search_repair.yaml` domain-specific search-repair patterns and candidate terms
- `retrieval.yaml` exact-selection and domain-specific retrieval terms
- `language.yaml` `protected_terms`
- `governance.yaml` domain-specific regression/guardrail expectations
- remaining domain-specific intent/routing values for Commerce/Sales
- commercial-table follow-up anchors in `agent.yaml`
## What was changed
### Genre configuration
`config/retriex/genre.yaml` was extended with missing domain values that already existed elsewhere. The values were taken over from the existing legacy paths, not newly invented.
In particular, the following were added:
- shop semantic-search tokens from `vocabulary.views.shop.semantic_search_tokens`
- technical prompt keywords from `vocabulary.views.prompt.technical_product_keywords`
- measurement-evidence maps from `vocabulary.maps.prompt.measurement_evidence_guard`
- safety-doc/safety-word retrieval views from `vocabulary.views.retrieval.*`
### Runtime/config facades
The following config classes now read new domain values preferentially from `GenreConfig` and otherwise fall back to the existing legacy paths:
- `DomainVocabularyConfig`
- `PromptBuilderConfig`
- `SearchRepairConfig`
- `NdjsonHybridRetrieverConfig`
- `LanguageCleanupConfig`
- `GovernanceConfig`
- `SalesIntentConfig`
- `CommerceIntentConfig`
- `AgentRunnerConfig`
### Dependency injection
`config/services.yaml` now also injects `GenreConfig` into the newly wired config facades.
## Not changed
- No deletion of legacy values.
- No reduction of `vocabulary.yaml`, `prompt.yaml`, `search_repair.yaml`, `retrieval.yaml`, `language.yaml`, or `governance.yaml`.
- No new domain defaults in the PHP core.
- No new hard-coded token/string lists in the PHP core.
- No multi-genre/tenant switching.
- No change to `model.yaml`, `vector.yaml`, `index.yaml`, `runtime.yaml`, cache/path configuration, LLM timeouts, embedding models, chunk sizes, technical stream guards, or the general `final_answer_guard`.
- No change to Shopware criteria, ranking, retrieval scoring, or LLM logic.
## Expected behavior
The domain answer behavior should remain unchanged. For the wired values, the following now applies:
1. `genre.yaml` is the preferred source.
2. The existing legacy paths remain as fallbacks.
3. p60 can then safely reduce the remaining duplicated legacy values.
This keeps the target architecture cleanly separated:
- `genre.yaml` states what is relevant to the domain.
- Agent/Commerce/Prompt/Retrieval/Search-Repair/Language/Governance state how these values are processed, validated, or audited.
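The read order described above (genre value first, legacy path as fallback) can be illustrated with a minimal Python sketch. This is not the project's PHP implementation: `get_path`, `resolve`, the sample dictionaries, and the rule that an empty genre value counts as "not set" are all assumptions made for illustration.

```python
from typing import Any

_MISSING = object()  # sentinel: distinguishes "path absent" from a stored None


def get_path(config: dict, dotted_path: str, default: Any = _MISSING) -> Any:
    """Walk a nested dict along a dotted path such as 'language.protected_terms'."""
    node: Any = config
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def resolve(genre: dict, legacy: dict, genre_path: str, legacy_path: str) -> Any:
    """Return the genre value if present and non-empty, else the legacy value."""
    value = get_path(genre, genre_path)
    # Assumption: an absent or empty genre value means "not wired yet".
    if value is not _MISSING and value not in (None, "", [], {}):
        return value
    return get_path(legacy, legacy_path, default=None)


# Hypothetical sample data; the real genre.yaml layout may differ.
genre = {"language": {"protected_terms": ["ph", "redox"]}}
legacy = {"language": {"protected_terms": ["ph"]}}

preferred = resolve(genre, legacy, "language.protected_terms", "language.protected_terms")
fallback = resolve({}, legacy, "language.protected_terms", "language.protected_terms")
```

In this sketch `preferred` takes the genre list, while `fallback` shows the legacy list being used when the genre side has no value. Keeping the fallback branch is what lets p59 ship without touching the legacy files.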
## Local verification
Performed:
```bash
php -l src/Config/DomainVocabularyConfig.php
php -l src/Config/PromptBuilderConfig.php
php -l src/Config/SearchRepairConfig.php
php -l src/Config/NdjsonHybridRetrieverConfig.php
php -l src/Config/LanguageCleanupConfig.php
php -l src/Config/GovernanceConfig.php
php -l src/Config/SalesIntentConfig.php
php -l src/Config/CommerceIntentConfig.php
php -l src/Config/AgentRunnerConfig.php
```
All PHP syntax checks passed.
```bash
python3 - <<'PY'
# YAML parsing for all files under config/retriex/*.yaml
PY
```
All `config/retriex/*.yaml` files parsed successfully.
In addition, a script verified that the newly wired genre values match the existing legacy values. The checks covered, among others:
- vocabulary views and vocabulary maps
- prompt rules and technical keywords
- search-repair patterns and candidate terms
- retrieval safety views and language protected terms
- governance/regression guardrail values
- remaining sales/commerce intent values
- commercial-table follow-up anchors
Result: `RESULT PASS`.
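The actual parity script is not part of these notes. As a rough sketch of what such a check might look like, the following Python compares values at mapped dotted paths. The helper names, the genre-side key `retrieval.looks_like_safety_words`, the shortened term list, and the `RESULT FAIL` wording are assumptions; only the legacy path and the `RESULT PASS` string come from this commit.

```python
from typing import Any

_MISSING = object()  # sentinel for "path not present"


def get_path(config: dict, dotted_path: str) -> Any:
    """Return the value at a dotted path, or _MISSING if any segment is absent."""
    node: Any = config
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def find_mismatches(genre: dict, legacy: dict, mapping: dict) -> list:
    """List (genre_path, legacy_path) pairs that are missing or differ in value."""
    mismatches = []
    for genre_path, legacy_path in mapping.items():
        new_value = get_path(genre, genre_path)
        old_value = get_path(legacy, legacy_path)
        if new_value is _MISSING or old_value is _MISSING or new_value != old_value:
            mismatches.append((genre_path, legacy_path))
    return mismatches


# Hypothetical excerpt; the genre-side key name is an assumption.
legacy = {"vocabulary": {"views": {"retrieval": {"looks_like_safety_words": {
    "add": ["gefahr", "gefahrgut", "clp"]}}}}}
genre = {"retrieval": {"looks_like_safety_words": ["gefahr", "gefahrgut", "clp"]}}
mapping = {
    "retrieval.looks_like_safety_words":
        "vocabulary.views.retrieval.looks_like_safety_words.add",
}

mismatches = find_mismatches(genre, legacy, mapping)
print("RESULT PASS" if not mismatches else f"RESULT FAIL: {mismatches}")
```

The comparison here is order-sensitive for lists. Whether the real check treats term order as significant is not stated in the notes.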
`bin/console` could not be run locally because the ZIP does not contain a `vendor/` directory:
```text
Dependencies are missing. Try running "composer install".
```
## Project checks after applying the patch
```bash
bin/console cache:clear
bin/console mto:agent:config:validate
bin/console mto:agent:regression:test
bin/console mto:agent:config:audit-source --details
bin/console mto:agent:config:audit-patterns --details
```
## Next step
Once p59 is green, p60 can follow: `Legacy Value Reduction Final`. Only there are the remaining duplicated domain values removed from the legacy YAML files or reduced to empty fallback placeholders.

@@ -0,0 +1,133 @@
# RetrieX Patch p60 - Legacy Value Reduction Final
## Goal
p60 reduces the domain legacy values in the remaining YAML files that were still maintained in duplicate after p59.
The domain values remain in `config/retriex/genre.yaml` and, thanks to p59, are already read from there preferentially. The old legacy paths remain in place as a fallback structure but now contain only empty placeholders or neutral no-match regex placeholders.
## Important
- No PHP runtime logic changed.
- No new hard-coded token/string lists in the PHP core.
- `genre.yaml` remains unchanged and still contains the domain values.
- Technical processing layers remain in place.
- Technical configuration such as runtime, model, vector, index, cache, chunking, stream guards, and general final-answer guards was not moved.
## Changed files
- `config/retriex/vocabulary.yaml`
- `config/retriex/search_repair.yaml`
- `config/retriex/retrieval.yaml`
- `config/retriex/language.yaml`
- `config/retriex/prompt.yaml`
- `config/retriex/governance.yaml`
## Reduced areas
### vocabulary.yaml
The classes, views, and maps that `DomainVocabularyConfig` already reads from `genre.yaml` were reduced to empty fallbacks, including:
- Product roles: device/accessory/requested-accessory code classes
- Shop views: device/accessory query, product, focus, and semantic-search tokens
- Prompt views: main-device/accessory/technical-product keywords
- Search-repair views: direct-attribute, candidate, accessory, and specificity lists
- Retrieval views: generic product tokens, short model tokens, reagent/safety/device/document looks-like lists
- Measurement-evidence maps and accessory-focus maps
### prompt.yaml
The domain answer rules that have been read from `genre.yaml` since p59 were reduced:
- `output_priority.technical_rules`
- `response_format.technical_rules`
- `response_format.accessory_rules`
- `fact_grounding.technical_rules`
- `fact_grounding.with_shop_rules`
General prompt structure, labels, budgets, shop-record rules, and technical rendering rules remain unchanged.
### search_repair.yaml
The domain search-repair lists and patterns were reduced:
- `direct_product_attribute_lookup`
- `specific_model_candidate_patterns`
- genre-dependent candidate/accessory/requested-code patterns
For scalar regex fallbacks, the neutral, valid no-match placeholder `/(?!)/u` is used so that YAML/regex validation does not fail on empty strings.
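Why `(?!)` works as a neutral placeholder: it is a negative lookahead with an empty body, and since the empty pattern matches at every position, the lookahead fails at every position. The sketch below shows this with Python's `re` module purely for illustration. The project itself uses PHP/PCRE, where the pattern is written with delimiters and flag as `/(?!)/u`, and the sample strings are arbitrary.

```python
import re

# Python writes the pattern without PHP's /.../u delimiters and flag.
NO_MATCH = re.compile(r"(?!)")  # empty negative lookahead: fails at every position
EMPTY = re.compile(r"")         # empty pattern: matches at every position

samples = ["testomat 808 0,02", "ph", ""]  # arbitrary inputs, including empty

# Inputs for which each pattern finds a match somewhere in the string.
no_match_hits = [s for s in samples if NO_MATCH.search(s) is not None]
empty_hits = [s for s in samples if EMPTY.search(s) is not None]
```

Here `no_match_hits` stays empty while `empty_hits` contains every sample. This suggests a second reason to avoid empty strings as regex fallbacks: where an engine accepts an empty pattern at all, it matches everything rather than nothing.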
### retrieval.yaml
The exact-selection domain terms were reduced:
- `exact_selection_token_variant_prefixes`
- indicator question/phrase/table patterns
- required primary/context terms
Technical retrieval parameters and non-genre-specific retrieval settings remain unchanged.
### language.yaml
The legacy `protected_terms` were reduced to an empty fallback.
Generic stopword groups, cleanup profiles, normalization, ASCII transliteration, separators, and dash equivalents deliberately stay in `language.yaml`, because they belong to the language/processing layer rather than the pure genre maintenance surface.
### governance.yaml
The domain regression/guardrail expectations were reduced:
- `regression_baseline.*` domain tokens, values, keyword expectations, and shop-query expectations
- `language.protected_stopword_terms`
- `core_pattern_audit.domain_marker_terms`
Technical audit rules such as source roots, excluded paths, suspicious calls, allowed literal patterns, and snippet limits remain in `governance.yaml`.
## Local validation
Executed:
```bash
python3 - <<'PY'
# YAML parse for config/retriex/*.yaml
PY
```
Result: YAML parse OK.
Additional check:
- 83 reduced legacy paths checked.
- All reduced paths contain empty placeholders or valid no-match regex fallbacks.
- `genre.yaml` was not changed.
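The additional check above could be approximated as follows. This is a hedged sketch, not the script that was actually run: the helper names are invented, the `governance` excerpt mirrors a few reduced keys from this commit's diff, and `search_repair["example_pattern"]` is a hypothetical key used only to show the regex placeholder case.

```python
from typing import Any

NO_MATCH_PLACEHOLDER = "/(?!)/u"  # placeholder named in the p60 notes
_MISSING = object()               # sentinel for "path not present"


def get_path(config: dict, dotted_path: str) -> Any:
    """Return the value at a dotted path, or _MISSING if any segment is absent."""
    node: Any = config
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def is_reduced(value: Any) -> bool:
    """True for an empty list/dict/string or the no-match regex placeholder."""
    if isinstance(value, (list, dict, str)) and len(value) == 0:
        return True
    return value == NO_MATCH_PLACEHOLDER


def unreduced_paths(config: dict, paths: list) -> list:
    """Return the paths that are missing or still carry real values."""
    return [p for p in paths if not is_reduced(get_path(config, p))]


governance = {
    "regression_baseline": {
        "protected_short_model_tokens": [],
        "protected_retrieval_device_word_groups": {},
        "shop_prompt_regression_original_query": "",
    },
    "language": {"protected_stopword_terms": []},
}
search_repair = {"example_pattern": NO_MATCH_PLACEHOLDER}  # hypothetical key

governance_paths = [
    "regression_baseline.protected_short_model_tokens",
    "regression_baseline.protected_retrieval_device_word_groups",
    "regression_baseline.shop_prompt_regression_original_query",
    "language.protected_stopword_terms",
]
remaining = unreduced_paths(governance, governance_paths)
remaining += unreduced_paths(search_repair, ["example_pattern"])
```

A missing path is reported as unreduced on purpose, because the notes state that the legacy paths stay in place as a fallback structure.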
Not executable in this ZIP:
```bash
php bin/console mto:agent:config:validate
```
Reason:
```text
Dependencies are missing. Try running "composer install".
```
## Recommended checks after applying the patch
```bash
bin/console cache:clear
bin/console mto:agent:config:validate
bin/console mto:agent:regression:test
bin/console mto:agent:config:audit-source --details
bin/console mto:agent:config:audit-patterns --details
```
## Next patch
After a green p60:
- p61 Genre Source-of-Truth Guard
p61 should then enforce that new domain lists do not end up outside `genre.yaml` again and that legacy paths stay empty or fallback-only.

@@ -100,6 +100,7 @@ parameters:
  - vocabulary.classes.agent_shop_current_input_preservation_terms
  - vocabulary.classes.agent_shop_context_anchor_trigger_terms
  - agent.shop_runtime.query_cleanup.current_input_preservation.terms
+ - vocabulary.views.shop.semantic_search_tokens.add
  - agent.shop_runtime.query_cleanup.stopword_cleanup.terms
  - agent.shop_runtime.result_identity.compound_prefix_match.terms
  - agent.shop_runtime.result_identity.primary_identity_repair.stop_terms
@@ -116,10 +117,14 @@ parameters:
  - prompt.rules.response_format_accessory
  - prompt.rules.fact_grounding_technical
  - prompt.rules.fact_grounding_with_shop
+ - vocabulary.views.prompt.technical_product_keywords.add
  - vocabulary.views.prompt.measurement_evidence_guard.accessory_lookup_guard_terms.add
  - vocabulary.views.prompt.measurement_evidence_guard.accessory_lookup_passthrough_terms.add
  - vocabulary.views.prompt.measurement_evidence_guard.generic_positive_context_terms.add
  - vocabulary.views.prompt.measurement_evidence_guard.generic_negative_context_terms.add
+ - vocabulary.maps.prompt.measurement_evidence_guard.request_terms
+ - vocabulary.maps.prompt.measurement_evidence_guard.positive_terms
+ - vocabulary.maps.prompt.measurement_evidence_guard.non_equivalent_terms
  search_repair:
  description: Genre-specific repair tokens, candidate patterns and exact identifier behavior.
  paths:
@@ -148,8 +153,10 @@ parameters:
  - retrieval.vocabulary.important_short_model_tokens
  - retrieval.vocabulary.family_descriptor_tokens
  - retrieval.vocabulary.looks_like_reagent_tokens
+ - retrieval.vocabulary.looks_like_safety_docs
  - retrieval.vocabulary.looks_like_device_words
  - retrieval.vocabulary.looks_like_document_words
+ - retrieval.vocabulary.looks_like_safety_words
  - retrieval.exact_selection_token_variant_prefixes
  - retrieval.exact_selection_indicator_question_tokens
  - retrieval.exact_selection_indicator_question_phrases
@@ -1232,6 +1239,51 @@ parameters:
  - handmessgeraete
  - messkoffer
  - koffer
+ semantic_shop_search_tokens:
+ source_paths:
+ - vocabulary.views.shop.semantic_search_tokens.add
+ terms:
+ - indikator
+ - indicator
+ - reagenz
+ - reagent
+ - zubehör
+ - zubehor
+ - ersatzteil
+ - anschlusskabel
+ - kabel
+ - sensorkabel
+ - elektrodenkabel
+ - verbrauchsmaterial
+ - chemie
+ - indikatorchemie
+ - reagenzchemie
+ - kit
+ - set
+ - filter
+ - pumpe
+ - pumpenkopf
+ - motorblock
+ - lösung
+ - loesung
+ - solution
+ - puffer
+ - pufferlösung
+ - pufferloesung
+ - kalibrierpuffer
+ - kalibrierlösung
+ - kalibrierloesung
+ - teststreifen
+ - gerät
+ - geraet
+ - messgerät
+ - messgeraet
+ - analysegerät
+ - analysegeraet
+ - analysator
+ - monitor
+ - controller
+ - system
  direct_answer:
  source_paths:
  - agent.shop_runtime.direct_answer.intro
@@ -1341,6 +1393,55 @@ parameters:
  - '- Never rename a role-incompatible accessory shop record into a main device in headings, summaries, or shop-hit lines.'
  - '- If the user asks for the price or availability of a referenced accessory, indicator, reagent, kit, set, or consumable, use commercial fields only from a shop result that clearly matches that accessory identity and code.'
  - '- For such accessory price follow-ups, do not answer with the price, URL, product number, or availability of the main device or of unrelated reagents; if no matching accessory shop item is present, say that the price is not available in the provided shop data.'
+ prompt_keyword_views:
+ source_paths:
+ - vocabulary.views.prompt.technical_product_keywords.add
+ technical_product_keywords:
+ - technisch
+ - technical
+ - produkt
+ - product
+ - gerät
+ - device
+ - modell
+ - model
+ - messprinzip
+ - measurement principle
+ - schnittstelle
+ - interface
+ - relais
+ - relay
+ - indikator
+ - indicator
+ - grenzwert
+ - threshold
+ - messbereich
+ - measurement range
+ - gemessen
+ - measured
+ - minimaler
+ - minimum
+ - resthärte
+ - resthaerte
+ - °dh
+ - dh
+ - spannung
+ - voltage
+ - strom
+ - current
+ - druck
+ - pressure
+ - temperatur
+ - temperature
+ - schutzart
+ - ip
+ - fehlercode
+ - error code
+ - wasserhärte
+ - hardness
+ - testomat
+ - chlor
+ - chlormessung
  measurement_evidence_guard_terms:
  source_paths:
  - vocabulary.views.prompt.measurement_evidence_guard.accessory_lookup_guard_terms.add
@@ -1391,6 +1492,54 @@ parameters:
  - Einsatzbedingungen
  - störungsfrei
  - stoerungsfrei
+ measurement_evidence_maps:
+ source_paths:
+ - vocabulary.maps.prompt.measurement_evidence_guard.request_terms
+ - vocabulary.maps.prompt.measurement_evidence_guard.positive_terms
+ - vocabulary.maps.prompt.measurement_evidence_guard.non_equivalent_terms
+ request_terms:
+ ph:
+ - ph
+ - ph-wert
+ - ph wert
+ redox:
+ - redox
+ - orp
+ - oxidations-reduktionspotential
+ - oxidations reduktionspotential
+ free_chlorine:
+ - freies chlor
+ - freiem chlor
+ - freien chlor
+ - free chlorine
+ positive_terms:
+ ph:
+ - pH
+ - pH-Wert
+ - ph wert
+ redox:
+ - Redox
+ - ORP
+ - Oxidations-Reduktionspotential
+ - Oxidations Reduktionspotential
+ free_chlorine:
+ - freies Chlor
+ - freiem Chlor
+ - freien Chlor
+ - free chlorine
+ non_equivalent_terms:
+ ph:
+ - p-Wert
+ - p Wert
+ - m-Wert
+ - minus m-Wert
+ - Alkalität
+ - Säurekapazität
+ - mmol/l
+ free_chlorine:
+ - Chlor gesamt
+ - Gesamtchlor
+ - total chlorine
  search_repair:
  description: Current search repair tokens, candidate patterns and exact identifier helpers.
  direct_product_attribute_lookup:
@@ -1539,8 +1688,10 @@ parameters:
  - vocabulary.views.retrieval.important_short_model_tokens.add
  - vocabulary.views.retrieval.family_descriptor_tokens.add
  - vocabulary.views.retrieval.looks_like_reagent_tokens.add
+ - vocabulary.views.retrieval.looks_like_safety_docs.add
  - vocabulary.views.retrieval.looks_like_device_words.add
  - vocabulary.views.retrieval.looks_like_document_words.add
+ - vocabulary.views.retrieval.looks_like_safety_words.add
  generic_product_tokens:
  - produkt
  - produkte
@@ -1616,6 +1767,20 @@ parameters:
  - kerzenfilter
  - druckregler
  - ph
+ looks_like_safety_docs:
+ - sicherheitsdatenblatt
+ - sdb
+ - msds
+ - gefahrenbewertung
+ - gefahrenpiktogramm
+ - signalwort
+ - lagerung
+ - transport
+ - clp
+ - kennzeichnung
+ - h290
+ - pbt
+ - vpvb
  looks_like_device_words:
  - geraet
  - gerät
@@ -1642,6 +1807,16 @@ parameters:
  - sdb
  - sicherheitsdatenblatt
  - msds
+ looks_like_safety_words:
+ - gefahr
+ - gefahrgut
+ - clp
+ - h290
+ - sicherheit
+ - kennzeichnung
+ - transport
+ - lagerung
+ - piktogramm
  exact_selection:
  source_paths:
  - retrieval.exact_selection_token_variant_prefixes

@@ -4,79 +4,22 @@
  parameters:
  retriex.governance.config:
  regression_baseline:
- protected_short_model_tokens:
- - th
- - tc
- - tp
- - tm
- - ph
- - rx
- protected_measurement_values:
- - '0,02'
- - '0,05'
- - '0,1'
- - '0,25'
- - '0,5'
- - '1,0'
- - '2,0'
- - '2,5'
- - '5,0'
- protected_technical_prompt_keywords:
- - testomat
- - indikator
- - grenzwert
- - messbereich
- - gemessen
- technical_priority_required_markers:
- - runner-up
- - second-lowest
- - comparison
- protected_accessory_prompt_keywords:
- - indikator
- - reagenz
- protected_search_repair_specificity_terms:
- - indikator
- - testomat
- - reagenz
- protected_retrieval_reagent_words:
- - indikator
- - reagenz
- protected_retrieval_device_word_groups:
- geraet:
- - geraet
- - gerät
- shop_prompt_regression_original_query: 'testomat 808 0,02'
- shop_prompt_required_output_instruction_markers:
- - 'Output only the final search query.'
- - 'Output format:'
- shop_query_meta_guard_terms:
- - shop
- - suche
- shop_query_context_fallback_filter_terms:
- - welchem
- - kann
- - messen
- shop_query_current_input_preservation_terms:
- - ph
- - redox
- # Protected vocabulary tokens fall back to
- # regression_baseline.protected_short_model_tokens.
- # Add vocabulary.protected_short_model_tokens only for an explicit override.
+ protected_short_model_tokens: []
+ protected_measurement_values: []
+ protected_technical_prompt_keywords: []
+ technical_priority_required_markers: []
+ protected_accessory_prompt_keywords: []
+ protected_search_repair_specificity_terms: []
+ protected_retrieval_reagent_words: []
+ protected_retrieval_device_word_groups: {}
+ shop_prompt_regression_original_query: ''
+ shop_prompt_required_output_instruction_markers: []
+ shop_query_meta_guard_terms: []
+ shop_query_context_fallback_filter_terms: []
+ shop_query_current_input_preservation_terms: []
  vocabulary: {}
  language:
- protected_stopword_terms:
- - nicht
- - kein
- - keine
- - welche
- - testomat
- - indikator
- - indikatortyp
- - ph
- - rx
- - th
- - tc
- - '0,02'
+ protected_stopword_terms: []
  required_cleanup_profiles:
  - commerce_query
  - rag_evidence
@@ -131,30 +74,7 @@ parameters:
  - in_array
  - array_intersect
  - array_intersect_key
- domain_marker_terms:
- - testomat
- - indikator
- - indikatortyp
- - grenzwert
- - messbereich
- - reagenz
- - reagens
- - shop
- - produkt
- - artikel
- - kaufen
- - bestellen
- - geraet
- - gerät
- - messgerät
- - messgeraet
- - analysegerät
- - analysegeraet
- - analysator
- - wasserhärte
- - wasserhaerte
- - chlor
- - redox
+ domain_marker_terms: []
  allowed_literal_patterns:
  - path: src/Knowledge/Retrieval/NdjsonChunkLookup.php
  pattern: '/Produkt\\s\+Titel/iu'

@@ -55,22 +55,7 @@ parameters:
  # Legacy key `words` above remains the runtime-compatible default list.
  # Cleanup profiles are the preferred home for generic language noise.
  # Domain configs should only keep domain-specific overrides.
- protected_terms:
- - nicht
- - kein
- - keine
- - welche
- - testomat
- - indikator
- - indikatortyp
- - ph
- - rx
- - redox
- - orp
- - th
- - tc
- - '0,02'
+ protected_terms: []
  normalization:
  # Generic language normalization tables. Keep these in YAML so PHP code
  # executes normalization logic without owning language-specific lists.

@@ -176,13 +176,7 @@ parameters:
  - '- For product-selection questions such as which device can measure or monitor a parameter, use relevant live shop results as a fallback when retrieved knowledge does not identify a matching product.'
  - '- If shop results are present, use them afterwards to add current price, availability, and the actual URL.'
  - '- Do not let bundles, accessories, or service items override a better technical match unless the user explicitly asks for them.'
- technical_rules:
- - '- For technical questions, answer the exact requested fact first and keep it as the main answer.'
- - '- If one source chunk contains both the best matching value and nearby comparison values, use the nearby values only as context and do not include them unless the user asks for comparison or alternatives.'
- - '- For lowest/highest/minimum/maximum questions, answer only the requested extreme value and the product/device explicitly connected to it.'
- - '- Do not add runner-up products, second-lowest values, adjacent ranges, broader tables, or explanatory comparisons unless explicitly requested.'
- - '- For a product recommendation tied to an exact numeric value, keep the recommendation anchored to records that contain that exact value/unit. Do not pull indicator codes or ranges from records for other products.'
- - '- If the user asks for a suitable device/product and not for an indicator, do not add indicator names unless a same-record device-value-indicator mapping is visible.'
+ technical_rules: []
  numeric_value_focus:
  enabled: true
  max_values: 3
@@ -270,23 +264,8 @@ parameters:
  - '- For uncertain technical suitability from shop hits, use a short section like "Shop-Treffer (technische Eignung nicht sicher belegt)" and list only exact shop fields. Do not add a technical explanation or recommendation.'
  without_shop_rules:
  - '- If no shop results are present, do not compensate by inventing external products or external manufacturers.'
- technical_rules:
- - '- Write like technical documentation: precise, neutral, and source-close.'
- - '- Prefer exact values, ranges, thresholds, compatibility notes, and application areas over general explanation.'
- - '- For direct follow-up questions about an indicator, value, threshold, or device, answer the resolved mapping first before any table or explanation.'
- - '- If the sources only support a negative finding, output only that negative finding and do not add speculative alternatives.'
- - '- For product-selection answers, keep the answer minimal: suitable product if explicitly supported, exact evidence, current shop fields if same product identity is clear. Do not add sections for Vorteile, Einsatzbereiche, Messprinzip, or Hinweise unless directly asked and explicitly sourced.'
- - '- For product-selection answers tied to a numeric value/range, do not include an indicator field unless the same retrieved record explicitly connects the selected product, numeric value/range, and indicator code.'
- accessory_rules:
- - '- If the user directly asks for accessories, cables, electrodes, buffers, kits, sets, indicators, reagents, or consumables, answer the accessory request first instead of reframing it as a request for a measuring device.'
- - '- For direct accessory shop searches, do not introduce Testomat, measuring-device, or main-device caveats unless the user asks for a device or the provided sources explicitly require a device context.'
- - '- For direct accessory shop searches with matching shop hits, never begin with a missing-device statement; begin with the accessory hits or a short shop-only fallback sentence.'
- - '- If the shop product name itself explicitly contains the requested accessory type and parameter, such as pH/Redox, treat it as a commercial accessory match and list the exact shop fields. Do not demand separate proof that the accessory itself measures the parameter.'
- - '- If the user asks for a matching accessory for a named main device, separate the answer into: main device and matching accessory.'
- - '- The main device must come first only when a main device is explicitly requested or named.'
- - '- Only name an accessory as matching if compatibility is explicitly grounded in the provided sources.'
- - '- Do not call accessories, indicators, reagents, kits, sets, or consumables a device, measuring device, or main product unless the source explicitly says
- so.'
+ technical_rules: []
+ accessory_rules: []
  language:
  rules:
  - '- Answer only in the same language as the user question.'
@@ -313,82 +292,11 @@ parameters:
  - '- If the sources do not identify a suitable product, do not invent one.'
  - '- Do not turn absence of evidence into a broad portfolio statement. Use scoped wording tied to the provided sources and current search results.'
  - '- Strong negative terms such as "ausschließlich", "keines", "nicht geeignet", or "gibt es nicht" require explicit source support for the full stated scope.'
- with_shop_rules:
- - '- Use shop data as highest priority for current commercial fields: price, availability, URL, current shop-visible naming, and explicitly shop-visible product suitability for product-selection questions.'
- - '- Use retrieved knowledge as highest priority for technical matching, thresholds, measurement principles, and technical explanation when it contains a matching product or fact.'
- - '- If retrieved knowledge is silent or only contains unrelated products, but live shop results explicitly match the requested parameter/application, use the shop results and do not answer with a negative RAG-only conclusion.'
- - '- If the user asks for Schwimmbad, Schwimmbecken, Pool, or typo-like pool wording, a product may only be recommended for that application when the same RAG or SHOP PRODUCT RECORD explicitly names that application. Chlor measurement alone is not proof of swimming-pool suitability.'
- - '- If a product record proves Chlor measurement but not Schwimmbad, Schwimmbecken or Pool use, say exactly that distinction and avoid recommendation wording such as empfiehlt sich, geeignet für Schwimmbad, Anwendung im Schwimmbad, or für Schwimmbäder.'
- - '- For product-selection questions, a shop result proves technical suitability only when the same SHOP PRODUCT RECORD explicitly states the requested measurement parameter, application, or compatibility. Search ranking, generated query terms, generic category matches, and similar wording are not proof.'
- - '- If the requested parameter appears only in the generated shop query, metadata, unrelated highlights, or another product record, treat suitability as unverified and say that the shop hit requires technical verification.'
- - '- Do not convert p-Wert, m-Wert, minus m-Wert, alkalinity, acid capacity, or other water-treatment parameters into pH or pH-Wert unless the same source explicitly says pH or pH-Wert.'
- - '- When shop results are present and relevant, include current price and the actual URL if available.'
- - '- If the shop data does not provide a positive price for a result, do not output any price for that result.'
- - '- Do not let accessories, bundles, or service items override a technically better product match unless the user explicitly asks for them.'
- - '- Do not call accessories, indicators, reagents, kits, sets, or consumables a device, measuring device, or main product unless the source explicitly says
- so.'
- - '- Do not claim that an accessory is required, necessary, used for calibration, or sets the measurement range unless this is explicitly stated in the provided
- sources.'
- - '- Do not assign the product number, price, URL, or availability of a reagent, accessory, kit, set, consumable, or service item to a device identified in
- retrieved knowledge.'
- - '- Only use commercial fields for the main product when the shop item and the technically identified product clearly refer to the same product identity.'
- - '- If the shop match is ambiguous, keep the technical identification and commercial details separate.'
- - '- Shop product names are authoritative for their own shop URL, product number, price, availability, image, description, and metadata.'
- - '- Do not rewrite a shop record heading with a similar device name from retrieved knowledge. If identities differ or are uncertain, separate the RAG device from the shop hit.'
- - '- If the user asks for a main device, measuring device, analyzer, system, or measuring installation, do not present an accessory, indicator, reagent, kit, set, consumable, or service item as the requested main solution.'
- - '- If the user asks for an accessory, indicator, reagent, consumable, kit, or solution with a specific measurement parameter, do not replace the requested parameter with another parameter. A hardness indicator is not a valid answer to a pH-indicator request unless the same source explicitly states pH measurement, pH determination, pH measuring range, or an equivalent parameter-specific purpose.'
- - '- Mentions of operating conditions, allowed sample pH, reagent-solution pH, storage pH, or pH values at a temperature are not measurement-parameter evidence by themselves.'
- - '- If the only shop hit is role-incompatible with the requested product role, state that no matching main-device shop hit is available in the provided shop data; mention the incompatible hit only as a separate accessory/consumable hit if useful.'
- - '- If a SHOP PRODUCT RECORD says Commercial fields suppressed, do not output its price, availability, URL, product number, image, or metadata anywhere in the answer.'
- - '- Never write shop-hit lines such as price, availability, URL, product number, or Shop-Treffer below a RAG device unless the same exact SHOP PRODUCT RECORD names that device as the exact shop product.'
- - '- Never rename a role-incompatible accessory shop record into a main device in headings, summaries, or shop-hit lines.'
- - '- If the user asks for the price or availability of a referenced accessory, indicator, reagent, kit, set, or consumable, use commercial fields only from a shop result that clearly matches that accessory identity and code.'
- - '- For such accessory price follow-ups, do not answer with the price, URL, product number, or availability of the main device or of unrelated reagents; if no matching accessory shop item is present, say that the price is not available in the provided shop data.'
+ with_shop_rules: []
 without_shop_rules:
 - '- Use retrieved knowledge as authoritative for factual answers.'
 - '- If no shop results are present, do not compensate with external recommendations or external product suggestions.'
-technical_rules:
-- '- For technical product questions, answer primarily with explicitly stated facts.'
-- '- For measurement-parameter questions, do not treat similar or neighboring abbreviations as equivalent. In particular, p-Wert is not pH-Wert unless the source explicitly says pH or pH-Wert.'
-- '- Do not invent or infer measurement principles, methods, calibration functions, benefits, advantages, application areas, or alternative products from product family names, search rank, or shop query wording.'
-- '- Behave like a technical documentation assistant, not like a sales advisor.'
-- '- Keep interpretations minimal and do not generalize application areas beyond the provided sources.'
-- '- Do not describe benefits, consequences, risks, or operational outcomes unless they are explicitly stated in the sources.'
-- '- Do not translate technical facts into business value unless the source explicitly does so.'
-- '- Do not recommend process changes unless explicitly present in the source.'
-- '- Do not use persuasive summaries or advisory conclusions.'
-- '- If the retrieved knowledge describes one specific named product, stay within that product and do not merge related product families or variants.'
-- '- Use neutral engineering language.'
-- '- Do not name specific chemicals, indicator substances, standards, or mechanisms unless explicitly stated in the source.'
-- '- If the source states signal logic such as green/red, output that signal logic only and do not expand it into operational recommendations or alarm semantics
-unless explicitly stated.'
-- '- If the source lists application areas, repeat only those areas and do not broaden them.'
-- '- If the source names an indicator and threshold, reproduce that exactly without extrapolation.'
-- '- For lowest, highest, smallest, largest, minimum, maximum, Grenzwert, Messbereich or Aufloesung questions, first identify the exact numeric extreme from
-the retrieved knowledge and answer that value directly.'
-- '- For lowest/highest/minimum/maximum questions, answer only the requested extreme unless the user explicitly asks for a comparison or alternatives.'
-- '- For direct numeric lookup questions such as which device measures a given threshold, answer with the exact matching device/value pair first and avoid advisory
-caveats.'
-- '- For product recommendations based on an exact numeric value, use only same-record evidence for the recommended product. Do not import indicator names, ranges, or variants from higher-ranked records that describe different products.'
-- '- Do not add the runner-up product, second-lowest value, or adjacent range unless the user asks for it.'
-- '- Do not add calibration, accuracy, pretreatment, temperature, or application notes unless those exact notes are requested and explicitly present in the
-retrieved source.'
-- '- For follow-up questions such as "which indicator measures that value", first resolve the referenced value/device, then use the retrieved source entry that
-explicitly connects value, device and indicator.'
-- '- For direct follow-up indicator/value questions, start with the exact mapping in one sentence, for example: Der Wert 0,02 °dH wird beim Testomat 808 mit Indikatortyp 300 gemessen.'
-- '- Do not output the full indicator table, measurement principle, application areas, or advisory notes unless the user explicitly asks for all indicators, details, a table, or device information.'
-- '- For numeric extreme questions, do not combine a value, device name, indicator name, range or product variant from different chunks unless the same retrieved
-entry explicitly connects them.'
-- '- For exact-value product recommendations, if the retrieved record only supports product plus value/range, answer product plus value/range only; indicator details require an explicit same-record product-value-indicator mapping.'
-- '- If several devices or indicators are present, keep each device-indicator-range assignment separate and do not transfer an indicator from one product to
-another.'
-- '- For Testomat CAL or Testomat 2000 CAL threshold/range questions, use only source entries that explicitly name CAL or Testomat 2000 CAL in the same product record. Do not answer with Testomat 808 indicator ranges or the generic 0,02 °dH to 5 °dH range unless a CAL source record explicitly contains that exact assignment.'
-- '- If the retrieved CAL-specific source records do not explicitly state numeric CAL threshold values, say that the exact CAL Grenzwerte are not belegbar from the provided sources instead of giving 0,02 °dH to 5 °dH as a typical range.'
-- '- For Testomat CAL, do not transfer plain numeric 808 indicator types such as 300, 300 S, 301, 302, 303, 305, 310, 320, 330, or 350. CAL indicator statements must remain CAL-specific, for example TH-prefixed indicator codes when those are explicitly present in the CAL source.'
-- '- If the source states only a threshold function, do not expand it into broader control logic.'
-- '- If a detail is not explicitly stated in the provided sources, say so plainly.'
-- '- Prefer short, source-close sentences over explanatory expansion.'
-- '- If the sources only support that a product family is not suitable, output only that unsuitability and stop there.'
+technical_rules: []
 retrieved_knowledge:
 source_line: 'Source: Documents'
 url_content:

@@ -36,16 +36,7 @@ parameters:
 - '/\bwelche\b.*\b(gibt|verfügbar|verfuegbar|existieren)\b/u'
 - '/\bzeige\b.*\b(produkte|geraete|geräte|modelle|artikel)\b/u'
 - '/\bwas\b.*\b(gibt es|verfügbar|verfuegbar)\b/u'
-exact_selection_token_variant_prefixes:
-indikator:
-- indikator
-- indikatortyp
-grenzwert:
-- grenzwert
-messbereich:
-- messbereich
-testomat:
-- testomat
+exact_selection_token_variant_prefixes: {}
 exact_selection_token_variant_suffixes:
 - typen
 - innen
@@ -57,27 +48,13 @@ parameters:
 - e
 - s
 - n
-exact_selection_indicator_question_tokens:
-- indikator
-- indikatortyp
-- reagenz
-- reagens
-- chemie
-exact_selection_indicator_question_phrases:
-- mit welchem
-- womit
-exact_selection_indicator_table_heading_patterns:
-- '/verf(?:ü|ue)gbare\s+indikatortypen|indikatortypen|indikatorvarianten/iu'
-exact_selection_indicator_table_header_patterns:
-- '/\|\s*(?:typ|indikator)\s*\|\s*(?:grenzwert|messbereich|bereich)/iu'
-exact_selection_indicator_table_row_patterns:
-- '/\|\s*[A-Z]{0,4}\s*\d{2,4}\s*[A-Z]?\s*\|\s*\d/iu'
-exact_selection_indicator_table_required_primary_terms:
-- indikator
-exact_selection_indicator_table_required_context_terms:
-- grenzwert
-- messbereich
-- bereich
+exact_selection_indicator_question_tokens: []
+exact_selection_indicator_question_phrases: []
+exact_selection_indicator_table_heading_patterns: []
+exact_selection_indicator_table_header_patterns: []
+exact_selection_indicator_table_row_patterns: []
+exact_selection_indicator_table_required_primary_terms: []
+exact_selection_indicator_table_required_context_terms: []
 exact_detail_tokens:
 - indikator
 - indikatoren

@@ -5,18 +5,7 @@ parameters:
 strict_requested_accessory_code_repair: true
 prefer_prompt_anchored_model_for_requested_accessory_code: true
-direct_product_attribute_lookup:
-enabled: true
-min_query_tokens_after_cleanup: 2
-# Query repair must stay on the requested product/accessory type for
-# direct attribute lookups. It may relax comparative constraints, but it
-# must not expand to unrelated RAG model/device candidates.
-# Direct product/accessory stop terms are resolved from
-# config/retriex/vocabulary.yaml view search_repair.direct_product_attribute_stop_terms.
-# A local stop_terms list may still be added here as an explicit project override.
-comparative_constraint_patterns:
-- '/\b(?:länger|laenger|kürzer|kuerzer|größer|groesser|kleiner|über|ueber|unter|mindestens|maximal|maximum|minimum|ab|bis|mehr\s+als|weniger\s+als)\s+(?P<value>\d+(?:[,.]\d+)?\s*[\p{L}µ°%]*)\b/iu'
+direct_product_attribute_lookup: {}
 requested_accessory_code_fallback_query_templates:
 - '{term} {code}'
 # Requested-accessory code terms are resolved from
@@ -26,13 +15,7 @@ parameters:
 # explicit project overrides.
 requested_accessory_code_proximity_window: 1600
-specific_model_candidate_patterns:
-- '/\b([A-Za-zÄÖÜäöüß][A-Za-zÄÖÜäöüß®\-]*(?:\s+[A-Za-zÄÖÜäöüß0-9][A-Za-zÄÖÜäöüß0-9®\-]*){0,3}\s+\d{2,5}(?:\s+[A-ZÄÖÜ]{1,8})?)\b/u'
-# Model-candidate exclude terms are resolved from
-# config/retriex/vocabulary.yaml view search_repair.model_candidate_exclude_terms.
-# A local model_candidate_exclude_terms list may still be added here as an
-# explicit project override.
+specific_model_candidate_patterns: []
 limits:
 top_product_log_limit: 3
@@ -55,12 +38,12 @@ parameters:
 numeric_token_match_score: 4
 patterns:
-model_candidate: '/\b([A-Za-zÄÖÜäöüß][A-Za-zÄÖÜäöüß®\-]*(?:\s+[A-Za-zÄÖÜäöüß][A-Za-zÄÖÜäöüß®\-]*){0,2}\s+\d{2,5}[A-Za-z0-9\-]*)\b/u'
-accessory_candidate_template: '/\b((?:{terms})\s+\d{1,5}[A-Za-z0-9\-]*)\b/iu'
-requested_accessory_code: '/\b(?:indikator(?:typ)?|indicator(?:\s*type)?|reagenz|reagent)\s*([A-Za-z]{0,3}\s*\d{1,5}[A-Za-z0-9\-]*)\b/iu'
-accessory_or_bundle_template: '/\b({terms})\b/iu'
-model_like: '/\b[A-Za-zÄÖÜäöüß][A-Za-zÄÖÜäöüß®\-]*(?:\s+[A-Za-zÄÖÜäöüß][A-Za-zÄÖÜäöüß®\-]*){0,2}\s+\d{2,5}[A-Za-z0-9\-]*\b/u'
-specificity_boost_template: '/\b(?:{terms})\b/iu'
+model_candidate: '/(?!)/u'
+accessory_candidate_template: '/(?!)/u'
+requested_accessory_code: '/(?!)/u'
+accessory_or_bundle_template: '/(?!)/u'
+model_like: '/(?!)/u'
+specificity_boost_template: '/(?!)/u'
 contains_digit: '/\d/u'
 whitespace_collapse: '/\s+/u'
 tokenize_cleanup: '/[^\p{L}\p{N}\s\-]+/u'

File diff suppressed because it is too large

@@ -138,6 +138,7 @@ services:
 arguments:
 $config: '%retriex.prompt.config%'
 $vocabulary: '@App\Config\DomainVocabularyConfig'
+$genreConfig: '@App\Config\GenreConfig'
 App\Config\AgentRunnerConfig:
 arguments:
@@ -149,6 +150,7 @@ services:
 arguments:
 $config: '%retriex.retrieval.config%'
 $vocabulary: '@App\Config\DomainVocabularyConfig'
+$genreConfig: '@App\Config\GenreConfig'
 App\Config\StopWordsConfig:
 arguments:
@@ -157,6 +159,7 @@ services:
 App\Config\LanguageCleanupConfig:
 arguments:
 $config: '%retriex.stopwords.config%'
+$genreConfig: '@App\Config\GenreConfig'
 App\Config\QueryEnricherConfig:
 arguments:
@@ -166,6 +169,7 @@ services:
 App\Config\GovernanceConfig:
 arguments:
 $config: '%retriex.governance.config%'
+$genreConfig: '@App\Config\GenreConfig'
 App\Config\ShopServiceConfig:
 arguments:
@@ -234,6 +238,7 @@ services:
 $maxRepairQueries: '%retriex.commerce.search_repair.max_queries%'
 $minPrimaryResultsWithoutRepair: '%retriex.commerce.search_repair.min_primary_results_without_repair%'
 $config: '%retriex.search_repair.config%'
+$genreConfig: '@App\Config\GenreConfig'
 App\Commerce\SearchRepairService: ~
@@ -244,6 +249,7 @@ services:
 App\Config\SalesIntentConfig:
 arguments:
 $config: '%retriex.intent.sales.config%'
+$genreConfig: '@App\Config\GenreConfig'
 App\Shopware\ShopwareCriteriaBuilder: ~

@@ -86,9 +86,9 @@ input, textarea, select {
 overflow-y: auto;
 padding: 1rem;
 background: #121a25;
-border: 1px solid var(--border);
+/*border: 1px solid var(--border);*/
 border-radius: 6px 6px 0 0;
-box-shadow: 0px 0px 20px #ffffff26;
+/*box-shadow: 0px 0px 20px #ffffff26;*/
 }
 .message {

@@ -75,7 +75,8 @@ final class AgentRunnerConfig
 */
 public function getCommercialTableFollowUpHistoryAnchorPatterns(): array
 {
-return $this->getRequiredStringList('follow_up_context.commercial_table_follow_up.history_anchor_patterns');
+return $this->genreStringList('context_resolution.commercial_table_follow_up.history_anchor_patterns')
+?: $this->getRequiredStringList('follow_up_context.commercial_table_follow_up.history_anchor_patterns');
 }
 /**
@@ -102,17 +103,20 @@ final class AgentRunnerConfig
 */
 public function getCommercialTableFollowUpIndicatorMarkerPatterns(): array
 {
-return $this->getRequiredStringList('follow_up_context.commercial_table_follow_up.indicator_marker_patterns');
+return $this->genreStringList('context_resolution.commercial_table_follow_up.indicator_marker_patterns')
+?: $this->getRequiredStringList('follow_up_context.commercial_table_follow_up.indicator_marker_patterns');
 }
 public function getCommercialTableFollowUpQueryTemplateWithModel(): string
 {
-return $this->getRequiredString('follow_up_context.commercial_table_follow_up.query_template_with_model');
+return $this->genreString('context_resolution.commercial_table_follow_up.query_template_with_model')
+?: $this->getRequiredString('follow_up_context.commercial_table_follow_up.query_template_with_model');
 }
 public function getCommercialTableFollowUpQueryTemplateWithoutModel(): string
 {
-return $this->getRequiredString('follow_up_context.commercial_table_follow_up.query_template_without_model');
+return $this->genreString('context_resolution.commercial_table_follow_up.query_template_without_model')
+?: $this->getRequiredString('follow_up_context.commercial_table_follow_up.query_template_without_model');
 }
 public function getFollowUpHistoryQuestionPattern(): string

@@ -116,7 +116,7 @@ final class CommerceIntentConfig
 {
 return $this->renderPatternTemplate('patterns.size_extraction_template', [
 'size_pattern' => $this->getSizePattern(),
-]);
+], 'product_attributes.size_and_color_terms.patterns.size_extraction_template');
 }
 /** @return string[] */
@@ -148,21 +148,21 @@ final class CommerceIntentConfig
 {
 return $this->renderPatternTemplate('patterns.size_value_template', [
 'size_pattern' => $this->getSizePattern(),
-]);
+], 'product_attributes.size_and_color_terms.patterns.size_value_template');
 }
 public function getSizeTokenValuePattern(): string
 {
 return $this->renderPatternTemplate('patterns.size_token_value_template', [
 'size_token_pattern' => $this->getSizeTokenPattern(),
-]);
+], 'product_attributes.size_and_color_terms.patterns.size_token_value_template');
 }
 public function getColorValuePattern(): string
 {
 return $this->renderPatternTemplate('patterns.color_value_template', [
 'color_pattern' => $this->getColorPattern(),
-]);
+], 'product_attributes.size_and_color_terms.patterns.color_value_template');
 }
 public function getSupportOrDiagnosticSignalLabel(): string
@@ -257,7 +257,8 @@ final class CommerceIntentConfig
 public function getModelLikeProductPattern(): string
 {
-return $this->requiredString('patterns.model_like_product');
+return $this->genreConfig?->getValueString('intent_and_routing.commerce_intent.model_like_product_pattern')
+?: $this->requiredString('patterns.model_like_product');
 }
 public function getModelLikeProductSignalLabel(): string
@@ -327,9 +328,12 @@ final class CommerceIntentConfig
 /**
 * @param array<string, string> $replacements
 */
-private function renderPatternTemplate(string $key, array $replacements): string
+private function renderPatternTemplate(string $key, array $replacements, ?string $genrePath = null): string
 {
+$template = $genrePath !== null ? ($this->genreConfig?->getValueString($genrePath) ?? '') : '';
+if ($template === '') {
 $template = $this->requiredString($key);
+}
 $replace = [];
 foreach ($replacements as $placeholder => $value) {
 $replace['{' . $placeholder . '}'] = $value;

@@ -32,6 +32,21 @@ final class DomainVocabularyConfig
 'search_repair.direct_product_type_terms' => 'product_attributes.direct_attribute_cleanup.product_type_terms',
 'search_repair.direct_product_attribute_stop_terms' => 'product_attributes.direct_attribute_cleanup.stop_terms',
 'search_repair.requested_accessory_code_terms' => 'product_roles.requested_accessory_code_terms.terms',
+'search_repair.generic_candidate_tokens' => 'search_repair.candidate_terms.generic_candidate_tokens',
+'search_repair.accessory_candidate_terms' => 'search_repair.candidate_terms.accessory_candidate_terms',
+'search_repair.accessory_or_bundle_terms' => 'search_repair.candidate_terms.accessory_or_bundle_terms',
+'search_repair.specificity_boost_terms' => 'search_repair.candidate_terms.specificity_boost_terms',
+'shop.semantic_search_tokens' => 'shop_query_runtime.semantic_shop_search_tokens.terms',
+'retrieval.generic_product_tokens' => 'retrieval_and_language.retrieval_vocabulary_views.generic_product_tokens',
+'retrieval.important_short_model_tokens' => 'retrieval_and_language.retrieval_vocabulary_views.important_short_model_tokens',
+'retrieval.family_descriptor_tokens' => 'retrieval_and_language.retrieval_vocabulary_views.family_descriptor_tokens',
+'retrieval.looks_like_reagent_tokens' => 'retrieval_and_language.retrieval_vocabulary_views.looks_like_reagent_tokens',
+'retrieval.looks_like_safety_docs' => 'retrieval_and_language.retrieval_vocabulary_views.looks_like_safety_docs',
+'retrieval.looks_like_reagent_words' => 'retrieval_and_language.retrieval_vocabulary_views.looks_like_reagent_words',
+'retrieval.looks_like_document_words' => 'retrieval_and_language.retrieval_vocabulary_views.looks_like_document_words',
+'retrieval.looks_like_safety_words' => 'retrieval_and_language.retrieval_vocabulary_views.looks_like_safety_words',
+'retrieval.looks_like_device_words' => 'retrieval_and_language.retrieval_vocabulary_views.looks_like_device_words',
+'prompt.technical_product_keywords' => 'result_identity_and_answer_policy.prompt_keyword_views.technical_product_keywords',
 'agent.rag_evidence_guard.accessory_lookup_guard_terms' => 'result_identity_and_answer_policy.measurement_evidence_guard_terms.accessory_lookup_guard_terms',
 'agent.rag_evidence_guard.accessory_lookup_passthrough_terms' => 'result_identity_and_answer_policy.measurement_evidence_guard_terms.accessory_lookup_passthrough_terms',
 'agent.rag_evidence_guard.generic_positive_context_terms' => 'result_identity_and_answer_policy.measurement_evidence_guard_terms.generic_positive_context_terms',
@@ -41,6 +56,13 @@ final class DomainVocabularyConfig
 private const MAP_GENRE_VALUE_PATHS = [
 'shop.accessory_focus_variants' => 'brands_and_canonical_terms.accessory_focus_variants.map',
 'agent.rag_evidence_guard.synonyms' => 'brands_and_canonical_terms.rag_evidence_synonyms.map',
+'prompt.measurement_evidence_guard.request_terms' => 'result_identity_and_answer_policy.measurement_evidence_maps.request_terms',
+'prompt.measurement_evidence_guard.positive_terms' => 'result_identity_and_answer_policy.measurement_evidence_maps.positive_terms',
+'prompt.measurement_evidence_guard.non_equivalent_terms' => 'result_identity_and_answer_policy.measurement_evidence_maps.non_equivalent_terms',
+];
+private const VIEW_GENRE_INCLUDE_CLASS_PATHS = [
+'search_repair.model_candidate_exclude_terms' => 'search_repair.candidate_terms.model_candidate_exclude_terms',
 ];
 public function __construct(
@@ -181,13 +203,28 @@ final class DomainVocabularyConfig
 /** @return string[] */
 private function genreStringListForView(string $path): array
 {
-if ($this->genreConfig === null || !isset(self::VIEW_GENRE_VALUE_PATHS[$path])) {
+if ($this->genreConfig === null) {
 return [];
 }
+if (isset(self::VIEW_GENRE_VALUE_PATHS[$path])) {
 return $this->genreConfig->getValueStringList(self::VIEW_GENRE_VALUE_PATHS[$path]);
 }
+if (!isset(self::VIEW_GENRE_INCLUDE_CLASS_PATHS[$path])) {
+return [];
+}
+$terms = [];
+foreach ($this->genreConfig->getValueStringList(self::VIEW_GENRE_INCLUDE_CLASS_PATHS[$path]) as $className) {
+foreach ($this->domainClass($className) as $term) {
+$terms[] = $term;
+}
+}
+return $this->uniqueStringList($terms);
+}
 /** @return array<string, string[]> */
 private function genreStringListMapForMap(string $path): array
 {

@@ -9,8 +9,10 @@ final class GovernanceConfig
 /**
 * @param array<string, mixed> $config
 */
-public function __construct(private readonly array $config = [])
-{
+public function __construct(
+private readonly array $config = [],
+private readonly ?GenreConfig $genreConfig = null,
+) {
 }
 /** @return array<string, mixed> */
@@ -22,49 +24,57 @@ final class GovernanceConfig
 /** @return string[] */
 public function getRegressionProtectedShortModelTokens(): array
 {
-return $this->requiredStringList('regression_baseline.protected_short_model_tokens');
+return $this->genreStringList('governance_and_regression.regression_baseline.protected_short_model_tokens')
+?: $this->requiredStringList('regression_baseline.protected_short_model_tokens');
 }
 /** @return string[] */
 public function getRegressionProtectedMeasurementValues(): array
 {
-return $this->requiredStringList('regression_baseline.protected_measurement_values');
+return $this->genreStringList('governance_and_regression.regression_baseline.protected_measurement_values')
+?: $this->requiredStringList('regression_baseline.protected_measurement_values');
 }
 /** @return string[] */
 public function getRegressionProtectedTechnicalPromptKeywords(): array
 {
-return $this->requiredStringList('regression_baseline.protected_technical_prompt_keywords');
+return $this->genreStringList('governance_and_regression.regression_baseline.protected_technical_prompt_keywords')
+?: $this->requiredStringList('regression_baseline.protected_technical_prompt_keywords');
 }
 /** @return string[] */
 public function getRegressionTechnicalPriorityRequiredMarkers(): array
 {
-return $this->requiredStringList('regression_baseline.technical_priority_required_markers');
+return $this->genreStringList('governance_and_regression.regression_baseline.technical_priority_required_markers')
+?: $this->requiredStringList('regression_baseline.technical_priority_required_markers');
 }
 /** @return string[] */
 public function getRegressionProtectedAccessoryPromptKeywords(): array
 {
-return $this->requiredStringList('regression_baseline.protected_accessory_prompt_keywords');
+return $this->genreStringList('governance_and_regression.regression_baseline.protected_accessory_prompt_keywords')
+?: $this->requiredStringList('regression_baseline.protected_accessory_prompt_keywords');
 }
 /** @return string[] */
 public function getRegressionProtectedSearchRepairSpecificityTerms(): array
 {
-return $this->requiredStringList('regression_baseline.protected_search_repair_specificity_terms');
+return $this->genreStringList('governance_and_regression.regression_baseline.protected_search_repair_specificity_terms')
+?: $this->requiredStringList('regression_baseline.protected_search_repair_specificity_terms');
 }
 /** @return string[] */
 public function getRegressionProtectedRetrievalReagentWords(): array
 {
-return $this->requiredStringList('regression_baseline.protected_retrieval_reagent_words');
+return $this->genreStringList('governance_and_regression.regression_baseline.protected_retrieval_reagent_words')
+?: $this->requiredStringList('regression_baseline.protected_retrieval_reagent_words');
 }
 /** @return array<string, string[]> */
 public function getRegressionProtectedRetrievalDeviceWordGroups(): array
 {
-$value = $this->requiredValue('regression_baseline.protected_retrieval_device_word_groups');
+$value = $this->genreArray('governance_and_regression.regression_baseline.protected_retrieval_device_word_groups')
+?: $this->requiredValue('regression_baseline.protected_retrieval_device_word_groups');
 if (!is_array($value)) {
 throw $this->invalid('regression_baseline.protected_retrieval_device_word_groups', 'must be a map of string lists');
 }
@@ -99,31 +109,36 @@ final class GovernanceConfig
 public function getRegressionShopPromptOriginalQuery(): string
 {
-return $this->requiredString('regression_baseline.shop_prompt_regression_original_query');
+return $this->genreString('governance_and_regression.regression_baseline.shop_prompt_regression_original_query')
+?: $this->requiredString('regression_baseline.shop_prompt_regression_original_query');
 }
 /** @return string[] */
 public function getRegressionShopPromptRequiredOutputInstructionMarkers(): array
 {
-return $this->requiredStringList('regression_baseline.shop_prompt_required_output_instruction_markers');
+return $this->genreStringList('governance_and_regression.regression_baseline.shop_prompt_required_output_instruction_markers')
+?: $this->requiredStringList('regression_baseline.shop_prompt_required_output_instruction_markers');
 }
 /** @return string[] */
 public function getRegressionShopQueryMetaGuardTerms(): array
 {
-return $this->requiredStringList('regression_baseline.shop_query_meta_guard_terms');
+return $this->genreStringList('governance_and_regression.regression_baseline.shop_query_meta_guard_terms')
+?: $this->requiredStringList('regression_baseline.shop_query_meta_guard_terms');
 }
 /** @return string[] */
 public function getRegressionShopQueryContextFallbackFilterTerms(): array
 {
-return $this->requiredStringList('regression_baseline.shop_query_context_fallback_filter_terms');
+return $this->genreStringList('governance_and_regression.regression_baseline.shop_query_context_fallback_filter_terms')
+?: $this->requiredStringList('regression_baseline.shop_query_context_fallback_filter_terms');
 }
 /** @return string[] */
 public function getRegressionShopQueryCurrentInputPreservationTerms(): array
 {
-return $this->requiredStringList('regression_baseline.shop_query_current_input_preservation_terms');
+return $this->genreStringList('governance_and_regression.regression_baseline.shop_query_current_input_preservation_terms')
+?: $this->requiredStringList('regression_baseline.shop_query_current_input_preservation_terms');
 }
 /** @return string[] */
@@ -138,7 +153,8 @@ final class GovernanceConfig
 /** @return string[] */
 public function getLanguageProtectedStopwordTerms(): array
 {
-return $this->requiredStringList('language.protected_stopword_terms');
+return $this->genreStringList('retrieval_and_language.protected_terms.terms')
?: $this->requiredStringList('language.protected_stopword_terms');
} }
/** @return string[] */ /** @return string[] */
@@ -241,7 +257,8 @@ final class GovernanceConfig
/** @return string[] */ /** @return string[] */
public function getCorePatternAuditDomainMarkerTerms(): array public function getCorePatternAuditDomainMarkerTerms(): array
{ {
return $this->requiredStringList('core_pattern_audit.domain_marker_terms'); return $this->genreStringList('governance_and_regression.core_pattern_audit.domain_marker_terms')
?: $this->requiredStringList('core_pattern_audit.domain_marker_terms');
} }
/** @return array<int, array{path:string, pattern:string, reason:string}> */ /** @return array<int, array{path:string, pattern:string, reason:string}> */
@@ -288,6 +305,23 @@ final class GovernanceConfig
return $this->requiredInt('core_pattern_audit.max_snippet_length', 20); return $this->requiredInt('core_pattern_audit.max_snippet_length', 20);
} }
/** @return string[] */
private function genreStringList(string $path): array
{
return $this->genreConfig?->getValueStringList($path) ?? [];
}
private function genreString(string $path): string
{
return $this->genreConfig?->getValueString($path) ?? '';
}
/** @return array<int|string, mixed> */
private function genreArray(string $path): array
{
return $this->genreConfig?->getValueArray($path) ?? [];
}
private function requiredInt(string $path, int $min = PHP_INT_MIN): int private function requiredInt(string $path, int $min = PHP_INT_MIN): int
{ {
$value = $this->requiredValue($path); $value = $this->requiredValue($path);


@@ -18,8 +18,10 @@ final class LanguageCleanupConfig
/** /**
* @param array<string, mixed> $config * @param array<string, mixed> $config
*/ */
public function __construct(private readonly array $config) public function __construct(
{ private readonly array $config,
private readonly ?GenreConfig $genreConfig = null,
) {
} }
/** @return string[] */ /** @return string[] */
@@ -31,7 +33,8 @@ final class LanguageCleanupConfig
/** @return string[] */ /** @return string[] */
public function getProtectedTerms(): array public function getProtectedTerms(): array
{ {
return $this->requiredTopLevelStringList('protected_terms'); return $this->genreConfig?->getValueStringList('retrieval_and_language.protected_terms.terms')
?: $this->requiredTopLevelStringList('protected_terms');
} }
public function isProtectedTerm(string $term): bool public function isProtectedTerm(string $term): bool


@@ -14,6 +14,7 @@ final class NdjsonHybridRetrieverConfig
public function __construct( public function __construct(
private array $config = [], private array $config = [],
private ?DomainVocabularyConfig $vocabulary = null, private ?DomainVocabularyConfig $vocabulary = null,
private ?GenreConfig $genreConfig = null,
) { ) {
} }
@@ -146,7 +147,8 @@ final class NdjsonHybridRetrieverConfig
/** @return array<string, string[]> */ /** @return array<string, string[]> */
public function exactSelectionTokenVariantPrefixes(): array public function exactSelectionTokenVariantPrefixes(): array
{ {
return $this->requiredStringListMap('exact_selection_token_variant_prefixes'); return $this->genreStringListMap('retrieval_and_language.exact_selection.token_variant_prefixes')
?: $this->requiredStringListMap('exact_selection_token_variant_prefixes');
} }
/** @return string[] */ /** @return string[] */
@@ -158,43 +160,50 @@ final class NdjsonHybridRetrieverConfig
/** @return string[] */ /** @return string[] */
public function exactSelectionIndicatorQuestionTokens(): array public function exactSelectionIndicatorQuestionTokens(): array
{ {
return $this->requiredStringList('exact_selection_indicator_question_tokens'); return $this->genreStringList('retrieval_and_language.exact_selection.indicator_question_tokens')
?: $this->requiredStringList('exact_selection_indicator_question_tokens');
} }
/** @return string[] */ /** @return string[] */
public function exactSelectionIndicatorQuestionPhrases(): array public function exactSelectionIndicatorQuestionPhrases(): array
{ {
return $this->requiredStringList('exact_selection_indicator_question_phrases'); return $this->genreStringList('retrieval_and_language.exact_selection.indicator_question_phrases')
?: $this->requiredStringList('exact_selection_indicator_question_phrases');
} }
/** @return string[] */ /** @return string[] */
public function exactSelectionIndicatorTableHeadingPatterns(): array public function exactSelectionIndicatorTableHeadingPatterns(): array
{ {
return $this->requiredStringList('exact_selection_indicator_table_heading_patterns'); return $this->genreStringList('retrieval_and_language.exact_selection.indicator_table_heading_patterns')
?: $this->requiredStringList('exact_selection_indicator_table_heading_patterns');
} }
/** @return string[] */ /** @return string[] */
public function exactSelectionIndicatorTableHeaderPatterns(): array public function exactSelectionIndicatorTableHeaderPatterns(): array
{ {
return $this->requiredStringList('exact_selection_indicator_table_header_patterns'); return $this->genreStringList('retrieval_and_language.exact_selection.indicator_table_header_patterns')
?: $this->requiredStringList('exact_selection_indicator_table_header_patterns');
} }
/** @return string[] */ /** @return string[] */
public function exactSelectionIndicatorTableRowPatterns(): array public function exactSelectionIndicatorTableRowPatterns(): array
{ {
return $this->requiredStringList('exact_selection_indicator_table_row_patterns'); return $this->genreStringList('retrieval_and_language.exact_selection.indicator_table_row_patterns')
?: $this->requiredStringList('exact_selection_indicator_table_row_patterns');
} }
/** @return string[] */ /** @return string[] */
public function exactSelectionIndicatorTableRequiredPrimaryTerms(): array public function exactSelectionIndicatorTableRequiredPrimaryTerms(): array
{ {
return $this->requiredStringList('exact_selection_indicator_table_required_primary_terms'); return $this->genreStringList('retrieval_and_language.exact_selection.indicator_table_required_primary_terms')
?: $this->requiredStringList('exact_selection_indicator_table_required_primary_terms');
} }
/** @return string[] */ /** @return string[] */
public function exactSelectionIndicatorTableRequiredContextTerms(): array public function exactSelectionIndicatorTableRequiredContextTerms(): array
{ {
return $this->requiredStringList('exact_selection_indicator_table_required_context_terms'); return $this->genreStringList('retrieval_and_language.exact_selection.indicator_table_required_context_terms')
?: $this->requiredStringList('exact_selection_indicator_table_required_context_terms');
} }
/** @return string[] */ /** @return string[] */
@@ -370,6 +379,23 @@ final class NdjsonHybridRetrieverConfig
]; ];
} }
/** @return string[] */
private function genreStringList(string $path): array
{
return $this->genreConfig?->getValueStringList($path) ?? [];
}
/** @return array<string, string[]> */
private function genreStringListMap(string $path): array
{
$value = $this->genreConfig?->getValueArray($path) ?? [];
if ($value === []) {
return [];
}
return $this->normalizeStringListMap($value);
}
private function requiredInt(string $key, int $min = PHP_INT_MIN, ?int $max = null): int private function requiredInt(string $key, int $min = PHP_INT_MIN, ?int $max = null): int
{ {
$value = $this->requiredValue($key); $value = $this->requiredValue($key);
@@ -458,6 +484,35 @@ final class NdjsonHybridRetrieverConfig
return $out; return $out;
} }
/** @return array<string, string[]> */
private function normalizeStringListMap(array $value): array
{
$out = [];
foreach ($value as $mapKey => $items) {
if (!is_string($mapKey) || trim($mapKey) === '' || !is_array($items)) {
continue;
}
$cleanItems = [];
foreach ($items as $item) {
if (!is_scalar($item)) {
continue;
}
$item = trim((string) $item);
if ($item !== '' && !in_array($item, $cleanItems, true)) {
$cleanItems[] = $item;
}
}
if ($cleanItems !== []) {
$out[trim($mapKey)] = $cleanItems;
}
}
return $out;
}
/** /**
* @return array<string, string[]> * @return array<string, string[]>
*/ */


@@ -12,6 +12,7 @@ final class PromptBuilderConfig
public function __construct( public function __construct(
private readonly array $config = [], private readonly array $config = [],
private readonly ?DomainVocabularyConfig $vocabulary = null, private readonly ?DomainVocabularyConfig $vocabulary = null,
private readonly ?GenreConfig $genreConfig = null,
) { ) {
} }
@@ -160,6 +161,12 @@ final class PromptBuilderConfig
return $out; return $out;
} }
/** @return string[] */
private function getGenreStringList(string $path): array
{
return $this->genreConfig?->getValueStringList($path) ?? [];
}
/** /**
* @return string[] * @return string[]
*/ */
@@ -392,7 +399,8 @@ final class PromptBuilderConfig
*/ */
public function getOutputPriorityTechnicalRules(): array public function getOutputPriorityTechnicalRules(): array
{ {
return $this->getRequiredStringList('output_priority.technical_rules'); return $this->getGenreStringList('result_identity_and_answer_policy.prompt_rules.output_priority_technical')
?: $this->getRequiredStringList('output_priority.technical_rules');
} }
public function getFallbackEscalationSectionLabel(): string public function getFallbackEscalationSectionLabel(): string
@@ -468,7 +476,8 @@ final class PromptBuilderConfig
*/ */
public function getResponseFormatTechnicalRules(): array public function getResponseFormatTechnicalRules(): array
{ {
return $this->getRequiredStringList('response_format.technical_rules'); return $this->getGenreStringList('result_identity_and_answer_policy.prompt_rules.response_format_technical')
?: $this->getRequiredStringList('response_format.technical_rules');
} }
/** /**
@@ -476,7 +485,8 @@ final class PromptBuilderConfig
*/ */
public function getResponseFormatAccessoryRules(): array public function getResponseFormatAccessoryRules(): array
{ {
return $this->getRequiredStringList('response_format.accessory_rules'); return $this->getGenreStringList('result_identity_and_answer_policy.prompt_rules.response_format_accessory')
?: $this->getRequiredStringList('response_format.accessory_rules');
} }
public function getLanguageRulesSectionLabel(): string public function getLanguageRulesSectionLabel(): string
@@ -510,7 +520,8 @@ final class PromptBuilderConfig
*/ */
public function getFactGroundingWithShopRules(): array public function getFactGroundingWithShopRules(): array
{ {
return $this->getRequiredStringList('fact_grounding.with_shop_rules'); return $this->getGenreStringList('result_identity_and_answer_policy.prompt_rules.fact_grounding_with_shop')
?: $this->getRequiredStringList('fact_grounding.with_shop_rules');
} }
/** /**
@@ -526,7 +537,8 @@ final class PromptBuilderConfig
*/ */
public function getFactGroundingTechnicalRules(): array public function getFactGroundingTechnicalRules(): array
{ {
return $this->getRequiredStringList('fact_grounding.technical_rules'); return $this->getGenreStringList('result_identity_and_answer_policy.prompt_rules.fact_grounding_technical')
?: $this->getRequiredStringList('fact_grounding.technical_rules');
} }
public function getRetrievedKnowledgeSectionLabel(): string public function getRetrievedKnowledgeSectionLabel(): string


@@ -15,8 +15,10 @@ final class SalesIntentConfig
/** /**
* @param array<string, mixed> $config * @param array<string, mixed> $config
*/ */
public function __construct(private readonly array $config) public function __construct(
{ private readonly array $config,
private readonly ?GenreConfig $genreConfig = null,
) {
} }
public function getDominanceDelta(): int public function getDominanceDelta(): int
@@ -32,31 +34,42 @@ final class SalesIntentConfig
/** @return string[] */ /** @return string[] */
public function getSalesSignals(): array public function getSalesSignals(): array
{ {
return $this->requiredStringList('sales_signals'); return $this->genreStringList('intent_and_routing.sales_intent.sales_signals')
?: $this->requiredStringList('sales_signals');
} }
/** @return string[] */ /** @return string[] */
public function getComparisonSignals(): array public function getComparisonSignals(): array
{ {
return $this->requiredStringList('comparison_signals'); return $this->genreStringList('intent_and_routing.sales_intent.comparison_signals')
?: $this->requiredStringList('comparison_signals');
} }
/** @return string[] */ /** @return string[] */
public function getObjectionSignals(): array public function getObjectionSignals(): array
{ {
return $this->requiredStringList('objection_signals'); return $this->genreStringList('intent_and_routing.sales_intent.objection_signals')
?: $this->requiredStringList('objection_signals');
} }
/** @return string[] */ /** @return string[] */
public function getImplementationSignals(): array public function getImplementationSignals(): array
{ {
return $this->requiredStringList('implementation_signals'); return $this->genreStringList('intent_and_routing.sales_intent.implementation_signals')
?: $this->requiredStringList('implementation_signals');
} }
/** @return string[] */ /** @return string[] */
public function getRoiSignals(): array public function getRoiSignals(): array
{ {
return $this->requiredStringList('roi_signals'); return $this->genreStringList('intent_and_routing.sales_intent.roi_signals')
?: $this->requiredStringList('roi_signals');
}
/** @return string[] */
private function genreStringList(string $path): array
{
return $this->genreConfig?->getValueStringList($path) ?? [];
} }
private function requiredNonNegativeInt(string $key): int private function requiredNonNegativeInt(string $key): int
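
Reviewer note: the getters above all follow the same genre-first shape, where a list read from `GenreConfig` is used if it is non-empty and the legacy key is read otherwise. The sketch below isolates that shape. `FakeGenreConfig`, `FallbackDemo`, and the flat key lookup are hypothetical stand-ins, not the real classes; only the operator choice is the point. It assumes PHP 8.0+ (nullsafe operator, constructor promotion).

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for GenreConfig; the real class resolves dotted paths.
final class FakeGenreConfig
{
    /** @param array<string, mixed> $values */
    public function __construct(private array $values)
    {
    }

    /** @return string[] */
    public function getValueStringList(string $path): array
    {
        $value = $this->values[$path] ?? [];

        return is_array($value) ? array_values(array_filter($value, 'is_string')) : [];
    }

    public function getValueBool(string $path): ?bool
    {
        $value = $this->values[$path] ?? null;

        return is_bool($value) ? $value : null;
    }
}

// Hypothetical facade showing the two fallback operators side by side.
final class FallbackDemo
{
    /** @param array<string, mixed> $legacy */
    public function __construct(
        private array $legacy,
        private ?FakeGenreConfig $genreConfig = null,
    ) {
    }

    /** @return string[] */
    public function signals(): array
    {
        // An empty array is falsy, so ?: falls through to the legacy list.
        return ($this->genreConfig?->getValueStringList('signals') ?? [])
            ?: $this->legacy['signals'];
    }

    public function enabled(): bool
    {
        // ?? falls through only on null, so an explicit false from genre is kept.
        return $this->genreConfig?->getValueBool('enabled') ?? $this->legacy['enabled'];
    }
}
```

For lists and strings, `?:` is a reasonable choice because an empty genre value should mean "not migrated yet". For nullable scalars such as booleans, `??` is the safer operator, since `false ?: $fallback` would silently discard an explicit `false`.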


@@ -22,6 +22,7 @@ final class SearchRepairConfig
private readonly int $minPrimaryResultsWithoutRepair, private readonly int $minPrimaryResultsWithoutRepair,
private readonly array $config, private readonly array $config,
private readonly DomainVocabularyConfig $vocabulary, private readonly DomainVocabularyConfig $vocabulary,
private readonly ?GenreConfig $genreConfig = null,
) { ) {
} }
@@ -52,18 +53,24 @@ final class SearchRepairConfig
public function isDirectProductAttributeLookupRepairEnabled(): bool public function isDirectProductAttributeLookupRepairEnabled(): bool
{ {
return $this->requiredBool('direct_product_attribute_lookup.enabled'); return $this->genreBool('search_repair.direct_product_attribute_lookup.enabled')
?? $this->requiredBool('direct_product_attribute_lookup.enabled');
} }
public function getDirectProductAttributeLookupMinTokens(): int public function getDirectProductAttributeLookupMinTokens(): int
{ {
return $this->requiredPositiveInt('direct_product_attribute_lookup.min_query_tokens_after_cleanup'); $genreValue = $this->genreInt('search_repair.direct_product_attribute_lookup.min_query_tokens_after_cleanup');
return $genreValue !== null && $genreValue > 0
? $genreValue
: $this->requiredPositiveInt('direct_product_attribute_lookup.min_query_tokens_after_cleanup');
} }
/** @return string[] */ /** @return string[] */
public function getDirectProductAttributeLookupProductTypeTerms(): array public function getDirectProductAttributeLookupProductTypeTerms(): array
{ {
return $this->configOrVocabularyStringList( return $this->genreStringList('product_attributes.direct_attribute_cleanup.product_type_terms')
?: $this->configOrVocabularyStringList(
'direct_product_attribute_lookup.product_type_terms', 'direct_product_attribute_lookup.product_type_terms',
'search_repair.direct_product_type_terms' 'search_repair.direct_product_type_terms'
); );
@@ -72,7 +79,8 @@ final class SearchRepairConfig
/** @return string[] */ /** @return string[] */
public function getDirectProductAttributeLookupStopTerms(): array public function getDirectProductAttributeLookupStopTerms(): array
{ {
return $this->configOrVocabularyStringList( return $this->genreStringList('product_attributes.direct_attribute_cleanup.stop_terms')
?: $this->configOrVocabularyStringList(
'direct_product_attribute_lookup.stop_terms', 'direct_product_attribute_lookup.stop_terms',
'search_repair.direct_product_attribute_stop_terms' 'search_repair.direct_product_attribute_stop_terms'
); );
@@ -81,7 +89,8 @@ final class SearchRepairConfig
/** @return string[] */ /** @return string[] */
public function getDirectProductAttributeLookupComparativeConstraintPatterns(): array public function getDirectProductAttributeLookupComparativeConstraintPatterns(): array
{ {
return $this->requiredStringList('direct_product_attribute_lookup.comparative_constraint_patterns'); return $this->genreStringList('product_attributes.direct_attribute_cleanup.comparative_constraint_patterns')
?: $this->requiredStringList('direct_product_attribute_lookup.comparative_constraint_patterns');
} }
/** @return string[] */ /** @return string[] */
@@ -116,7 +125,8 @@ final class SearchRepairConfig
/** @return string[] */ /** @return string[] */
public function getSpecificModelCandidatePatterns(): array public function getSpecificModelCandidatePatterns(): array
{ {
return $this->requiredStringList('specific_model_candidate_patterns'); return $this->genreStringList('search_repair.candidate_patterns.specific_model_candidate_patterns')
?: $this->requiredStringList('specific_model_candidate_patterns');
} }
/** @return string[] */ /** @return string[] */
@@ -135,40 +145,46 @@ final class SearchRepairConfig
public function getModelCandidatePattern(): string public function getModelCandidatePattern(): string
{ {
return $this->requiredString('patterns.model_candidate'); return $this->genreString('search_repair.candidate_patterns.patterns.model_candidate')
?: $this->requiredString('patterns.model_candidate');
} }
public function getAccessoryCandidatePattern(): string public function getAccessoryCandidatePattern(): string
{ {
return $this->renderPatternTemplate( return $this->renderPatternTemplate(
'patterns.accessory_candidate_template', 'patterns.accessory_candidate_template',
['terms' => $this->patternAlternation($this->getAccessoryCandidateTerms())] ['terms' => $this->patternAlternation($this->getAccessoryCandidateTerms())],
'search_repair.candidate_patterns.patterns.accessory_candidate_template'
); );
} }
public function getRequestedAccessoryCodePattern(): string public function getRequestedAccessoryCodePattern(): string
{ {
return $this->requiredString('patterns.requested_accessory_code'); return $this->genreString('search_repair.candidate_patterns.patterns.requested_accessory_code')
?: $this->requiredString('patterns.requested_accessory_code');
} }
public function getAccessoryOrBundlePattern(): string public function getAccessoryOrBundlePattern(): string
{ {
return $this->renderPatternTemplate( return $this->renderPatternTemplate(
'patterns.accessory_or_bundle_template', 'patterns.accessory_or_bundle_template',
['terms' => $this->patternAlternation($this->getAccessoryOrBundleTerms())] ['terms' => $this->patternAlternation($this->getAccessoryOrBundleTerms())],
'search_repair.candidate_patterns.patterns.accessory_or_bundle_template'
); );
} }
public function getModelLikePattern(): string public function getModelLikePattern(): string
{ {
return $this->requiredString('patterns.model_like'); return $this->genreString('search_repair.candidate_patterns.patterns.model_like')
?: $this->requiredString('patterns.model_like');
} }
public function getSpecificityBoostPattern(): string public function getSpecificityBoostPattern(): string
{ {
return $this->renderPatternTemplate( return $this->renderPatternTemplate(
'patterns.specificity_boost_template', 'patterns.specificity_boost_template',
['terms' => $this->patternAlternation($this->getSpecificityBoostTerms())] ['terms' => $this->patternAlternation($this->getSpecificityBoostTerms())],
'search_repair.candidate_patterns.patterns.specificity_boost_template'
); );
} }
@@ -286,6 +302,27 @@ final class SearchRepairConfig
); );
} }
/** @return string[] */
private function genreStringList(string $path): array
{
return $this->genreConfig?->getValueStringList($path) ?? [];
}
private function genreString(string $path): string
{
return $this->genreConfig?->getValueString($path) ?? '';
}
private function genreBool(string $path): ?bool
{
return $this->genreConfig?->getValueBool($path);
}
private function genreInt(string $path): ?int
{
return $this->genreConfig?->getValueInt($path);
}
/** @return string[] */ /** @return string[] */
private function configOrVocabularyStringList(string $configKey, string $vocabularyPath): array private function configOrVocabularyStringList(string $configKey, string $vocabularyPath): array
{ {
@@ -305,9 +342,12 @@ final class SearchRepairConfig
} }
/** @param array<string, string> $variables */ /** @param array<string, string> $variables */
private function renderPatternTemplate(string $path, array $variables): string private function renderPatternTemplate(string $path, array $variables, ?string $genrePath = null): string
{ {
$template = $genrePath !== null ? $this->genreString($genrePath) : '';
if ($template === '') {
$template = $this->requiredString($path); $template = $this->requiredString($path);
}
foreach ($variables as $key => $value) { foreach ($variables as $key => $value) {
$template = str_replace('{' . $key . '}', $value, $template); $template = str_replace('{' . $key . '}', $value, $template);