# RetrieX Developer Policies

Status: binding after completion of the YAML-only migration through Patch 11.0a.

These policies protect the stable RetrieX configuration architecture. They are intentionally operational and must be followed by developers before a change is merged.

## 1. Source of truth

The functional configuration source of truth is YAML under `config/retriex/`.

This applies especially to:

- vocabulary, synonyms, stopwords and token lists
- intent rules and commerce routing
- search repair and query enrichment rules
- prompt texts, labels, output-priority rules and grounding rules
- agent messages, source labels, status messages and templates
- retrieval thresholds, token groups and scoring-related rule lists
- shop matching and commerce query parsing rules
- model and index defaults

PHP classes may read and validate these values, but must not silently redefine them.

## 2. No new PHP-only defaults

New configurable values must be added in YAML first.

Required pattern:

1. Add the value to the matching file in `config/retriex/`.
2. Wire the value through `config/services.yaml` when constructor injection is needed.
3. Expose/read it through the matching `src/Config/*Config.php` class.
4. Keep `mto:agent:config:validate` green.
5. Keep `mto:agent:config:audit-source --details` free of missing YAML mappings.

Not allowed:

- new constructor defaults that act as business or answer logic
- new hardcoded keyword lists in `src/`
- new PHP-only constants for semantic or product-specific behavior
- hidden fallbacks that change retrieval, prompt, shop or intent behavior when YAML is incomplete

## 3. Allowed technical constants

Technical constants are allowed only when they are not business, prompt, retrieval, product, intent or shop semantics.

Examples that may be acceptable:

- internal status strings
- command exit handling
- filesystem mode details
- non-semantic implementation identifiers

If a constant influences answer quality, matching, routing, scoring, prompt behavior or shop behavior, it belongs in YAML.

## 4. Fallback policy

Fallbacks are not a normal extension mechanism.

A fallback is only acceptable when all conditions are met:

- the value has a YAML path or explicit service-parameter mapping
- the fallback is documented as defensive infrastructure behavior
- the audit does not report it as missing YAML mapping
- the fallback cannot change semantic answer behavior in normal operation

If in doubt, move the value to YAML.

## 5. Required checks before merge

Every change touching `src/Config`, `config/retriex`, prompt, retrieval, intent, commerce, shop matching, SSE/job completion or answer grounding must run:

```bash
php bin/console cache:clear
php bin/console mto:agent:config:validate
php bin/console mto:agent:config:audit-source --details
php bin/console mto:agent:regression:test
```

Expected result:

- config validation: OK
- regression baseline: OK
- source audit: no missing YAML mappings
- no new undocumented PHP-only semantic constants
- no new constructor defaults without YAML/service-parameter mapping

## 6. Protected regression baseline

The following behavior must not regress:

- lowest water-hardness limit remains `0,02 deg dH` for Testomat 808
- follow-up indicator answer remains focused on indicator type 300
- accessory price follow-up for indicator type 300 returns the matching indicator products, not device prices
- history-based shop follow-up such as `suche im shop` keeps the relevant product context
- advisory product questions may use shop/catalog fallback when RAG knowledge is insufficient
- SSE/job completion must close loader/think states reliably, including reconnect/watchdog cases

Offline checks are covered by `mto:agent:regression:test`.
End-to-end behavior still needs manual or integration verification when the touched code path is not covered offline.

## 7. Strict YAML validation

Strict YAML validation remains intentionally deferred.

Until a later patch explicitly enables it, developers must enforce these policies through:

- code review
- `mto:agent:config:validate`
- `mto:agent:config:audit-source --details`
- `mto:agent:regression:test`

Strict mode must remain configurable and disabled by default when it is introduced later.

## 8. Pull request checklist

Use this checklist for every relevant PR:

- [ ] All new configurable behavior is in `config/retriex/*.yaml`.
- [ ] No new semantic keyword/token/prompt list was added directly to PHP.
- [ ] No new constructor default was added without YAML/service-parameter mapping.
- [ ] `mto:agent:config:validate` is OK.
- [ ] `mto:agent:config:audit-source --details` has no missing YAML mappings.
- [ ] `mto:agent:regression:test` is OK.
- [ ] The protected functional flows were manually checked if the touched area can affect them.
- [ ] README or patch README documents the reason for any intentionally accepted technical fallback.

## 9. Language cleanup ownership

Generic language cleanup must use `config/retriex/language.yaml` and its cleanup profiles.

Rules:

- add generic German stopwords to `stopword_groups`, not to domain YAML files
- add user wording such as `ich suche`, `zeige mir` or `habt ihr` to `phrase_groups`
- add table/list/overview wording to `meta_term_groups`
- keep commerce intent, product-role, measurement and routing terms in their owning domain YAML
- never remove protected terms such as `nicht`, `kein`, `testomat`, `indikator`, `ph`, `rx`, `th`, `tc` or `0,02` through generic cleanup
- prefer `cleanup_profile: ...` references over copied token lists

See `RETRIEX_LANGUAGE_CLEANUP_GUIDE.md` for the detailed ownership rules.
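
The ownership rules in section 9 can be pictured with a small sketch of how `config/retriex/language.yaml` might be laid out. Only the key names `stopword_groups`, `phrase_groups`, `meta_term_groups` and `cleanup_profile`, the example phrases and the protected terms come from this policy. The nesting, the group and profile names, the `cleanup_profiles`, `protected_terms` and `query_cleanup` keys, and the sample stopwords and meta terms are assumptions made for illustration, not the actual schema. The guide named above remains the authoritative reference.

```yaml
# config/retriex/language.yaml
# Hypothetical layout for illustration only; not the real schema.
stopword_groups:
  german_generic:          # hypothetical group name
    - der
    - die
    - und

phrase_groups:
  user_request:            # hypothetical group name
    - ich suche
    - zeige mir
    - habt ihr

meta_term_groups:
  overview:                # hypothetical group name
    - tabelle
    - liste
    - übersicht

protected_terms:           # hypothetical key; terms taken from the rules above
  - nicht
  - kein
  - testomat
  - indikator
  - ph
  - rx
  - th
  - tc
  - "0,02"                 # quoted so it stays a string

cleanup_profiles:          # hypothetical key
  default_query:           # hypothetical profile name
    stopword_groups: [german_generic]
    phrase_groups: [user_request]
    meta_term_groups: [overview]
---
# In an owning domain YAML file (file and parent key are hypothetical):
# reference the shared profile instead of copying its token lists.
query_cleanup:
  cleanup_profile: default_query
```

Under these assumptions, a domain file carries a single profile reference and keeps only its own commerce, product-role, measurement and routing terms, so generic stopwords and phrases have one owner.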