RetrieX Developer Policies
Status: binding after completion of the YAML-only migration through Patch 11.0a.
These policies protect the stable RetrieX configuration architecture. They are intentionally operational and must be followed by developers before a change is merged.
1. Source of truth
The functional configuration source of truth is YAML under config/retriex/.
This applies especially to:
- vocabulary, synonyms, stopwords and token lists
- intent rules and commerce routing
- search repair and query enrichment rules
- prompt texts, labels, output-priority rules and grounding rules
- agent messages, source labels, status messages and templates
- retrieval thresholds, token groups and scoring-related rule lists
- shop matching and commerce query parsing rules
- model and index defaults
PHP classes may read and validate these values, but must not silently redefine them.
2. No new PHP-only defaults
New configurable values must be added in YAML first.
Required pattern:
- Add the value to the matching file in config/retriex/.
- Wire the value through config/services.yaml when constructor injection is needed.
- Expose/read it through the matching src/Config/*Config.php class.
- Keep mto:agent:config:validate green.
- Keep mto:agent:config:audit-source --details free of missing YAML mappings.
Not allowed:
- new constructor defaults that act as business or answer logic
- new hardcoded keyword lists in src/
- new PHP-only constants for semantic or product-specific behavior
- hidden fallbacks that change retrieval, prompt, shop or intent behavior when YAML is incomplete
3. Allowed technical constants
Technical constants are allowed only when they are not business, prompt, retrieval, product, intent or shop semantics.
Examples that may be acceptable:
- internal status strings
- command exit handling
- filesystem mode details
- non-semantic implementation identifiers
If a constant influences answer quality, matching, routing, scoring, prompt behavior or shop behavior, it belongs in YAML.
4. Fallback policy
Fallbacks are not a normal extension mechanism.
A fallback is only acceptable when all conditions are met:
- the value has a YAML path or explicit service-parameter mapping
- the fallback is documented as defensive infrastructure behavior
- the audit does not report it as missing YAML mapping
- the fallback cannot change semantic answer behavior in normal operation
If in doubt, move the value to YAML.
5. Required checks before merge
Every change touching src/Config, config/retriex, prompt, retrieval, intent, commerce, shop matching, SSE/job completion or answer grounding must run:
php bin/console cache:clear
php bin/console mto:agent:config:validate
php bin/console mto:agent:config:audit-source --details
php bin/console mto:agent:regression:test
Expected result:
- config validation: OK
- regression baseline: OK
- source audit: no missing YAML mappings
- no new undocumented PHP-only semantic constants
- no new constructor defaults without YAML/service-parameter mapping
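The four required commands can be chained so that the first failure stops the run. The following is a minimal sketch, not part of the RetrieX tooling: the function name run_merge_checks and the CONSOLE variable are hypothetical, and it assumes each console command exits non-zero on failure, which this policy does not state.

```shell
#!/bin/sh
# Hypothetical pre-merge helper (illustration only, not shipped with RetrieX).
# CONSOLE is an assumed override hook; it defaults to the entry point used above.
CONSOLE="${CONSOLE:-php bin/console}"

run_merge_checks() {
    # Run the required checks in the documented order.
    for cmd in \
        "cache:clear" \
        "mto:agent:config:validate" \
        "mto:agent:config:audit-source --details" \
        "mto:agent:regression:test"
    do
        echo "==> $cmd"
        # $CONSOLE and $cmd are left unquoted on purpose so they split into words.
        # Assumption: a failed check is signalled by a non-zero exit status.
        $CONSOLE $cmd || { echo "FAILED: $cmd" >&2; return 1; }
    done
    echo "All required checks passed."
}
```

Sourcing this file from the project root and calling run_merge_checks runs the sequence. A passing run does not replace reading the audit output: the "no missing YAML mappings" result still has to be confirmed in the --details report.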
6. Protected regression baseline
The following behavior must not regress:
- lowest water-hardness limit remains 0,02 deg dH for Testomat 808
- follow-up indicator answer remains focused on indicator type 300
- accessory price follow-up for indicator type 300 returns the matching indicator products, not device prices
- history-based shop follow-up such as "suche im shop" keeps the relevant product context
- advisory product questions may use shop/catalog fallback when RAG knowledge is insufficient
- SSE/job completion must close loader/think states reliably, including reconnect/watchdog cases
Offline checks are covered by mto:agent:regression:test. End-to-end behavior still needs manual or integration verification when the touched code path is not covered offline.
7. Strict YAML validation
Strict YAML validation remains intentionally deferred.
Until a later patch explicitly enables it, developers must enforce these policies through:
- code review
- mto:agent:config:validate
- mto:agent:config:audit-source --details
- mto:agent:regression:test
Strict mode must remain configurable and disabled by default when it is introduced later.
8. Pull request checklist
Use this checklist for every relevant PR:
- All new configurable behavior is in config/retriex/*.yaml.
- No new semantic keyword/token/prompt list was added directly to PHP.
- No new constructor default was added without YAML/service-parameter mapping.
- mto:agent:config:validate is OK.
- mto:agent:config:audit-source --details has no missing YAML mappings.
- mto:agent:regression:test is OK.
- The protected functional flows were manually checked if the touched area can affect them.
- README or patch README documents the reason for any intentionally accepted technical fallback.
9. Language cleanup ownership
Generic language cleanup must use config/retriex/language.yaml and its cleanup profiles.
Rules:
- add generic German stopwords to stopword_groups, not to domain YAML files
- add user wording such as "ich suche", "zeige mir" or "habt ihr" to phrase_groups
- add table/list/overview wording to meta_term_groups
- keep commerce intent, product-role, measurement and routing terms in their owning domain YAML
- never remove protected terms such as nicht, kein, testomat, indikator, ph, rx, th, tc or 0,02 through generic cleanup
- prefer cleanup_profile: ... references over copied token lists
See RETRIEX_LANGUAGE_CLEANUP_GUIDE.md for the detailed ownership rules.
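The protected-term rule can be spot-checked before review. The sketch below assumes a hypothetical flat file with one generic cleanup token per line (for example, a manual export of a stopword group). It does not parse config/retriex/language.yaml, and the function name check_protected_terms is illustrative only. The term list is copied from the rule above, which is introduced with "such as" and may not be exhaustive.

```shell
#!/bin/sh
# Hypothetical spot check (illustration only, not part of RetrieX).
# Protected terms as listed in the rule above; the real list may be longer.
PROTECTED_TERMS="nicht kein testomat indikator ph rx th tc 0,02"

check_protected_terms() {
    # $1: assumed flat token file, one cleanup token per line.
    status=0
    for term in $PROTECTED_TERMS; do
        # -F fixed string, -x whole line, -i ignore case, -q quiet.
        if grep -Fxiq -- "$term" "$1"; then
            echo "protected term in generic cleanup list: $term" >&2
            status=1
        fi
    done
    return $status
}
```

Calling check_protected_terms tokens.txt returns a non-zero status and names each offending term on stderr. Whole-line matching keeps a short term such as th from flagging longer tokens that merely contain it.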