# RetrieX Developer Policies

Status: binding after completion of the YAML-only migration through Patch 11.0a.

These policies protect the stable RetrieX configuration architecture. They are intentionally operational: developers must follow them before a change is merged.

## 1. Source of truth

The functional configuration source of truth is YAML under `config/retriex/`.

This applies especially to:

- vocabulary, synonyms, stopwords and token lists
- intent rules and commerce routing
- search repair and query enrichment rules
- prompt texts, labels, output-priority rules and grounding rules
- agent messages, source labels, status messages and templates
- retrieval thresholds, token groups and scoring-related rule lists
- shop matching and commerce query parsing rules
- model and index defaults

PHP classes may read and validate these values, but must not silently redefine them.

## 2. No new PHP-only defaults

New configurable values must be added in YAML first.

Required pattern:

1. Add the value to the matching file in `config/retriex/`.
2. Wire the value through `config/services.yaml` when constructor injection is needed.
3. Expose/read it through the matching `src/Config/*Config.php` class.
4. Keep `mto:agent:config:validate` green.
5. Keep `mto:agent:config:audit-source --details` free of missing YAML mappings.

Not allowed:

- new constructor defaults that act as business or answer logic
- new hardcoded keyword lists in `src/`
- new PHP-only constants for semantic or product-specific behavior
- hidden fallbacks that change retrieval, prompt, shop or intent behavior when YAML is incomplete
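
As a review aid for the prohibited patterns above, a reviewer might scan for PHP array constants before reading the audit output. This is only a heuristic sketch: the function name `find_php_list_candidates` and the regular expression are assumptions and not part of RetrieX, and the scan does not replace `mto:agent:config:audit-source --details`. It relies on the `--include` option available in GNU and BSD grep.

```bash
# Hypothetical helper (not part of RetrieX): list PHP constants that are
# initialised with an array literal, so a reviewer can check each one by hand.
find_php_list_candidates() {
  local dir="${1:-src}"
  # -r: recurse, -n: print line numbers, -E: extended regular expressions.
  # Only *.php files are searched; typed constants are not matched by this pattern.
  grep -rnE --include='*.php' \
    "const[[:space:]]+[A-Za-z0-9_]+[[:space:]]*=[[:space:]]*\[" "$dir"
}
```

Called as `find_php_list_candidates src` from the project root, it prints `file:line:match` entries. A hit is not automatically a violation, because technical constants can be legitimate; each hit still needs a human decision.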

## 3. Allowed technical constants

Technical constants are allowed only when they are not business, prompt, retrieval, product, intent or shop semantics.

Examples that may be acceptable:

- internal status strings
- command exit handling
- filesystem mode details
- non-semantic implementation identifiers

If a constant influences answer quality, matching, routing, scoring, prompt behavior or shop behavior, it belongs in YAML.

## 4. Fallback policy

Fallbacks are not a normal extension mechanism.

A fallback is only acceptable when all conditions are met:

- the value has a YAML path or explicit service-parameter mapping
- the fallback is documented as defensive infrastructure behavior
- the audit does not report it as missing YAML mapping
- the fallback cannot change semantic answer behavior in normal operation

If in doubt, move the value to YAML.

## 5. Required checks before merge

For every change touching `src/Config`, `config/retriex`, prompt, retrieval, intent, commerce, shop matching, SSE/job completion or answer grounding, run:

```bash
php bin/console cache:clear
php bin/console mto:agent:config:validate
php bin/console mto:agent:config:audit-source --details
php bin/console mto:agent:regression:test
```

Expected result:

- config validation: OK
- regression baseline: OK
- source audit: no missing YAML mappings
- no new undocumented PHP-only semantic constants
- no new constructor defaults without YAML/service-parameter mapping
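
The check sequence above can be wrapped in a small function that stops at the first failing command. This is a minimal sketch, not an official project script: the function name `run_retriex_checks` and the overridable `CONSOLE` variable are assumptions, and it presumes that each command signals failure through a non-zero exit code. The audit output still has to be read for missing YAML mappings.

```bash
# Assumption: CONSOLE may be overridden; it defaults to the invocation used in this document.
CONSOLE="${CONSOLE:-php bin/console}"

# Hypothetical wrapper: run the required checks in order, stop at the first failure.
run_retriex_checks() {
  local args
  for args in \
    "cache:clear" \
    "mto:agent:config:validate" \
    "mto:agent:config:audit-source --details" \
    "mto:agent:regression:test"
  do
    echo "==> ${CONSOLE} ${args}"
    # Word splitting is intentional here so that "--details" becomes its own argument.
    ${CONSOLE} ${args} || { echo "FAILED: ${args}" >&2; return 1; }
  done
  echo "All required checks passed."
}
```

After sourcing the snippet, call `run_retriex_checks` from the project root. A zero exit status only means that every command exited successfully; the semantic expectations listed above still need review.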

## 6. Protected regression baseline

The following behavior must not regress:

- lowest water-hardness limit remains `0,02 deg dH` for Testomat 808
- follow-up indicator answer remains focused on indicator type 300
- accessory price follow-up for indicator type 300 returns the matching indicator products, not device prices
- history-based shop follow-up such as `suche im shop` keeps the relevant product context
- advisory product questions may use shop/catalog fallback when RAG knowledge is insufficient
- SSE/job completion must close loader/think states reliably, including reconnect/watchdog cases

Offline checks are covered by `mto:agent:regression:test`. End-to-end behavior still needs manual or integration verification when the touched code path is not covered offline.

## 7. Strict YAML validation

Strict YAML validation remains intentionally deferred.

Until a later patch explicitly enables it, developers must enforce these policies through:

- code review
- `mto:agent:config:validate`
- `mto:agent:config:audit-source --details`
- `mto:agent:regression:test`

When strict mode is introduced later, it must be configurable and disabled by default.

## 8. Pull request checklist

Use this checklist for every relevant PR:

- [ ] All new configurable behavior is in `config/retriex/*.yaml`.
- [ ] No new semantic keyword/token/prompt list was added directly to PHP.
- [ ] No new constructor default was added without YAML/service-parameter mapping.
- [ ] `mto:agent:config:validate` is OK.
- [ ] `mto:agent:config:audit-source --details` has no missing YAML mappings.
- [ ] `mto:agent:regression:test` is OK.
- [ ] The protected functional flows were manually checked if the touched area can affect them.
- [ ] README or patch README documents the reason for any intentionally accepted technical fallback.

## 9. Language cleanup ownership

Generic language cleanup must use `config/retriex/language.yaml` and its cleanup profiles.

Rules:

- add generic German stopwords to `stopword_groups`, not to domain YAML files
- add user wording such as `ich suche`, `zeige mir` or `habt ihr` to `phrase_groups`
- add table/list/overview wording to `meta_term_groups`
- keep commerce intent, product-role, measurement and routing terms in their owning domain YAML
- never remove protected terms such as `nicht`, `kein`, `testomat`, `indikator`, `ph`, `rx`, `th`, `tc` or `0,02` through generic cleanup
- prefer `cleanup_profile: ...` references over copied token lists

See `RETRIEX_LANGUAGE_CLEANUP_GUIDE.md` for the detailed ownership rules.