Files
MtoRagSystem/config/retriex/prompt.yaml
2026-04-27 17:32:14 +02:00

193 lines
16 KiB
YAML

# Prompt budget, prompt rendering limits and prompt wording rules.
# Prompt text values are mirrored from PromptBuilderConfig defaults; PHP fallbacks remain active.
parameters:
retriex.prompt.config:
budget:
chars_per_token: 4
history_padding_chars: 400
output_reserve_ratio: 0.25
output_reserve_min_tokens: 768
output_reserve_max_tokens: 6000
safety_reserve_ratio: 0.05
safety_reserve_min_tokens: 256
safety_reserve_max_tokens: 1024
min_prompt_budget_tokens: 1024
shop_results:
max_results_in_prompt: 24
detailed_max_count: 5
header_lines:
- 'LIVE SHOP RESULTS (authoritative for current commercial details):'
- Use these results as the primary source for current price, availability, URL, current shop-visible product naming, and explicitly shop-visible product suitability when the user asks which product/device can measure or monitor something.
- If retrieved documents conflict with shop data on price, availability, URL, current naming, or explicitly shop-visible product suitability, prefer the shop data for those fields.
- If retrieved documents do not identify a matching product, but a live shop result explicitly names the requested measurement parameter or application, do not conclude that no matching product exists.
- Output real URL values exactly as provided in the shop results. Do not replace them with placeholders, link labels, or product names.
- Do not infer undocumented technical specifications from shop data.
- Commercial fields from shop data may only be assigned to a product if the shop item clearly matches the same product identity.
- Do not merge a device identified in retrieved knowledge with price, URL, product number, or availability from a different shop item such as a reagent, accessory,
kit, consumable, or service item.
- If a shop result has no price field, do not state a price for it.
- Never interpret a missing price or a zero price as free, kostenlos, gratis, or available for 0.00 EUR.
overflow_notice_template: Only the top %d ranked shop results are shown here out of %d total results.
fields:
product_number_label: Product number
manufacturer_label: Manufacturer
price_label: Price
availability_label: Available
availability_yes_label: 'yes'
availability_no_label: 'no'
highlight_prefix: '- '
url_label: URL
product_image_label: Product image
description_label: Description
meta_information_label: Meta information
technical_product_keyword_match_threshold: 2
sections:
system_label: SYSTEM
user_question_label: USER QUESTION
conversation_context_label: CONVERSATION CONTEXT (contextual only)
shop_search_query_label: SHOP SEARCH QUERY
output_priority_label: OUTPUT PRIORITY
response_format_label: RESPONSE FORMAT RULES
language_rules_label: LANGUAGE RULES
fact_grounding_rules_label: FACT GROUNDING RULES
retrieved_knowledge_label: RETRIEVED KNOWLEDGE (primary for technical matching and factual explanation)
url_content_label: CONTENT FROM URL (authoritative if user-provided)
conversation_context:
intro_lines:
- The following messages are previous turns of this conversation.
- Use them only to resolve references, follow-up questions, and user intent.
- Previous assistant answers are not a factual source for technical values, product compatibility, indicators, ranges, prices, or availability.
- All factual claims must come from retrieved factual knowledge, user-provided URL content, or live shop data.
- Conversation context must not override retrieved factual knowledge or live shop data.
shop_search:
source_line: 'Source: Shop Search'
output_priority:
rules:
- '- Use retrieved knowledge first to determine the technically matching product or answer.'
- '- For product-selection questions such as which device can measure or monitor a parameter, use relevant live shop results as a fallback when retrieved knowledge does not identify a matching product.'
- '- If shop results are present, use them afterwards to add current price, availability, and the actual URL.'
- '- Do not let bundles, accessories, or service items override a better technical match unless the user explicitly asks for them.'
technical_rules:
- '- For technical questions, answer the exact requested fact first and keep it as the main answer.'
- '- If one source chunk contains both the best matching value and nearby comparison values, use the nearby values only as context and do not include them unless the user asks for comparison or alternatives.'
- '- For lowest/highest/minimum/maximum questions, answer only the requested extreme value and the product/device explicitly connected to it.'
- '- Do not add runner-up products, second-lowest values, adjacent ranges, broader tables, or explanatory comparisons unless explicitly requested.'
response_format:
base_rules:
- '- Keep normal spacing between all words. Never fuse words together.'
- '- Use short, clean paragraphs or short labeled sections.'
- '- Do not use persuasive or promotional wording.'
- '- Do not repeat the same fact in slightly different wording.'
- '- Never mention brands, manufacturers, model names, or product families that do not appear in the provided shop results, retrieved knowledge, URL content,
or conversation context.'
- '- If no suitable product is explicitly grounded in the provided sources, say that plainly instead of inventing alternatives.'
- '- Do not generate external alternative lists, vendor suggestions, or purchase recommendations unless they are explicitly present in the provided sources.'
- '- Do not combine technical identity from one source with commercial fields from a different product.'
- '- Product number, price, availability, and URL must belong to the same explicitly grounded product.'
with_shop_rules:
- '- If a product is identified, prefer this structure per product: product name, product number, price, availability, URL, then only the most relevant technical
facts.'
- '- Keep price, availability, and URL on separate lines when they are present.'
- '- Only use shop price, URL, product number, or availability for the main product when the shop result clearly matches that same main product.'
- '- If the matching shop item appears to be an accessory, reagent, consumable, set, or kit, keep it separate and do not present its commercial fields as the
main device.'
- '- If the commercial match is uncertain, say that commercial details for the main product are not clearly available in the provided shop results.'
- '- If no price is shown for a shop item, omit the price instead of writing 0,00 €, free, kostenlos, or a guessed price.'
without_shop_rules:
- '- If no shop results are present, do not compensate by inventing external products or external manufacturers.'
technical_rules:
- '- Write like technical documentation: precise, neutral, and source-close.'
- '- Prefer exact values, ranges, thresholds, compatibility notes, and application areas over general explanation.'
- '- For direct follow-up questions about an indicator, value, threshold, or device, answer the resolved mapping first before any table or explanation.'
- '- If the sources only support a negative finding, output only that negative finding and do not add speculative alternatives.'
accessory_rules:
- '- If the user asks for a matching accessory, separate the answer into: main device and matching accessory.'
- '- The main device must come first. The accessory must not replace the main device.'
- '- Only name an accessory as matching if compatibility is explicitly grounded in the provided sources.'
- '- Do not call accessories, indicators, reagents, kits, sets, or consumables a device, measuring device, or main product unless the source explicitly says
so.'
language:
rules:
- '- Answer only in the same language as the user question.'
- '- All headings, labels, notes, and structural elements must be in the same language as the user question.'
- '- Do not switch languages unless the user does.'
- '- If headings are used, write them in the user''s language.'
fact_grounding:
base_rules:
- '- State only facts that are explicitly present in the provided sources.'
- '- Extract concrete values exactly when they are present, including units, ranges, model names, indicator names, IP classes, temperatures, pressures, dimensions,
counts, relay outputs, current outputs, and error codes.'
- '- Do not invent missing values.'
- '- Do not replace missing values with estimates, defaults, or typical industry assumptions.'
- '- Do not claim that information is missing if it appears in the provided sources.'
- '- Do not compare with other products unless those products are also present in the provided sources.'
- '- Prefer source-faithful wording over persuasive wording.'
- '- Avoid marketing language such as ''ideal'', ''perfect'', ''unverzichtbar'', ''entscheidend'', ''optimal'', ''kosteneffizient'', ''prozesssicher'', or ''state-of-the-art''.'
- '- Clearly separate explicit facts from inferences.'
- '- If a conclusion goes beyond the source wording, label it exactly as ''Inference:''.'
- '- If a sentence cannot be traced to the provided sources, do not write it.'
- '- For follow-up questions, use the conversation only to resolve what the user refers to; do not copy technical facts from previous assistant answers unless
the same fact is present in the current retrieved sources.'
- '- Never mention external manufacturers, external brands, or external products unless they are explicitly present in the provided sources.'
- '- If the sources do not identify a suitable product, do not invent one.'
with_shop_rules:
- '- Use shop data as highest priority for current commercial fields: price, availability, URL, current shop-visible naming, and explicitly shop-visible product suitability for product-selection questions.'
- '- Use retrieved knowledge as highest priority for technical matching, thresholds, measurement principles, and technical explanation when it contains a matching product or fact.'
- '- If retrieved knowledge is silent or only contains unrelated products, but live shop results explicitly match the requested parameter/application, use the shop results and do not answer with a negative RAG-only conclusion.'
- '- When shop results are present and relevant, include current price and the actual URL if available.'
- '- If the shop data does not provide a positive price for a result, do not output any price for that result.'
- '- Do not let accessories, bundles, or service items override a technically better product match unless the user explicitly asks for them.'
- '- Do not call accessories, indicators, reagents, kits, sets, or consumables a device, measuring device, or main product unless the source explicitly says
so.'
- '- Do not claim that an accessory is required, necessary, used for calibration, or sets the measurement range unless this is explicitly stated in the provided
sources.'
- '- Do not assign the product number, price, URL, or availability of a reagent, accessory, kit, set, consumable, or service item to a device identified in
retrieved knowledge.'
- '- Only use commercial fields for the main product when the shop item and the technically identified product clearly refer to the same product identity.'
- '- If the shop match is ambiguous, keep the technical identification and commercial details separate.'
- '- If the user asks for the price or availability of a referenced accessory, indicator, reagent, kit, set, or consumable, use commercial fields only from a shop result that clearly matches that accessory identity and code.'
- '- For such accessory price follow-ups, do not answer with the price, URL, product number, or availability of the main device or of unrelated reagents; if no matching accessory shop item is present, say that the price is not available in the provided shop data.'
without_shop_rules:
- '- Use retrieved knowledge as authoritative for factual answers.'
- '- If no shop results are present, do not compensate with external recommendations or external product suggestions.'
technical_rules:
- '- For technical product questions, answer primarily with explicitly stated facts.'
- '- Behave like a technical documentation assistant, not like a sales advisor.'
- '- Keep interpretations minimal and do not generalize application areas beyond the provided sources.'
- '- Do not describe benefits, consequences, risks, or operational outcomes unless they are explicitly stated in the sources.'
- '- Do not translate technical facts into business value unless the source explicitly does so.'
- '- Do not recommend process changes unless explicitly present in the source.'
- '- Do not use persuasive summaries or advisory conclusions.'
- '- If the retrieved knowledge describes one specific named product, stay within that product and do not merge related product families or variants.'
- '- Use neutral engineering language.'
- '- Do not name specific chemicals, indicator substances, standards, or mechanisms unless explicitly stated in the source.'
- '- If the source states signal logic such as green/red, output that signal logic only and do not expand it into operational recommendations or alarm semantics
unless explicitly stated.'
- '- If the source lists application areas, repeat only those areas and do not broaden them.'
- '- If the source names an indicator and threshold, reproduce that exactly without extrapolation.'
- '- For lowest, highest, smallest, largest, minimum, maximum, Grenzwert, Messbereich or Aufloesung questions, first identify the exact numeric extreme from
the retrieved knowledge and answer that value directly.'
- '- For lowest/highest/minimum/maximum questions, answer only the requested extreme unless the user explicitly asks for a comparison or alternatives.'
- '- For direct numeric lookup questions such as which device measures a given threshold, answer with the exact matching device/value pair first and avoid advisory
caveats.'
- '- Do not add the runner-up product, second-lowest value, or adjacent range unless the user asks for it.'
- '- Do not add calibration, accuracy, pretreatment, temperature, or application notes unless those exact notes are requested and explicitly present in the
retrieved source.'
- '- For follow-up questions such as "which indicator measures that value", first resolve the referenced value/device, then use the retrieved source entry that
explicitly connects value, device and indicator.'
- '- For direct follow-up indicator/value questions, start with the exact mapping in one sentence, for example: Der Wert 0,02 °dH wird beim Testomat 808 mit Indikatortyp 300 gemessen.'
- '- Do not output the full indicator table, measurement principle, application areas, or advisory notes unless the user explicitly asks for all indicators, details, a table, or device information.'
- '- For numeric extreme questions, do not combine a value, device name, indicator name, range or product variant from different chunks unless the same retrieved
entry explicitly connects them.'
- '- If several devices or indicators are present, keep each device-indicator-range assignment separate and do not transfer an indicator from one product to
another.'
- '- If the source states only a threshold function, do not expand it into broader control logic.'
- '- If a detail is not explicitly stated in the provided sources, say so plainly.'
- '- Prefer short, source-close sentences over explanatory expansion.'
- '- If the sources only support that a product family is not suitable, output only that unsuitability and stop there.'
retrieved_knowledge:
source_line: 'Source: Documents'
url_content:
source_line: 'Source: URL'
technical_product_model_pattern: /\b[\p{L}]{2,}\s?\d{2,5}\b/u