MtoRagSystem/RETRIEX_SSE_JOB_HARDENING_FIX_README.md
2026-04-26 13:09:01 +02:00

RetrieX SSE Job Hardening Fix

Patch-only fix for the browser streaming job lifecycle.

Problem

The /ask-sse/{jobId} endpoint deleted the stream job as soon as the first EventSource connection opened. If the browser, Wi-Fi, router, proxy, or PHP/Nginx connection dropped briefly, EventSource reconnected with the same job id. By then the job file was already gone, so the user saw:

Der Antwort-Job ist abgelaufen oder wurde nicht gefunden. Bitte sende die Anfrage erneut.
(English: "The answer job has expired or was not found. Please send the request again.")

This made normal network interruptions look like an expired job.

Change

src/Controller/AskSseController.php now keeps the job file for the configured TTL and uses explicit job states:

  • pending
  • running
  • completed
  • interrupted
  • failed

The stream endpoint atomically claims a pending job under a file lock instead of deleting it immediately. Reconnects or duplicate opens no longer see a missing job; they receive a more accurate message depending on the stored state.
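
The claim step can be pictured with the minimal sketch below. It is not the code from AskSseController.php: the function name claimJob, the JSON layout with a "state" key, and the choice to treat an unreadable file as failed are all assumptions made for illustration. It only shows how an exclusive flock() lets exactly one request flip a job from pending to running while later requests read the stored state instead of finding no file.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not from the patch): try to claim a job file.
 * Returns the state found BEFORE the claim attempt, or null if the file is missing.
 * A return value of 'pending' means this caller won the claim.
 */
function claimJob(string $path): ?string
{
    // 'r+' opens for read/write and fails if the file does not exist.
    $handle = @fopen($path, 'r+');
    if ($handle === false) {
        return null;
    }

    try {
        // Exclusive lock: a second opener blocks here until the first is done.
        if (!flock($handle, LOCK_EX)) {
            throw new RuntimeException('Could not lock job file');
        }

        $raw = stream_get_contents($handle);
        $job = json_decode($raw === false ? '' : $raw, true);

        // Assumption: an unreadable or malformed job file is reported as failed.
        $state = (is_array($job) && isset($job['state'])) ? (string) $job['state'] : 'failed';

        if ($state === 'pending') {
            // Rewrite the file in place while still holding the lock.
            $job['state'] = 'running';
            rewind($handle);
            ftruncate($handle, 0);
            fwrite($handle, (string) json_encode($job));
            fflush($handle);
        }

        return $state;
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}
```

With this shape, the first request sees pending and starts streaming. A reconnect or duplicate open sees running, completed, interrupted, or failed and can answer with a message that matches that state.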

Runtime behavior

  • A new job is created as pending.
  • The first /ask-sse/{jobId} request claims it as running.
  • Successful completion marks it as completed.
  • A browser or client connection abort marks it as interrupted.
  • Stream exceptions or fatal shutdown errors mark it as failed.
  • Old job files are still cleaned up once they are older than JOB_TTL_SECONDS.
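
The list above can be read as a small state machine plus an age check. The sketch below is one interpretation, not the patch's code: JOB_TRANSITIONS, canTransition and isExpired are hypothetical names, the TTL is passed in as a parameter because this README does not say how JOB_TTL_SECONDS is read, and the table assumes that failures are only recorded for jobs that are already running.

```php
<?php
declare(strict_types=1);

// Allowed transitions as read from the runtime behavior list.
// completed, interrupted and failed are treated as terminal (assumption).
const JOB_TRANSITIONS = [
    'pending'     => ['running'],
    'running'     => ['completed', 'interrupted', 'failed'],
    'completed'   => [],
    'interrupted' => [],
    'failed'      => [],
];

/** Hypothetical check: may a job move from $from to $to? Unknown states never may. */
function canTransition(string $from, string $to): bool
{
    return in_array($to, JOB_TRANSITIONS[$from] ?? [], true);
}

/**
 * Hypothetical TTL check: a job file is eligible for cleanup once it is
 * older than the TTL, regardless of its state.
 */
function isExpired(int $createdAt, int $now, int $ttlSeconds): bool
{
    return ($now - $createdAt) > $ttlSeconds;
}
```

Keeping the terminal states without successors is what makes a late reconnect harmless: it can read the final state but cannot restart the stream.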

Safety

This patch does not change Retrieval, PromptBuilder, AgentRunner, Shopware, Intent, Vocabulary, scoring or RAG behavior. It only hardens the SSE job lifecycle and improves user-facing error messages for reconnect/network cases.

After applying

Run:

php bin/console cache:clear
php bin/console mto:agent:config:validate
php bin/console mto:agent:regression:test

Then test in the browser with a normal prompt and, if possible, simulate a short network interruption during streaming.