harden code and add messenger services and new README.md and SYSTEM.md

team 1
2026-02-15 14:36:04 +01:00
parent 993531b268
commit 5b100039e0
8 changed files with 865 additions and 215 deletions

604
README.md

@@ -1,250 +1,430 @@
# mitho AI Agent (Alpha Version)
**Symfony-based hybrid RAG system with vector and keyword retrieval**
# mitho AI Agent Developer Deep Dive
Enterprise Hybrid RAG System (Symfony + NDJSON + FAISS)
---
## Overview
# 1. System Overview
The **mitho AI Agent** is a production-ready, Symfony-based RAG (Retrieval Augmented Generation) system that does not let the AI "guess" freely but generates answers strictly from a controlled knowledge pool.
This system implements a deterministic, governance-stable Retrieval Augmented Generation (RAG) architecture based on:
> **Guiding principle:**
> *"We do not use AI to guess creatively, but to answer reliably based on your knowledge."*
- Symfony (PHP backend)
- NDJSON-based knowledge index
- Full FAISS vector rebuild strategy
- Hybrid retrieval (keyword + vector)
- Deterministic ingest pipeline
- Strict versioning & guardrails
- Lock-based reindex protection
The system combines:
- Large Language Model (LLM, e.g. Qwen via Ollama)
- Keyword-based retrieval
- FAISS vector search
- Versioned knowledge structure (chunks + index)
- Streaming output via Server-Sent Events (SSE)
- Persistent chat history per client
No incremental vector mutation is allowed.
FAISS is always rebuilt from `index.ndjson`.
---
# Architecture
# 2. High-Level Architecture
## 1. Backend
User Query
→ Hybrid Retrieval
→ Context Assembly
→ Prompt Builder
→ LLM
→ Streaming Response (SSE)
**Technology**
- PHP 8.2+
- Symfony 7.4
- Monolog Logging
- Symfony Cache
- Session Support
### Core Components
| Component | Responsibility |
|------------|----------|
| `AgentRunner` | Orchestrates prompt, context & LLM |
| `PromptBuilder` | Builds system and user prompts |
| `ContextService` | History management |
| `ChunkKeywordRetriever` | Keyword scoring |
| `VectorSearchClient` | Python FAISS bridge |
| `KnowledgeIngestService` | Document → chunks |
| `ChunkIndexWriter` | Manages index.json |
| `CachedRetriever` | Performance optimization |
Knowledge Flow:
Document → Version → Extract → Chunk → NDJSON → FAISS → Retrieval
---
## 2. Hybrid Retrieval (Production Architecture)
The system uses a **hybrid search architecture**:
### A) Keyword Retrieval (primary)
- Stopword filter
- Lemma logic
- Score calculation
- Deterministic weighting
### B) Vector Retrieval (supplementary)
- SentenceTransformer: `all-MiniLM-L6-v2`
- FAISS index (inner product)
- Normalized embeddings
- Top-K search
### Retrieval Flow
1. User prompt
2. Keyword scoring
3. FAISS search
4. Score fusion
5. Top-N chunks
6. Context assembly
7. LLM response
---
## 3. Knowledge Architecture
# 3. Directory Structure (Knowledge Layer)
```
var/knowledge/
├── uploads/
├── chunks/
├── manifest.json
├── index.json
├── index.ndjson
├── index_meta.json
├── vector.index
└── vector_meta.json
```
### Principles
- Documents are the primary source
- Chunks are derived artifacts
- `index.json` is the Single Source of Truth
- Re-ingest is deterministic
- No manual chunk manipulation
---
## 4. Vector Ingest
# 4. NDJSON Index
CLI Command:
## 4.1 index.ndjson
- Single Source of Truth
- One JSON object per line
- Streaming-readable
- No JSON array wrapper
- Scales beyond 200k chunks
Each line contains:
```json
{
"chunk_id": "uuid",
"document_id": "uuid",
"version": 3,
"text": "...",
"meta": { ... }
}
```
NDJSON enables:
- Append-based writes
- Compaction per document
- Memory-safe streaming
- Deterministic rebuilds
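
The streaming property can be illustrated with a minimal Python sketch. It assumes only that each non-empty line of `index.ndjson` is one JSON object as described above; the function name `iter_chunks` is hypothetical and not part of the project.

```python
import json
from typing import Iterator


def iter_chunks(path: str) -> Iterator[dict]:
    """Yield one chunk record per non-empty line without loading the whole file."""
    with open(path, "r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # Report the offending line so a broken record can be located
                raise ValueError(f"invalid JSON on line {line_number}") from exc
```

Because the function is a generator, memory use depends on the largest single record rather than on the total number of chunks.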
---
# 5. Index Metadata
## index_meta.json
Managed by:
- IndexMetaManager
- IndexConfiguration
Contains:
- index_version
- embedding_model
- embedding_dimension
- chunk_size
- overlap
- scoring_version
- index_format
If configuration changes → Global Reindex required.
Guarded by:
`IndexStructureChangedException`
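
A structure guard of this kind could look roughly like the following Python sketch. The key names come from the list above (`index_version` is left out because it is a counter rather than a configuration value); the class `IndexStructureChanged` and the function `assert_structure_unchanged` are hypothetical stand-ins, not the project's PHP implementation.

```python
# Keys whose change invalidates the existing index (taken from the list above).
STRUCTURAL_KEYS = (
    "embedding_model",
    "embedding_dimension",
    "chunk_size",
    "overlap",
    "scoring_version",
    "index_format",
)


class IndexStructureChanged(Exception):
    """Python stand-in for the PHP IndexStructureChangedException."""


def assert_structure_unchanged(stored_meta: dict, current_config: dict) -> None:
    """Raise if any structural key differs between stored meta and current config."""
    changed = [
        key for key in STRUCTURAL_KEYS
        if stored_meta.get(key) != current_config.get(key)
    ]
    if changed:
        raise IndexStructureChanged(
            "global reindex required, changed: " + ", ".join(changed)
        )
```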
---
# 6. Ingest Pipeline
## 6.1 Core Services
| Service | Responsibility |
|----------|----------------|
| DocumentService | Document lifecycle |
| DocumentVersionRepository | Version persistence |
| KnowledgeIngestService | Chunk generation |
| SimpleChunker | Deterministic splitting |
| TextNormalizer | Text cleanup |
| StopWords | Keyword filtering |
| ChunkManager | NDJSON append + compaction |
| ChunkWriter | Chunk persistence |
| IngestFlow | Step orchestration |
| IngestOrchestrator | Full ingest coordination |
| IngestJobService | Job tracking |
| LockService | Concurrency guard |
---
## 6.2 Local Ingest
Used when:
- A single document version changes
Process:
1. Extract document
2. Normalize text
3. Chunk deterministically
4. Remove previous chunks of document_id
5. Append new chunks to index.ndjson
6. Rebuild FAISS completely
index_version does NOT change.
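
Steps 4 and 5 can be sketched in Python as a single streaming rewrite. This is an illustration under assumptions, not the actual `ChunkManager` logic: the function name `replace_document_chunks` is hypothetical, and each record is assumed to carry a `document_id` field as shown in section 4.1.

```python
import json
import os
import tempfile


def replace_document_chunks(index_path, document_id, new_chunks):
    """Drop all existing chunks of document_id, then append the new ones."""
    directory = os.path.dirname(os.path.abspath(index_path))
    # Write to a temp file in the same directory so the final swap stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as out:
        if os.path.exists(index_path):
            with open(index_path, "r", encoding="utf-8") as src:
                for line in src:
                    if not line.strip():
                        continue
                    # Keep every record that belongs to a different document
                    if json.loads(line)["document_id"] != document_id:
                        out.write(line if line.endswith("\n") else line + "\n")
        for chunk in new_chunks:
            out.write(json.dumps(chunk, ensure_ascii=False) + "\n")
    # Replace the old index in one step; readers see either the old or the new file
    os.replace(tmp_path, index_path)
```

The full FAISS rebuild (step 6) would run after this function returns.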
---
## 6.3 Global Reindex
Used when:
- Embedding model changes
- Chunk size changes
- Overlap changes
- Scoring logic changes
- index_format changes
Process:
1. Re-extract all active document versions
2. Recreate full index.ndjson
3. Rebuild FAISS
4. index_version++
---
# 7. Vector Architecture
## 7.1 vector_ingest.py
Responsibilities:
- Stream-read index.ndjson
- Extract text + chunk_id
- Build embeddings
- Normalize embeddings
- Build FAISS IndexFlatIP
- Write vector.index
- Write vector_meta.json
Execution:
```bash
python vector_ingest.py --index path/to/index.ndjson --out path/to/vector.index
```
Characteristics:
- No partial updates
- No incremental mutation
- Always full rebuild
- Batch size = 64
- normalize_embeddings=True
---
## 7.2 vector_search.py
Responsibilities:
- Load vector.index
- Load vector_meta.json
- Encode query
- Search top-K
- Return JSON
Execution:
```bash
python vector_search.py "query" 5
```
Output:
```json
[
{ "chunk_id": "...", "score": 0.82 }
]
```
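
The search step can be pictured without FAISS: once embeddings are normalized, the inner product equals cosine similarity. The following pure-Python sketch is a brute-force stand-in for an inner-product index, not the project's `vector_search.py`. It takes vectors as plain lists instead of calling an embedding model, and the tie-break on `chunk_id` is an assumption added here to keep the ordering deterministic.

```python
import math


def normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def top_k_inner_product(query, vectors, chunk_ids, k):
    """Return the k best chunks by inner product of normalized vectors."""
    q = normalize(query)
    scored = [
        (sum(a * b for a, b in zip(q, normalize(v))), chunk_id)
        for v, chunk_id in zip(vectors, chunk_ids)
    ]
    # Highest score first; equal scores are ordered by chunk_id
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [{"chunk_id": cid, "score": score} for score, cid in scored[:k]]
```

The return value has the same shape as the JSON output shown above.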
---
## 7.3 VectorSearchClient (PHP)
- Executes Python search script
- Parses JSON response
- Returns structured results
- Handles timeout + error states
---
# 8. Hybrid Retrieval
## 8.1 Components
| Class | Role |
|--------|------|
| NdjsonHybridRetriever | Orchestrator |
| NdjsonKeywordSearch | Keyword scoring |
| NdjsonChunkLookup | Chunk resolution |
| VectorSearchClient | Vector bridge |
| CachedRetriever | Cache layer |
---
## 8.2 Retrieval Flow
1. Extract terms (StopWords + normalization)
2. Keyword scoring
3. Vector search
4. Score fusion
5. Limit to N chunks
6. Resolve chunk text
7. Build LLM context
Keyword score remains primary signal.
Vector score augments semantic similarity.
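
Score fusion (steps 4 and 5) might be sketched as a weighted sum in which the keyword score carries the larger weight. The weights `0.7` and `0.3`, the function name `fuse_scores`, and the assumption that both scores are already scaled to a comparable range are illustrative; the README does not specify the actual formula.

```python
def fuse_scores(keyword_scores, vector_scores, limit,
                keyword_weight=0.7, vector_weight=0.3):
    """Combine per-chunk scores and return the best `limit` (chunk_id, score) pairs."""
    chunk_ids = set(keyword_scores) | set(vector_scores)
    fused = {
        cid: keyword_weight * keyword_scores.get(cid, 0.0)
        + vector_weight * vector_scores.get(cid, 0.0)
        for cid in chunk_ids
    }
    # Sort by fused score descending, then chunk_id, so equal scores order deterministically
    ranked = sorted(fused.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]
```

A chunk found by only one of the two retrievers still takes part, with the missing score counted as zero.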
---
# 9. Document Extraction
Supported via:
- DocumentExtractorInterface
- ExtractorResolver
- PdfExtractor
- DocumentLoader
Extraction must return clean UTF-8 text.
Chunking must remain deterministic.
---
# 10. Admin Layer (Symfony)
## Controllers
- DashboardController
- DocumentController
- IngestJobController
- SecurityController
## Entities
- Document
- DocumentVersion
- IngestJob
- User
## Repositories
- DocumentVersionRepository
- UserRepository
---
# 11. Concurrency & Locks
LockService ensures:
- No parallel reindex
- No parallel ingest conflict
- Controlled mutation of index.ndjson
File-based or service-based locking.
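
A file-based variant could be sketched in Python as follows. This is not the project's `LockService`; `exclusive_lock` and `LockCollision` are hypothetical names, and the sketch does not handle stale locks left behind by a crashed process.

```python
import os
from contextlib import contextmanager


class LockCollision(Exception):
    """Raised when another ingest or reindex already holds the lock."""


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive lock file for the duration of the with-block."""
    try:
        # O_EXCL makes creation fail if the lock file already exists
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError as exc:
        raise LockCollision(lock_path) from exc
    try:
        # Record the owning process id to help with manual diagnosis
        os.write(fd, str(os.getpid()).encode("ascii"))
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

Usage would wrap every mutation of `index.ndjson`, for example `with exclusive_lock("var/knowledge/reindex.lock"): ...`, where the lock path is again an assumption.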
---
# 12. Determinism Rules
The system guarantees:
- Same documents + same config = identical index.ndjson
- Same index.ndjson = identical FAISS
- Same query + same index = identical results
No randomness.
No adaptive mutation.
No auto-learning.
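
One way to check the first guarantee in practice is to compare checksums of two independently built `index.ndjson` files. The sketch below uses SHA-256 from the Python standard library; the helper name `file_digest` is hypothetical, and the README does not state that the project performs such a check.

```python
import hashlib


def file_digest(path, block_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Two rebuilds from the same documents and configuration should produce equal digests; a mismatch points to a source of non-determinism in the ingest pipeline.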
---
# 13. LLM Integration
- Context strictly limited to retrieved chunks
- PromptBuilder constructs deterministic system prompt
- ContextService manages history
- SSE streaming enabled
- Model endpoint configurable
LLM never has direct access to full knowledge base.
Only retrieved chunks are injected.
---
# 14. Scalability
Designed for:
- >200k chunks
- Streaming NDJSON reads
- Full FAISS rebuild
- Cache layer for retrieval
- Controlled memory usage
No full-array JSON loads.
---
# 15. Failure Modes
Handled via:
- Missing vector index detection
- Structure drift detection
- Lock collision detection
- Embedding dependency checks
- Python execution errors
- Empty chunk fallback
---
# 16. Non-Goals
This system intentionally does NOT include:
- Online learning
- Embedding mutation
- Incremental FAISS update
- Auto chunk merging
- Self-modifying prompts
All structural changes require explicit reindex.
---
# 17. Design Philosophy
This is a governance-first RAG architecture:
- Deterministic
- Reproducible
- Drift-safe
- Audit-friendly
- Version-controlled
It prioritizes correctness and control over dynamic mutation.
---
# 18. Development Guidelines
When extending the system:
- Never mutate FAISS directly
- Never edit index.ndjson manually
- Always preserve determinism
- Increment index_version only via Global Reindex
- Guard all structural changes
- Maintain streaming compatibility
---
# 19. CLI Commands (Symfony)
Example:
```bash
php bin/console mto:agent:vector:ingest
```
Procedure:
1. Read index.json
2. Load chunk texts
3. Generate embeddings
4. Build FAISS index
5. Save vector.index
6. Write vector_meta.json
---
## 5. LLM Integration
Via Ollama by default.
Configuration via ENV:
Custom commands follow namespace:
```
AI_LLM_API_URL=
AI_LLM_MODEL=
AI_LLM_TIMEOUT=
AI_DEBUG=
AI_LOG_PROMPT=
AI_LOG_CONTEXT=
AI_HISTORY_DIR=
mto:agent:*
```
Features:
- Streaming-capable
- Configurable timeout
- Thinking mode can be suppressed
- History integration
---
## 6. Frontend
# 20. Summary
Technology:
This system is a deterministic, enterprise-grade hybrid RAG engine with:
- Bootstrap
- Marked (Markdown)
- DOMPurify
- SSE Streaming
- NDJSON-based streaming index
- Full FAISS rebuild strategy
- Structured ingest pipeline
- Hybrid retrieval
- Admin governance layer
- Strict guardrails
Features:
- Live streaming
- Markdown rendering
- Cancel function
- Chat history
- Client ID via cookie
- History deletion
---
## 7. Logging & Debug
Log file:
```
var/log/agent.log
```
Can be enabled optionally:
- Prompt logging
- Context logging
- Debug mode
---
# Security & Governance
- Role model (Super Admin / Knowledge Admin / Editorial)
- Versioned documents
- Versioned ingest profiles
- Versioned system prompts
- Abstracted AI endpoint
- Audit logs
- Lock mechanisms during reindex
---
# Product Status
The system is:
- Production-ready
- Framework-neutral
- Customer-ready
- Scalable
- Extensible (admin area planned)
Not included:
- Autonomous fine-tuning
- Live learning system
- Self-modifying knowledge
---
# Difference from Generic AI Tools
| Generic AI | mitho AI Agent |
|---------------|----------------|
| trained on the internet | based on your knowledge |
| no governance | full control |
| no versioning | document versioning |
| not traceable | transparent knowledge base |
| generic | company-specific |
---
# Minimum Requirements
- PHP 8.2+
- Python 3.9+
- faiss
- sentence-transformers
- Ollama (or a compatible LLM)
---
# Vision
This system forms the foundation for:
- Agentic commerce
- Internal knowledge systems
- Support automation
- Sales assistance
- Technical documentation AI
- GDPR-compliant enterprise AI
---
# Conclusion
The mitho AI Agent is not a toy chatbot.
It is a structured, controlled AI system with a clear knowledge base, deterministic retrieval, and professional architecture, built for productive enterprise use.
It is designed for controlled enterprise deployment, not experimental AI workflows.