Harden code, add messenger services, and add new README.md and SYSTEM.md
604
README.md
@@ -1,250 +1,430 @@
# mitho AI Agent (Alpha Version)
**Symfony-based hybrid RAG system with vector and keyword retrieval**
# mitho AI Agent – Developer Deep Dive

Enterprise Hybrid RAG System (Symfony + NDJSON + FAISS)

---
## Overview
# 1. System Overview

The **mitho AI Agent** is a production-ready, Symfony-based RAG (Retrieval Augmented Generation) system that does not let the AI "guess" freely but generates answers strictly from a controlled knowledge pool.

> **Guiding principle:**
> *"We do not use AI to guess creatively, but to answer reliably on the basis of your knowledge."*

This system implements a deterministic, governance-stable Retrieval Augmented Generation (RAG) architecture based on:

- Symfony (PHP backend)
- NDJSON-based knowledge index
- Full FAISS vector rebuild strategy
- Hybrid retrieval (keyword + vector)
- Deterministic ingest pipeline
- Strict versioning & guardrails
- Lock-based reindex protection

The system combines:

- Large Language Model (LLM, e.g. Qwen via Ollama)
- Keyword-based retrieval
- FAISS vector search
- Versioned knowledge structure (chunks + index)
- Streaming output via Server-Sent Events (SSE)
- Persistent chat history per client

No incremental vector mutation is allowed.
FAISS is always rebuilt from `index.ndjson`.

---
# Architecture
# 2. High-Level Architecture

## 1. Backend

User Query
→ Hybrid Retrieval
→ Context Assembly
→ Prompt Builder
→ LLM
→ Streaming Response (SSE)

**Technology**

- PHP 8.2+
- Symfony 7.4
- Monolog Logging
- Symfony Cache
- Session Support

### Core Components

| Component | Responsibility |
|------------|----------|
| `AgentRunner` | Orchestrates prompt, context, and LLM |
| `PromptBuilder` | Builds the system and user prompts |
| `ContextService` | History management |
| `ChunkKeywordRetriever` | Keyword scoring |
| `VectorSearchClient` | Python FAISS bridge |
| `KnowledgeIngestService` | Document → chunks |
| `ChunkIndexWriter` | Manages `index.json` |
| `CachedRetriever` | Performance optimization |

Knowledge Flow:
Document → Version → Extract → Chunk → NDJSON → FAISS → Retrieval

---
## 2. Hybrid Retrieval (Production Architecture)

The system uses a **hybrid search architecture**:

### A) Keyword Retrieval (primary)

- Stopword filter
- Lemma logic
- Score calculation
- Deterministic weighting

### B) Vector Retrieval (supplementary)

- SentenceTransformer: `all-MiniLM-L6-v2`
- FAISS index (inner product)
- Normalized embeddings
- Top-K search

### Retrieval Flow

1. User prompt
2. Keyword scoring
3. FAISS search
4. Score fusion
5. Top-N chunks
6. Context assembly
7. LLM answer

---
## 3. Knowledge Architecture
# 3. Directory Structure (Knowledge Layer)

```
var/knowledge/
├── uploads/
├── chunks/
├── manifest.json
├── index.json
├── index.ndjson
├── index_meta.json
├── vector.index
└── vector_meta.json
```

### Principles

- Documents are the primary source
- Chunks are derived artifacts
- `index.json` is the single source of truth
- Re-ingest is deterministic
- No manual chunk manipulation

---
## 4. Vector Ingest
# 4. NDJSON Index

CLI Command:
## 4.1 index.ndjson

- Single Source of Truth
- One JSON object per line
- Streaming-readable
- No JSON array wrapper
- Scales beyond 200k chunks

Each line contains one record (pretty-printed here for readability; in the file it occupies a single line):

```json
{
  "chunk_id": "uuid",
  "document_id": "uuid",
  "version": 3,
  "text": "...",
  "meta": { ... }
}
```

NDJSON enables:
- Append-based writes
- Compaction per document
- Memory-safe streaming
- Deterministic rebuilds
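
The streaming access pattern described above can be sketched in a few lines of Python. This is a minimal illustration that assumes only the record fields shown in this section; the function names `iter_chunks` and `append_chunk` are hypothetical and do not come from the project, whose PHP services may work differently.

```python
import json
from typing import Iterator


def iter_chunks(path: str) -> Iterator[dict]:
    """Yield one chunk record per non-empty line without loading the whole file."""
    with open(path, "r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerating blank lines is an assumption of this sketch
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"invalid JSON on line {line_number}: {exc}") from exc


def append_chunk(path: str, record: dict) -> None:
    """Append a single record as exactly one line (no array wrapper)."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Because the reader is a generator, memory use stays proportional to one record rather than to the size of the index.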

---

# 5. Index Metadata

## index_meta.json

Managed by:

- IndexMetaManager
- IndexConfiguration

Contains:

- index_version
- embedding_model
- embedding_dimension
- chunk_size
- overlap
- scoring_version
- index_format

If configuration changes → Global Reindex required.

Guarded by:
`IndexStructureChangedException`
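
A hedged sketch of what such a guard might look like, written in Python for consistency with the vector scripts. The class `IndexStructureChanged` is only a stand-in for the PHP `IndexStructureChangedException`, and the field types, `IndexConfig`, and `assert_structure_unchanged` are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass


class IndexStructureChanged(Exception):
    """Stand-in for the PHP IndexStructureChangedException (hypothetical)."""


@dataclass(frozen=True)
class IndexConfig:
    # Structural fields listed under index_meta.json; the types are assumed.
    embedding_model: str
    embedding_dimension: int
    chunk_size: int
    overlap: int
    scoring_version: int
    index_format: str


def assert_structure_unchanged(stored_meta: dict, current: IndexConfig) -> None:
    """Raise if any structural field differs from the values the index was built with."""
    changed = sorted(
        key for key, value in asdict(current).items() if stored_meta.get(key) != value
    )
    if changed:
        raise IndexStructureChanged(
            "global reindex required, changed fields: " + ", ".join(changed)
        )
```

`index_version` is deliberately not compared here: it is a counter that results from a reindex, not a setting that triggers one.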

---

# 6. Ingest Pipeline

## 6.1 Core Services

| Service | Responsibility |
|----------|----------------|
| DocumentService | Document lifecycle |
| DocumentVersionRepository | Version persistence |
| KnowledgeIngestService | Chunk generation |
| SimpleChunker | Deterministic splitting |
| TextNormalizer | Text cleanup |
| StopWords | Keyword filtering |
| ChunkManager | NDJSON append + compaction |
| ChunkWriter | Chunk persistence |
| IngestFlow | Step orchestration |
| IngestOrchestrator | Full ingest coordination |
| IngestJobService | Job tracking |
| LockService | Concurrency guard |

---

## 6.2 Local Ingest

Used when:
- A single document version changes

Process:

1. Extract document
2. Normalize text
3. Chunk deterministically
4. Remove previous chunks of document_id
5. Append new chunks to index.ndjson
6. Rebuild FAISS completely

index_version does NOT change.
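
Steps 4 and 5 above amount to a per-document compaction. The following Python sketch shows one way to do this safely, assuming the whole rewrite happens through a temporary file that is swapped in atomically. The function name `replace_document_chunks` is hypothetical, and the project's `ChunkManager` may implement compaction differently.

```python
import json
import os
import tempfile


def replace_document_chunks(index_path: str, document_id: str, new_chunks: list) -> None:
    """Drop all records of document_id, append new_chunks, then swap the file in."""
    directory = os.path.dirname(os.path.abspath(index_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as out:
            if os.path.exists(index_path):
                with open(index_path, "r", encoding="utf-8") as src:
                    for line in src:  # streamed, never loaded as a whole
                        if not line.strip():
                            continue
                        if json.loads(line).get("document_id") != document_id:
                            out.write(line if line.endswith("\n") else line + "\n")
            for chunk in new_chunks:
                out.write(json.dumps(chunk, ensure_ascii=False) + "\n")
        # os.replace is atomic when source and target are on the same filesystem.
        os.replace(tmp_path, index_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Readers either see the old index or the new one, never a half-written file; the FAISS rebuild in step 6 would run afterwards.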

---

## 6.3 Global Reindex

Used when:
- Embedding model changes
- Chunk size changes
- Overlap changes
- Scoring logic changes
- index_format changes

Process:

1. Re-extract all active document versions
2. Recreate full index.ndjson
3. Rebuild FAISS
4. index_version++

---
# 7. Vector Architecture

## 7.1 vector_ingest.py

Responsibilities:

- Stream-read index.ndjson
- Extract text + chunk_id
- Build embeddings
- Normalize embeddings
- Build FAISS IndexFlatIP
- Write vector.index
- Write vector_meta.json

Execution:

```bash
python vector_ingest.py --index path/to/index.ndjson --out path/to/vector.index
```

Characteristics:

- No partial updates
- No incremental mutation
- Always full rebuild
- Batch size = 64
- normalize_embeddings=True
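
Why normalization matters: an inner-product index over L2-normalized vectors ranks by cosine similarity. The sketch below demonstrates this with a pure-Python exhaustive search standing in for FAISS `IndexFlatIP`. It is not the contents of `vector_ingest.py` and does not use the FAISS API; `batched`, `l2_normalize`, and `flat_ip_search` are hypothetical helper names.

```python
import math


def batched(items, size=64):
    """Group items into lists of at most `size` (64 mirrors the documented batch size)."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def flat_ip_search(query, vectors, ids, k):
    """Exhaustive inner-product search; ties are broken by chunk_id for stable output."""
    q = l2_normalize(query)
    scored = [
        (chunk_id, sum(a * b for a, b in zip(q, l2_normalize(vec))))
        for chunk_id, vec in zip(ids, vectors)
    ]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]
```

With unit-length vectors the scores fall in the range -1 to 1, which makes them comparable across queries.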

---

## 7.2 vector_search.py

Responsibilities:

- Load vector.index
- Load vector_meta.json
- Encode query
- Search top-K
- Return JSON

Execution:

```bash
python vector_search.py "query" 5
```

Output:

```json
[
  { "chunk_id": "...", "score": 0.82 }
]
```

---

## 7.3 VectorSearchClient (PHP)

- Executes Python search script
- Parses JSON response
- Returns structured results
- Handles timeout + error states

---
# 8. Hybrid Retrieval

## 8.1 Components

| Class | Role |
|--------|------|
| NdjsonHybridRetriever | Orchestrator |
| NdjsonKeywordSearch | Keyword scoring |
| NdjsonChunkLookup | Chunk resolution |
| VectorSearchClient | Vector bridge |
| CachedRetriever | Cache layer |

---

## 8.2 Retrieval Flow

1. Extract terms (StopWords + normalization)
2. Keyword scoring
3. Vector search
4. Score fusion
5. Limit to N chunks
6. Resolve chunk text
7. Build LLM context

Keyword score remains primary signal.
Vector score augments semantic similarity.
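
The document does not specify the fusion formula, so the following Python sketch is only one plausible reading of steps 4 and 5: a weighted sum in which the keyword score carries the larger weight. The weights `0.7` and `0.3`, the max-scaling, and the name `fuse_scores` are all assumptions for illustration.

```python
def fuse_scores(keyword_scores, vector_scores,
                keyword_weight=0.7, vector_weight=0.3, limit=5):
    """Combine two {chunk_id: score} maps into a ranked list of (chunk_id, score)."""

    def scale(scores):
        # Divide by the best score so both signals share a comparable range.
        top = max(scores.values(), default=0.0)
        return {cid: (s / top if top > 0 else 0.0) for cid, s in scores.items()}

    kw, vec = scale(keyword_scores), scale(vector_scores)
    fused = {
        cid: keyword_weight * kw.get(cid, 0.0) + vector_weight * vec.get(cid, 0.0)
        for cid in set(kw) | set(vec)
    }
    # Sorting by (-score, chunk_id) keeps the order stable when scores tie.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))[:limit]
```

The explicit tie-break on `chunk_id` is what keeps the result order reproducible for identical inputs.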

---

# 9. Document Extraction

Supported via:

- DocumentExtractorInterface
- ExtractorResolver
- PdfExtractor
- DocumentLoader

Extraction must return clean UTF-8 text.
Chunking must remain deterministic.
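
To make "deterministic chunking" concrete, here is a minimal Python sketch of a fixed-size splitter driven by the `chunk_size` and `overlap` settings. It assumes character-based windows; the unit used by the project's `SimpleChunker` is not stated, and `chunk_text` is a hypothetical name.

```python
def chunk_text(text, chunk_size, overlap):
    """Split text into fixed-size windows that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

The output depends only on the input text and the two parameters, so re-ingesting the same document with the same configuration yields the same chunks.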

---

# 10. Admin Layer (Symfony)

## Controllers

- DashboardController
- DocumentController
- IngestJobController
- SecurityController

## Entities

- Document
- DocumentVersion
- IngestJob
- User

## Repositories

- DocumentVersionRepository
- UserRepository

---

# 11. Concurrency & Locks

LockService ensures:

- No parallel reindex
- No parallel ingest conflict
- Controlled mutation of index.ndjson

File-based or service-based locking.
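
A file-based lock of the kind mentioned above can be sketched as follows. This is an illustrative Python version that relies on exclusive file creation; `exclusive_lock` and `LockHeld` are hypothetical names, and the project's `LockService` may use a different mechanism.

```python
import os
from contextlib import contextmanager


class LockHeld(Exception):
    """Raised when another process already owns the lock file."""


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive lock for the duration of the with-block."""
    try:
        # O_CREAT | O_EXCL fails if the file exists, so creation doubles as the lock test.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError as exc:
        raise LockHeld(f"lock already held: {lock_path}") from exc
    try:
        os.write(fd, str(os.getpid()).encode("ascii"))
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A sketch like this does not recover from stale lock files left behind by a crashed process; a production lock would need an expiry or ownership check.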

---

# 12. Determinism Rules

The system guarantees:

- Same documents + same config = identical index.ndjson
- Same index.ndjson = identical FAISS
- Same query + same index = identical results

No randomness.
No adaptive mutation.
No auto-learning.

---

# 13. LLM Integration

- Context strictly limited to retrieved chunks
- PromptBuilder constructs deterministic system prompt
- ContextService manages history
- SSE streaming enabled
- Model endpoint configurable

LLM never has direct access to full knowledge base.
Only retrieved chunks are injected.
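
The idea that only retrieved chunks reach the model can be illustrated with a small Python sketch. The prompt layout, the character budget `max_context_chars`, and the name `build_prompt` are assumptions; the actual `PromptBuilder` is a PHP class whose format is not documented here.

```python
def build_prompt(system_prompt, chunks, question, max_context_chars=4000):
    """Assemble a prompt from already-ranked chunks, in order, within a size budget."""
    parts = []
    used = 0
    for chunk in chunks:
        block = f"[{chunk['chunk_id']}]\n{chunk['text']}"
        if used + len(block) > max_context_chars:
            break  # stop at the first chunk that would exceed the budget
        parts.append(block)
        used += len(block)
    context = "\n\n".join(parts)
    return f"{system_prompt}\n\nContext:\n{context}\n\nQuestion: {question}"
```

Since the function is a pure string transformation, identical retrieval results always produce an identical prompt.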

---

# 14. Scalability

Designed for:

- >200k chunks
- Streaming NDJSON reads
- Full FAISS rebuild
- Cache layer for retrieval
- Controlled memory usage

No full-array JSON loads.

---

# 15. Failure Modes

Handled via:

- Missing vector index detection
- Structure drift detection
- Lock collision detection
- Embedding dependency checks
- Python execution errors
- Empty chunk fallback

---

# 16. Non-Goals

This system intentionally does NOT include:

- Online learning
- Embedding mutation
- Incremental FAISS update
- Auto chunk merging
- Self-modifying prompts

All structural changes require explicit reindex.

---
# 17. Design Philosophy

This is a governance-first RAG architecture:

- Deterministic
- Reproducible
- Drift-safe
- Audit-friendly
- Version-controlled

It prioritizes correctness and control over dynamic mutation.

---

# 18. Development Guidelines

When extending the system:

- Never mutate FAISS directly
- Never edit index.ndjson manually
- Always preserve determinism
- Increment index_version only via Global Reindex
- Guard all structural changes
- Maintain streaming compatibility

---
# 19. CLI Commands (Symfony)

Example:

```bash
php bin/console mto:agent:vector:ingest
```

Process:

1. Read index.json
2. Load chunk texts
3. Generate embeddings
4. Build the FAISS index
5. Save vector.index
6. Write vector_meta.json

---
## 5. LLM Integration

Ollama is used by default.

Configuration via ENV:

```
AI_LLM_API_URL=
AI_LLM_MODEL=
AI_LLM_TIMEOUT=
AI_DEBUG=
AI_LOG_PROMPT=
AI_LOG_CONTEXT=
AI_HISTORY_DIR=
```

Custom commands follow the namespace:

```
mto:agent:*
```

Features:

- Streaming-capable
- Configurable timeout
- Thinking mode can be suppressed
- History integration

---
## 6. Frontend
# 20. Summary

Technology:

- Bootstrap
- Marked (Markdown)
- DOMPurify
- SSE Streaming

This system is a deterministic, enterprise-grade hybrid RAG engine with:

- NDJSON-based streaming index
- Full FAISS rebuild strategy
- Structured ingest pipeline
- Hybrid retrieval
- Admin governance layer
- Strict guardrails

Features:

- Live streaming
- Markdown rendering
- Cancel function
- Chat history
- Client ID via cookie
- History deletion

---
## 7. Logging & Debug

Log file:

```
var/log/agent.log
```

Can be enabled optionally:

- Prompt logging
- Context logging
- Debug mode

---
# Security & Governance

- Role model (Super Admin / Knowledge Admin / Editor)
- Versioned documents
- Versioned ingest profiles
- Versioned system prompts
- AI endpoint abstracted
- Audit logs
- Lock mechanisms during reindex

---
# Product Status

The system is:

- Production-ready
- Framework-neutral
- Customer-ready
- Scalable
- Extensible (admin area planned)

Not included:

- Autonomous fine-tuning
- Live learning system
- Self-modifying knowledge

---
# Difference from Generic AI Tools

| Generic AI | mitho AI Agent |
|---------------|----------------|
| trained on the internet | based on your knowledge |
| no governance | full control |
| no versioning | document versioning |
| not traceable | transparent knowledge base |
| generic | company-specific |

---
# Minimum Requirements

- PHP 8.2+
- Python 3.9+
- faiss
- sentence-transformers
- Ollama (or a compatible LLM)

---
# Vision

This system forms the foundation for:

- Agentic commerce
- Internal knowledge systems
- Support automation
- Sales assistance
- Technical documentation AI
- GDPR-compliant enterprise AI

---
# Conclusion

The mitho AI Agent is not a toy chatbot.

It is a structured, controlled AI system with a clear knowledge base, deterministic retrieval, and a professional architecture, built for productive enterprise use.

It is designed for controlled enterprise deployment, not experimental AI workflows.