mitho AI Agent – Developer Deep Dive
Enterprise Hybrid RAG System (Symfony + NDJSON + FAISS)
1. System Overview
This system implements a deterministic, governance-stable Retrieval Augmented Generation (RAG) architecture based on:
- Symfony (PHP backend)
- NDJSON-based knowledge index
- Full FAISS vector rebuild strategy
- Hybrid retrieval (keyword + vector)
- Deterministic ingest pipeline
- Strict versioning & guardrails
- Lock-based reindex protection
No incremental vector mutation is allowed.
FAISS is always rebuilt from index.ndjson.
2. High-Level Architecture
User Query → Hybrid Retrieval → Context Assembly → Prompt Builder → LLM → Streaming Response (SSE)
Knowledge Flow: Document → Version → Extract → Chunk → NDJSON → FAISS → Retrieval
3. Directory Structure (Knowledge Layer)
var/knowledge/
├── uploads/
├── chunks/
├── index.ndjson
├── index_meta.json
├── vector.index
└── vector_meta.json
4. NDJSON Index
4.1 index.ndjson
- Single Source of Truth
- One JSON object per line
- Streaming-readable
- No JSON array wrapper
- Scales beyond 200k chunks
Each line contains:
{
"chunk_id": "uuid",
"document_id": "uuid",
"version": 3,
"text": "...",
"meta": { ... }
}
NDJSON enables:
- Append-based writes
- Compaction per document
- Memory-safe streaming
- Deterministic rebuilds
5. Index Metadata
index_meta.json
Managed by:
- IndexMetaManager
- IndexConfiguration
Contains:
- index_version
- embedding_model
- embedding_dimension
- chunk_size
- overlap
- scoring_version
- index_format
If configuration changes → Global Reindex required.
Guarded by:
IndexStructureChangedException
6. Ingest Pipeline
6.1 Core Services
| Service | Responsibility |
|---|---|
| DocumentService | Document lifecycle |
| DocumentVersionRepository | Version persistence |
| KnowledgeIngestService | Chunk generation |
| SimpleChunker | Deterministic splitting |
| TextNormalizer | Text cleanup |
| StopWords | Keyword filtering |
| ChunkManager | NDJSON append + compaction |
| ChunkWriter | Chunk persistence |
| IngestFlow | Step orchestration |
| IngestOrchestrator | Full ingest coordination |
| IngestJobService | Job tracking |
| LockService | Concurrency guard |
6.2 Local Ingest
Used when:
- A single document version changes
Process:
- Extract document
- Normalize text
- Chunk deterministically
- Remove previous chunks of document_id
- Append new chunks to index.ndjson
- Rebuild FAISS completely
index_version does NOT change.
6.3 Global Reindex
Used when:
- Embedding model changes
- Chunk size changes
- Overlap changes
- Scoring logic changes
- index_format changes
Process:
- Re-extract all active document versions
- Recreate full index.ndjson
- Rebuild FAISS
- index_version++
7. Vector Architecture
7.1 vector_ingest.py
Responsibilities:
- Stream-read index.ndjson
- Extract text + chunk_id
- Build embeddings
- Normalize embeddings
- Build FAISS IndexFlatIP
- Write vector.index
- Write vector.meta.json
Execution:
python vector_ingest.py --index path/to/index.ndjson --out path/to/vector.index
Characteristics:
- No partial updates
- No incremental mutation
- Always full rebuild
- Batch size = 64
- normalize_embeddings=True
7.2 vector_search.py
Responsibilities:
- Load vector.index
- Load vector_meta.json
- Encode query
- Search top-K
- Return JSON
Execution:
python vector_search.py "query" 5
Output:
[
{ "chunk_id": "...", "score": 0.82 }
]
7.3 VectorSearchClient (PHP)
- Executes Python search script
- Parses JSON response
- Returns structured results
- Handles timeout + error states
8. Hybrid Retrieval
8.1 Components
| Class | Role |
|---|---|
| NdjsonHybridRetriever | Orchestrator |
| NdjsonKeywordSearch | Keyword scoring |
| NdjsonChunkLookup | Chunk resolution |
| VectorSearchClient | Vector bridge |
| CachedRetriever | Cache layer |
8.2 Retrieval Flow
- Extract terms (StopWords + normalization)
- Keyword scoring
- Vector search
- Score fusion
- Limit to N chunks
- Resolve chunk text
- Build LLM context
Keyword score remains primary signal. Vector score augments semantic similarity.
9. Document Extraction
Supported via:
- DocumentExtractorInterface
- ExtractorResolver
- PdfExtractor
- DocumentLoader
Extraction must return clean UTF-8 text. Chunking must remain deterministic.
10. Admin Layer (Symfony)
Controllers
- DashboardController
- DocumentController
- IngestJobController
- SecurityController
Entities
- Document
- DocumentVersion
- IngestJob
- User
Repositories
- DocumentVersionRepository
- UserRepository
11. Concurrency & Locks
LockService ensures:
- No parallel reindex
- No parallel ingest conflict
- Controlled mutation of index.ndjson
File-based or service-based locking.
12. Determinism Rules
The system guarantees:
- Same documents + same config = identical index.ndjson
- Same index.ndjson = identical FAISS
- Same query + same index = identical results
No randomness. No adaptive mutation. No auto-learning.
13. LLM Integration
- Context strictly limited to retrieved chunks
- PromptBuilder constructs deterministic system prompt
- ContextService manages history
- SSE streaming enabled
- Model endpoint configurable
LLM never has direct access to full knowledge base. Only retrieved chunks are injected.
14. Scalability
Designed for:
-
200k chunks
- Streaming NDJSON reads
- Full FAISS rebuild
- Cache layer for retrieval
- Controlled memory usage
No full-array JSON loads.
15. Failure Modes
Handled via:
- Missing vector index detection
- Structure drift detection
- Lock collision detection
- Embedding dependency checks
- Python execution errors
- Empty chunk fallback
16. Non-Goals
This system intentionally does NOT include:
- Online learning
- Embedding mutation
- Incremental FAISS update
- Auto chunk merging
- Self-modifying prompts
All structural changes require explicit reindex.
17. Design Philosophy
This is a governance-first RAG architecture:
- Deterministic
- Reproducible
- Drift-safe
- Audit-friendly
- Version-controlled
It prioritizes correctness and control over dynamic mutation.
18. Development Guidelines
When extending the system:
- Never mutate FAISS directly
- Never edit index.ndjson manually
- Always preserve determinism
- Increment index_version only via Global Reindex
- Guard all structural changes
- Maintain streaming compatibility
19. CLI Commands (Symfony)
Example:
php bin/console mto:agent:vector:ingest
Custom commands follow namespace:
mto:agent:*
20. Summary
This system is a deterministic, enterprise-grade hybrid RAG engine with:
- NDJSON-based streaming index
- Full FAISS rebuild strategy
- Structured ingest pipeline
- Hybrid retrieval
- Admin governance layer
- Strict guardrails
It is designed for controlled enterprise deployment, not experimental AI workflows.