Architecture v4.12.0

Technical
Infrastructure

The high-precision instrument powering enterprise document intelligence. Built for scale, security, and sub-second latency.

Response Latency

Global Edge Network Performance

< 200ms
p95 API Response
< 5 min
AI Processing (async)
327+
Test Cases (72% coverage)

Kernel Stack

  • FastAPI + Python 3.12
  • PostgreSQL 16 + pgvector
  • Redis 7 Alpine
  • MinIO (S3-compat)
  • ZITADEL (OIDC)
  • Celery + WebSocket + SSE

"Architecture optimized for asynchronous I/O and vector-search embedding throughput."

System Topology

Stage 01

API + Auth Layer

FastAPI on Uvicorn (ASGI). OIDC JWT validated against ZITADEL JWKS endpoint on every request. Permission cache in Redis for sub-millisecond lookups.

Authorization: Bearer <oidc_jwt>
# Base: /api/v1/
# Swagger: /api/docs
Stage 02

AI Processing

Celery async workers: text extraction (pdfplumber/python-docx), LangChain classification + summarization, OpenAI embeddings, pgvector chunk indexing.
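
The chunk-indexing step feeding pgvector can be sketched as a sliding-window splitter over the extracted text. The window size and overlap below are illustrative defaults, not the pipeline's published parameters:

```python
def chunk_text(text: str, size: int = 800, overlap: int = 200) -> list[str]:
    """Split text into overlapping windows for embedding + pgvector indexing."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    chunks = [text[i:i + size] for i in range(0, len(text), step)]
    # Drop a trailing fragment already fully contained in the previous chunk.
    if len(chunks) > 1 and len(chunks[-1]) <= overlap:
        chunks.pop()
    return chunks
```

Each chunk would then be embedded (1536-dimensional OpenAI vectors, per the config below) and inserted as one pgvector row.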

Stage 03

Data Layer

PostgreSQL 16 + pgvector for primary data and semantic search. Redis 7 for Celery queues and permission cache. MinIO (S3-compatible) for all file objects.
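
A semantic-search query against this layer might look like the following. The table and column names are hypothetical (the actual schema is not published); `<=>` is pgvector's cosine-distance operator, and the `conn` parameter is assumed to be an asyncpg-style connection.

```python
# Hypothetical schema; shown to illustrate the query shape only.
SEMANTIC_SEARCH_SQL = """
SELECT chunk_id, document_id, content,
       embedding <=> $1 AS distance   -- pgvector cosine distance
FROM document_chunks
WHERE org_id = $2                     -- tenant isolation in the query itself
ORDER BY embedding <=> $1
LIMIT $3;
"""

async def search_chunks(conn, query_vec: list[float], org_id: str, k: int = 5):
    """k-nearest-neighbour search over chunk embeddings for one tenant."""
    return await conn.fetch(SEMANTIC_SEARCH_SQL, query_vec, org_id, k)
```

Filtering on `org_id` inside the SQL mirrors the ORM-level multi-tenant isolation noted below.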

Stage 04

Real-Time Layer

WebSocket endpoint for live presence, cursor tracking, and lock notifications. SSE streaming for document Q&A token-by-token delivery.
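
Token-by-token delivery reduces to framing each model token in the `text/event-stream` wire format. A minimal, framework-agnostic sketch (the `done` sentinel is an assumed convention, not a documented one):

```python
from typing import Iterable, Iterator

def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Frame each token as a Server-Sent Event; close with a sentinel event."""
    for token in tokens:
        yield f"data: {token}\n\n"
    yield "event: done\ndata: [DONE]\n\n"
```

In FastAPI this generator would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.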


Scalability & Resource Profiles

| Tier | Compute Node | Storage Throughput | Max Concurrency | Deployment |
| --- | --- | --- | --- | --- |
| Sandbox | 2 vCPU / 4 GB RAM | 250 MB/s | 50 ops/sec | Docker Compose |
| Standard | 8 vCPU / 32 GB RAM | 1.2 GB/s | 450 ops/sec | K8s / Helm v3 |
| Enterprise High-Perf | 32 vCPU / 128 GB RAM | 5.5 GB/s (NVMe) | 2,500+ ops/sec | Bare Metal / K8s |
Neural Engine

AI Processing Throughput

The AI pipeline runs as a Celery async task. Classification and summarization complete within 5 minutes of upload. Files are streamed directly to MinIO — the API server never buffers file bytes in memory. Permission lookups are cached in Redis and invalidated immediately on mutation.
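
The cache-aside pattern described above, read-through on miss with immediate invalidation on mutation, can be sketched with a dict standing in for Redis:

```python
from typing import Callable

class PermissionCache:
    """Cache-aside permission lookups; a dict stands in for Redis here."""

    def __init__(self, loader: Callable[[str], set[str]]):
        self._store: dict[str, set[str]] = {}   # key -> permission set
        self._loader = loader                   # falls back to the database

    def get(self, user_id: str) -> set[str]:
        if user_id not in self._store:          # cache miss: load and fill
            self._store[user_id] = self._loader(user_id)
        return self._store[user_id]

    def invalidate(self, user_id: str) -> None:
        """Called immediately after any permission mutation."""
        self._store.pop(user_id, None)
```

Invalidating rather than updating in place keeps the mutation path simple: the next read repopulates the entry from the source of truth.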

  • Alembic zero-downtime DB migrations (expand/contract)
  • Full version history — no version ever deleted
  • Multi-tenant ORM-level org_id isolation
  • LangChain pluggable AI provider (OpenAI / Claude / Gemini / Ollama)
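
The pluggable-provider point above is essentially a registry/factory keyed by the config's `provider` string. A minimal sketch, with placeholder names (the real system would register LangChain chat-model constructors):

```python
from typing import Callable, Protocol

class ChatModel(Protocol):
    def invoke(self, prompt: str) -> str: ...

# Registry of provider name -> constructor. Entries here are illustrative.
_PROVIDERS: dict[str, Callable[[], ChatModel]] = {}

def register(name: str):
    """Decorator registering a model factory under a config-facing name."""
    def deco(factory: Callable[[], ChatModel]):
        _PROVIDERS[name] = factory
        return factory
    return deco

def make_model(provider: str) -> ChatModel:
    """Build the model named by pipeline_config's `provider` field."""
    try:
        return _PROVIDERS[provider]()
    except KeyError:
        raise ValueError(f"unknown AI provider: {provider!r}") from None
```

Swapping OpenAI for Claude, Gemini, or Ollama then becomes a one-line config change rather than a code change.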
pipeline_config.yaml

version: "2.4.0-stable"
ai_pipeline:
  provider: "openai"            # or anthropic / gemini / ollama
  process_on_upload: true
  embedding_dimensions: 1536
storage:
  database: "postgres+asyncpg://..."
  cache: "redis://redis:6379"
  object_store: "minio:9000"
guarantees:
  api_p95_latency: "<200ms"
  ai_processing_sla: "<5min"

Ready for Infrastructure Integration?

Detailed API documentation, OpenAPI schemas, and deployment scripts are available in the partner portal.