From upload to insight in under five minutes.
A full technical walkthrough of the VyXlo Content Service Platform — from OIDC authentication through AI extraction, dual-mode search, real-time collaboration, approval workflows, and immutable audit — deployed across seven containerized services with documented non-functional requirements.
API response latency: < 200 ms p95 on synchronous endpoints.
// Layered Architecture
System Architecture
VyXlo is organized into three horizontal layers. The Client Layer covers any consumer of the platform API — the reference Next.js web application, partner-built UIs, mobile applications, and automated integration pipelines. All communication to the platform traverses HTTPS, with WebSocket upgrade for real-time presence channels and Server-Sent Events for token-streaming AI responses.
The API Layer is a single FastAPI application running on Python 3.12 under Uvicorn ASGI. It serves all 64 REST endpoints plus the WebSocket endpoint and SSE streaming responses. The Celery worker and Celery Beat scheduler run in separate containers built from the same codebase image — all three containers share one image, configured via environment variables to run different entry points. Flower provides a real-time Celery task monitoring dashboard on port 5555.
The Data Layer is composed of four independent services: PostgreSQL 16 with pgvector for all persistent data and both search modes; Redis 7 for session/permission caching and as the Celery task queue + result backend; MinIO for S3-compatible object storage with presigned URL generation; and ZITADEL as the external identity provider for OIDC/PKCE token issuance, SSO, MFA, and SAML/LDAP federation.
PostgreSQL 16 + pgvector
Primary data store for all persistent records. The pgvector extension adds HNSW vector indexes for semantic search alongside standard tsvector full-text indexes. Async access via asyncpg + SQLAlchemy 2.0.
Redis 7 + Celery
Dual role: in-memory cache for user sessions and permission lookups (reducing DB hits on hot paths), and message broker + result backend for the Celery task queue. Rate limit counters also stored in Redis via slowapi.
MinIO + ZITADEL
MinIO provides S3-compatible blob storage for all uploaded files with server-side encryption and 15-minute presigned URL generation. ZITADEL handles all identity operations — no credentials are stored in VyXlo's database.
// Identity Layer
ZITADEL OIDC / PKCE Authentication
VyXlo delegates all identity operations to ZITADEL, an enterprise-grade open-source identity platform. VyXlo stores no passwords — authentication uses the OAuth 2.0 Authorization Code flow with PKCE (Proof Key for Code Exchange), which is the secure-by-default standard for public clients including single-page applications and mobile apps.
ZITADEL supports multiple authentication mechanisms out of the box: username/password with TOTP or SMS MFA; social login via Google, GitHub, and Microsoft; passkeys and WebAuthn for passwordless authentication; enterprise SSO via SAML 2.0; and LDAP/Active Directory federation for organizations with existing identity infrastructure.
The OIDC discovery document at /.well-known/openid-configuration is fetched by the API server at startup to resolve signing keys. Every inbound request with a Bearer token is validated: signature checked, expiry verified, and organization_id extracted from the JWT claims. Token validation never touches ZITADEL on the hot path — it uses cached public keys with TTL refresh.
Connect your existing SAML 2.0 IdP (Okta, Azure AD, Ping) or LDAP/AD to ZITADEL. Users authenticate against their corporate credentials; ZITADEL issues OIDC tokens to VyXlo. Zero new passwords to manage.
Use ZITADEL's hosted login page or embed authentication using ZITADEL's component library. Supports MFA, passkeys, and social login with your custom branding.
For automated pipelines and backend-to-backend integrations, ZITADEL issues machine credentials using the OAuth 2.0 client credentials flow. No user interaction required.
Step 1: Token Acquisition
Client initiates PKCE flow with code_challenge. ZITADEL authenticates user, returns authorization code. Client exchanges code + verifier for access_token + id_token.
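The client-side PKCE material from Step 1 can be generated with nothing but the Python standard library. A minimal sketch — the function name is illustrative, but the S256 challenge derivation follows RFC 7636:

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for the S256 method."""
    # RFC 7636 requires 43-128 unreserved characters; token_urlsafe(32) yields 43.
    verifier = secrets.token_urlsafe(32)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # Base64url without padding, per RFC 7636 §4.2.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge
```

The client sends the challenge on the authorize request and the verifier on the token exchange; ZITADEL recomputes the hash server-side to bind the two.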
Step 2: API Call
Authorization: Bearer <JWT>
FastAPI middleware validates signature using ZITADEL's cached JWKS. Extracts sub, org_id, role from claims.
Step 3: Authorization
Four-layer check: valid JWT → org_id scope → minimum role level → resource permission (NONE→ADMIN). All layers enforced on every request.
// System Architecture
The Six-Stage Processing Pipeline
Every document uploaded to VyXlo traverses six discrete pipeline stages. Stages 01–03 are executed automatically and asynchronously by the Celery worker immediately after upload. Stages 04–06 represent ongoing operational capabilities available throughout the document's active life.
Ingestion
Two-step upload protocol: first POST /documents to create a metadata record and receive a document ID, then POST /documents/{id}/upload to stream raw file bytes directly to MinIO object storage. The API server never buffers file content in memory — the upload is streamed byte-for-byte to the MinIO S3-compatible bucket, eliminating memory pressure regardless of file size. Supported formats include PDF, DOCX, XLSX, PPTX, images, and plain text. Per-organization size limits are configurable in the admin dashboard. Every upload automatically creates a new document version, preserving the full version history forever — prior versions remain downloadable even after the document is updated.
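A client-side sketch of the two-step protocol, assuming a requests-style session — the helper names and base URL are illustrative; only the endpoint paths come from the docs above. The chunk generator is what keeps the file from ever being buffered whole:

```python
def iter_chunks(fileobj, chunk_size=1 << 20):
    """Yield fixed-size chunks so the file is streamed, never read in full."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk

def upload_document(session, base_url, path, title):
    """Hypothetical client: create the metadata record, then stream the bytes."""
    # Step 1: POST /documents returns the new document ID.
    doc = session.post(f"{base_url}/documents", json={"title": title}).json()
    # Step 2: a generator body is sent as a chunked request, mirroring the
    # server's byte-for-byte streaming into MinIO.
    with open(path, "rb") as f:
        session.post(f"{base_url}/documents/{doc['id']}/upload",
                     data=iter_chunks(f))
    return doc["id"]
```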
Extraction
Upload completion triggers an asynchronous Celery task that runs the complete 7-step AI pipeline without blocking the API response. Text is extracted using pdfplumber for PDF files (preserving layout and table structure) or python-docx for DOCX files. The extracted text is passed to LangChain with the configured LLM provider (OpenAI GPT-4, Anthropic Claude, Google Gemini, or Ollama for air-gapped deployments). The LLM classifies the document into one of the supported categories (Legal, Financial, Medical, Logistics, Engineering, HR, Procurement, Real Estate, Compliance, R&D) and returns a confidence score between 0 and 1. Summarization, keyword extraction (up to 20 terms), and named entity extraction (people, organizations, dates, locations) follow as separate LLM calls. All results are stored as JSON fields on the Document record: ai_classification, ai_confidence, ai_summary, ai_keywords, and ai_entities.
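The results of those LLM calls have to land in the JSON fields above in a predictable shape. A small normalization sketch — the function and its validation rules are assumptions, but the field names and limits (confidence in 0–1, up to 20 keywords, the fixed category list) come straight from the pipeline description:

```python
ALLOWED_CATEGORIES = {
    "Legal", "Financial", "Medical", "Logistics", "Engineering",
    "HR", "Procurement", "Real Estate", "Compliance", "R&D",
}

def normalize_ai_results(classification, confidence, summary, keywords, entities):
    """Illustrative guard applied before writing the Document JSON fields."""
    if classification not in ALLOWED_CATEGORIES:
        raise ValueError(f"unknown category: {classification}")
    return {
        "ai_classification": classification,
        # Clamp to the documented 0-1 range in case the model drifts.
        "ai_confidence": min(max(float(confidence), 0.0), 1.0),
        "ai_summary": summary,
        # The docs cap keyword extraction at 20 terms.
        "ai_keywords": keywords[:20],
        "ai_entities": entities,
    }
```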
Embedding
After extraction, the document text is split into overlapping chunks using tiktoken — the same tokenizer used by OpenAI models — to ensure chunks never exceed the model's context window. Each chunk is independently embedded into a 1536-dimension dense vector using the configured embedding model (text-embedding-3-small by default). Vectors are stored in PostgreSQL via the pgvector extension with an HNSW (Hierarchical Navigable Small World) index, which delivers approximate nearest-neighbor queries in sub-millisecond time even at millions of vectors. The chunk_index_status field transitions from PENDING → INDEXING → INDEXED or ERROR, allowing the frontend to show real-time progress. Once indexed, the document is queryable via semantic cosine-distance search — finding conceptually related content across the entire organization corpus without requiring exact keyword matches.
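The overlapping-window logic is independent of the tokenizer: given the token list produced by tiktoken's encoder, a sketch like the following yields chunks that share a configurable overlap. Window and overlap sizes here are illustrative, not the platform's actual settings:

```python
def chunk_tokens(tokens, window=512, overlap=64):
    """Split a token sequence into overlapping windows for embedding."""
    step = window - overlap
    chunks = []
    # max(..., 1) guarantees at least one chunk for short documents.
    for start in range(0, max(len(tokens) - overlap, 1), step):
        chunks.append(tokens[start:start + window])
    return chunks
```

Each resulting chunk is embedded independently, so the overlap region keeps sentences that straddle a boundary retrievable from either side.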
Interaction
Every indexed document supports two independent search modes operating in parallel. Full-text search uses PostgreSQL tsvector weighted indexes to execute ts_rank-scored keyword queries with folder scoping, status filtering, and document type filtering — sub-10ms at scale. Semantic search uses pgvector cosine-distance similarity against the document's chunk embeddings to find conceptually related content across all indexed documents in the organization. For document Q&A, POST /chat/document/{id} initiates a Retrieval-Augmented Generation (RAG) session: the user's question is embedded, the top-K most semantically relevant chunks are retrieved from pgvector, assembled into a context window, and passed to the LLM. The response is streamed back token-by-token via Server-Sent Events (SSE), so users watch the answer form in real time. The final done event in the SSE stream carries citations — the exact chunk IDs and document sections that informed the answer.
Workflow
Documents progress through structured approval chains before reaching PUBLISHED status. Workflows support sequential and parallel step configurations: sequential steps must complete one at a time in order, while parallel steps all activate simultaneously and require unanimous approval before the workflow advances. Each step can be assigned to an individual user or an entire department — in department-assigned steps, any member of the department with sufficient role level may approve. Approvers must provide a reason when rejecting a step, which is surfaced to the document owner along with email notification. Deadline tracking with SLA-based escalation ensures overdue approvals trigger automatic escalation to the assignee's manager. Full workflow cancellation is available to document owners and ADMINs at any time. Workflow state changes update the document status: PENDING_APPROVAL while active, APPROVED on completion, and back to IN_REVIEW on rejection.
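The advance rule — sequential steps in order, parallel groups requiring unanimous approval — reduces to a pair of pure functions. A sketch, with the data shapes assumed for illustration:

```python
def group_complete(steps):
    """A parallel group advances only when every step in it is approved."""
    return all(s["status"] == "APPROVED" for s in steps)

def next_pending(groups):
    """Groups run in order: return the first incomplete group, or None
    when the whole workflow has passed (document becomes APPROVED)."""
    for group in groups:
        if not group_complete(group):
            return group
    return None
```

A sequential step is just a group of one; a parallel configuration puts several steps in the same group, which is exactly the unanimity requirement described above.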
Audit
Every action that mutates or accesses protected resources generates an immutable audit record — there is no way to bypass this at the API level. The audit log captures: actor (user ID, email, role), action type (one of CREATE, UPDATE, DELETE, ACCESS, DOWNLOAD, PERMISSION_CHANGE, WORKFLOW, EXPORT), target resource (resource type + ID), timestamp (ISO 8601 with timezone), IP address, HTTP method and path, before/after JSON diff for UPDATE events, and the outcome (success or failure with error code). Audit records are append-only — once written, they cannot be edited or deleted by any role including SUPER_ADMIN. Administrators can query and export the full audit log filtered by date range, actor, event type, or resource. Retention policies per organization control how long audit records are kept before archival.
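The captured fields map naturally onto an immutable record type. An illustrative sketch — the class and field names are assumptions, with frozen=True mirroring the append-only guarantee at the object level (the real guarantee is enforced in the database and API layer):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AuditRecord:
    actor_id: str
    actor_email: str
    actor_role: str
    action: str          # CREATE | UPDATE | DELETE | ACCESS | DOWNLOAD | ...
    resource_type: str
    resource_id: str
    timestamp: str       # ISO 8601 with timezone
    ip: str
    method: str          # HTTP method
    path: str            # HTTP path
    outcome: str         # success, or failure with error code
```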
// State Machine
Document Lifecycle
Every document in VyXlo follows a strictly validated status lifecycle. State transitions are enforced server-side — invalid transitions are rejected with a descriptive 409 Conflict error. Every transition is captured in the immutable audit log with the actor, timestamp, and reason.
Documents begin in DRAFT state upon creation. The owner submits for review, transitioning to IN_REVIEW. When a workflow is attached and activated, the document advances to PENDING_APPROVAL. Successful completion of all approval steps transitions to APPROVED. The owner or an administrator may then publish the document — making it visible to all org members with READ permission or above — transitioning to PUBLISHED. Finally, documents reaching the end of their useful life are moved to ARCHIVED, where they remain readable but appear in a separate archive view and are subject to retention policy enforcement.
Rejection at any workflow step returns the document to IN_REVIEW with the rejection reason surfaced to the owner. Documents can be returned to DRAFT from IN_REVIEW or APPROVED states. Soft-delete is a separate status flag and does not affect the lifecycle state — soft-deleted documents retain their last lifecycle status and are recoverable by ADMINs.
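The transitions described above amount to a small server-side table. A sketch — the real API answers an invalid transition with 409 Conflict, while this illustration simply raises:

```python
# Allowed transitions, as implied by the lifecycle description above.
TRANSITIONS = {
    "DRAFT": {"IN_REVIEW"},
    "IN_REVIEW": {"PENDING_APPROVAL", "DRAFT"},
    "PENDING_APPROVAL": {"APPROVED", "IN_REVIEW"},   # rejection path
    "APPROVED": {"PUBLISHED", "DRAFT"},
    "PUBLISHED": {"ARCHIVED"},
    "ARCHIVED": set(),                                # terminal state
}

def transition(status: str, new_status: str) -> str:
    if new_status not in TRANSITIONS[status]:
        raise ValueError(f"invalid transition {status} -> {new_status}")
    return new_status
```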
Document created. AI pipeline runs. Owner editing permitted. Not visible to other users beyond those with explicit permission.
Sent for review. Read access granted to reviewers. Workflow may be attached and activated.
Active approval workflow. Steps are approved in sequence or in parallel groups, per the workflow configuration. Rejection returns to IN_REVIEW.
All workflow steps passed. Document ready for publication. Approvers notified.
Organization-visible. All users with org READ permission or above can discover and view.
End-of-life state. Readable but not editable. Retention policy enforced by Celery Beat scheduler.
// Live Collaboration
Real-Time Layer: WebSocket + SSE
WebSocket Presence Channel
When a user opens a document, the frontend upgrades to a WebSocket connection on WS /ws/documents/{id}. The server maintains a presence map for that document and broadcasts join/leave/cursor events to all connected clients in real time. Document locking is also mediated through the WebSocket: a client requests a write lock via the lock_request message; the server responds with lock_acquired or lock_denied, and broadcasts the lock state to all presence channel members so they see the "locked by X" indicator immediately.
# WebSocket message types
presence_join { user_id, name, avatar }
presence_leave { user_id }
cursor { user_id, position }
lock_request { user_id }
lock_acquired { user_id, expires_at }
lock_denied { user_id }
lock_released { user_id }
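Server-side, the presence map and lock arbitration reduce to a small amount of per-document state. A single-process sketch — the real server presumably coordinates lock expiry via Celery Beat and may span multiple workers; those details are assumptions here:

```python
class PresenceChannel:
    """Per-document presence and exclusive write-lock state (illustrative)."""

    def __init__(self):
        self.members = {}          # user_id -> display name
        self.lock_holder = None    # user_id currently holding the write lock

    def join(self, user_id, name):
        self.members[user_id] = name
        return {"type": "presence_join", "user_id": user_id, "name": name}

    def request_lock(self, user_id):
        # Granted if free, or re-granted to the current holder; denied otherwise.
        if self.lock_holder in (None, user_id):
            self.lock_holder = user_id
            return {"type": "lock_acquired", "user_id": user_id}
        return {"type": "lock_denied", "user_id": user_id}

    def leave(self, user_id):
        self.members.pop(user_id, None)
        if self.lock_holder == user_id:   # leaving releases any held lock
            self.lock_holder = None
        return {"type": "presence_leave", "user_id": user_id}
```

Every returned event would be broadcast to all connected clients, which is how the "locked by X" indicator appears immediately.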
SSE Token Streaming
Document Q&A uses Server-Sent Events to stream the LLM response token-by-token to the browser. The client sends a question to POST /chat/document/{id} which immediately opens an SSE stream. Tokens arrive as data: {token} events, giving the user real-time feedback as the AI formulates its answer. The final event type is done and carries the full citations array — the specific chunk IDs and source passages from the document corpus that informed the answer. Chat sessions persist server-side, enabling multi-turn document conversations.
# SSE stream events
data: "The Q4 revenue"
data: " grew by 18%"
data: " year-on-year"
event: done
data: {"citations": [{"chunk_id": 14}, {"chunk_id": 22}]}
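On the client, consuming this stream is a matter of splitting on blank lines and dispatching on the event field. A minimal parser sketch covering only the fields this endpoint emits — a production client should use a full SSE implementation:

```python
def parse_sse(lines):
    """Parse SSE lines into (event_type, data) pairs.

    Handles only the `event:` and `data:` fields; a blank line
    terminates each event, per the SSE wire format.
    """
    events, event_type, data = [], "message", []
    for line in lines + [""]:          # trailing "" flushes the last event
        if line.startswith("event:"):
            event_type = line[6:].strip()
        elif line.startswith("data:"):
            data.append(line[5:].strip())
        elif line == "" and data:
            events.append((event_type, "\n".join(data)))
            event_type, data = "message", []
    return events
```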
// Infrastructure
Seven-Container Deployment
The VyXlo platform ships as a Docker Compose stack with seven containers. Three containers share the same application image — api, celery_worker, and celery_beat — differentiated purely by their entry point command. The remaining four containers are off-the-shelf infrastructure images pinned to specific versions.
The entire stack launches with a single docker compose up -d command. Health checks on each container ensure the API server only starts after PostgreSQL and Redis are ready. Database migrations run automatically via alembic upgrade head in the API container entrypoint.
The Docker Compose specification is Kubernetes-ready — the service definitions translate directly to Kubernetes Deployments, with MinIO replaceable by any S3-compatible service (AWS S3, GCS, Azure Blob) and ZITADEL deployable on the same cluster or consumed as ZITADEL Cloud SaaS. Horizontal scaling of the API and Celery worker containers is supported; PostgreSQL and Redis use standard cloud-managed variants in production deployments.
| Container | Port | Role |
|---|---|---|
| api | 8000 | FastAPI application server — REST, WebSocket, SSE |
| celery_worker | — | AI pipeline, email dispatch, file cleanup tasks |
| celery_beat | — | Scheduled tasks: lock expiry, link expiry, digests |
| flower | 5555 | Celery task monitoring and management dashboard |
| postgres | 5432 | Primary data store with vector extension |
| redis | 6379 | Session cache, task queue, rate limit counters |
| minio | 9000/9001 | S3-compatible object storage + admin console |
Key Environment Variables
| Variable | Example | Description |
|---|---|---|
| DATABASE_URL | postgresql+asyncpg://user:pass@postgres:5432/vyxlo | Async SQLAlchemy connection string |
| REDIS_URL | redis://redis:6379/0 | Redis connection for cache and task broker |
| MINIO_ENDPOINT | minio:9000 | MinIO server host:port |
| MINIO_ACCESS_KEY | minioadmin | MinIO access key |
| MINIO_SECRET_KEY | •••••••• | MinIO secret key |
| MINIO_BUCKET | vyxlo-documents | Bucket name for all uploaded files |
| ZITADEL_DOMAIN | auth.example.com | ZITADEL issuer domain for OIDC discovery |
| ZITADEL_CLIENT_ID | 123456789@vyxlo | OAuth 2.0 client ID registered in ZITADEL |
| OPENAI_API_KEY | sk-… | OpenAI key for classification + embeddings |
| ANTHROPIC_API_KEY | sk-ant-… | Anthropic key for summarization + Q&A |
| LLM_PROVIDER | openai | Active LLM backend: openai / anthropic / gemini / ollama |
| CELERY_BROKER_URL | redis://redis:6379/1 | Celery task broker (Redis DB 1) |
| SECRET_KEY | •••••••• | Application secret for token signing |
| CORS_ORIGINS | https://app.example.com | Comma-separated allowed CORS origins |
| MAX_UPLOAD_SIZE_MB | 100 | Per-upload size cap (org-level override available) |
| PRESIGNED_URL_EXPIRY_S | 900 | MinIO presigned download URL TTL in seconds |
// Search Architecture
Dual-Mode Search Engine
Full-Text Search (tsvector)
PostgreSQL's native full-text search using GIN-indexed tsvector columns. Supports ts_rank-scored keyword queries with phrase matching, prefix matching, and stemming. Queries can be scoped by folder, filtered by document status, filtered by document type, and sorted by relevance or date. Results return document metadata, AI summary, and matched keyword highlight snippets. Typical latency under 10ms on a corpus of 100,000+ documents.
GET /documents/search?q=annual+report&status=PUBLISHED
&folder_id=5&doc_type=FINANCIAL
Semantic Search (pgvector)
The user's query is embedded into a 1536-dimension vector using the same embedding model used during document indexing. pgvector computes cosine similarity between the query vector and all indexed document chunk vectors using the HNSW approximate nearest-neighbor index. Results surface conceptually related documents even when no exact keywords are shared — finding "revenue projections" when the user queries "sales forecast", for example. The API returns the top-K results sorted by cosine similarity score.
POST /documents/semantic-search
{ "query": "quarterly sales forecast", "top_k": 10 }
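For intuition, the score can be written out in plain Python — pgvector's <=> operator returns cosine distance (1 − similarity), and the HNSW index merely makes the nearest-neighbor lookup approximate and fast:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction,
    0.0 = orthogonal (unrelated content in embedding space)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

The top-K results are simply the chunk vectors with the smallest cosine distance to the 1536-dimension query vector.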
// Performance Characteristics
Non-Functional Requirements
VyXlo publishes documented non-functional targets — not aspirational marketing claims, but verified benchmarks with measurement methodology. The API p95 latency target of < 200ms applies to all synchronous endpoints. AI processing tasks run asynchronously via Celery and complete the full 7-step pipeline (extraction → classification → summarization → keywords → entities → embedding → chunk indexing) in under five minutes for documents up to 100 pages.
The test suite covers 327+ test cases including unit tests for service layer functions, integration tests against a real PostgreSQL + Redis + MinIO stack, and end-to-end API tests. Line coverage is maintained at or above 72%. Prometheus metrics export request latency histograms, error rates, Celery task throughput, and queue depth — compatible with Grafana dashboards and any alerting platform that supports the Prometheus scrape protocol.
| Metric | Target | Scope |
|---|---|---|
| API p95 response time | < 200 ms | Non-AI synchronous endpoints |
| AI pipeline (async) | < 5 min | Celery task from upload to INDEXED |
| SSE first token latency | < 2 s | RAG Q&A first streamed token |
| Search query latency | < 50 ms | tsvector + pgvector combined |
| Test suite | 327+ cases | Unit + integration + E2E |
| Line coverage | ≥ 72% | Backend Python (measured via pytest-cov) |
| Presigned URL TTL | 15 min (900 s) | MinIO download token expiry |
| Document lock expiry | 1 hour | Exclusive write-lock auto-release |
// Partner Integration
Integration Patterns
API-First Integration
Consume the full 64-endpoint REST API from any language. All endpoints are documented with OpenAPI 3.1 at /docs (ReDoc) and /openapi.json. Authenticate using ZITADEL service accounts for backend integrations, or PKCE flows for user-facing applications. SDKs for Python and TypeScript are available.
Embedded Document Layer
Use VyXlo as the document intelligence backend for your existing platform. Upload documents via API, consume AI extraction results, and render document search and Q&A within your own UI. White-label friendly — org-level branding, custom domains, and CORS configuration per deployment.
Event-Driven Pipelines
Build automated document pipelines: ingest from external sources, trigger classification via API, read AI results, route to approval workflows, and archive on completion. The Celery task queue and Flower monitoring dashboard give full visibility into async pipeline health.
See it in action.
Deploy in minutes.
Full Docker Compose stack · ZITADEL OIDC included · OpenAPI docs at /docs