Every feature built for zero-error environments.
VyXlo CSP is not a folder system with AI bolted on. Every capability was designed from first principles to handle institutional-grade document workflows — with a 7-step AI processing pipeline, an 8-level permission model applied independently per resource, immutable audit trails, and a 64-endpoint REST API that exposes every operation programmatically.
AI Extraction Suite
Every document uploaded to VyXlo runs through a seven-step asynchronous AI processing pipeline executed by Celery workers. The pipeline is triggered automatically on upload (configurable via AI_PROCESS_ON_UPLOAD) or on-demand via the API. Classification and summarization complete within five minutes of upload, after which AI fields are permanently stored on the document object.
The AI layer is provider-agnostic. VyXlo routes through LangChain, allowing you to swap between OpenAI (GPT-4 + Embeddings), Anthropic (Claude), Google Gemini, or fully local Ollama models for air-gapped deployments — with no code changes required. Toggle the feature per organization via the ENABLE_AI_FEATURES flag.
Document Management
VyXlo treats every document as a first-class entity with a controlled lifecycle, full version history, and comprehensive metadata model. Documents are never destroyed silently — soft-delete architecture ensures every file remains recoverable by an administrator regardless of user action.
Every document carries: title, description, document type, department, folder assignment, custom JSONB key-value pairs, tags, language, page count, word count, and a SHA-256 checksum stored at upload time. The checksum is validated on subsequent downloads to detect file corruption or tampering at rest. Usage analytics — view count, download count, and comment count — are tracked automatically and returned in every document API response.
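As a rough illustration of the checksum step, the sketch below hashes a file in chunks and compares the result with a stored value. The function names and the chunk size are assumptions made for this example and are not taken from VyXlo's code.

```python
import hashlib
import hmac
import io
from typing import BinaryIO


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 64 * 1024) -> str:
    """Hash a file-like object in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_stored_checksum(stream: BinaryIO, stored_hex: str) -> bool:
    """Return True when the bytes still hash to the checksum recorded at upload."""
    # compare_digest performs a constant-time comparison of the two hex strings
    return hmac.compare_digest(sha256_of_stream(stream), stored_hex.lower())


# Hypothetical upload: record the checksum once, verify it later
original = b"quarterly report"
stored = sha256_of_stream(io.BytesIO(original))
```

Any change to the stored bytes, whether from corruption or tampering, produces a different digest, so the later comparison fails.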
Documents move through a validated status lifecycle: DRAFT → IN_REVIEW → PENDING_APPROVAL → APPROVED → PUBLISHED → ARCHIVED. Every status transition is validated server-side and rejected with a descriptive error if invalid. Each change generates an immutable audit record. Administrators can also set per-document retention dates, enforced automatically by a Celery Beat scheduled task.
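Server-side transition validation of this kind might look like the minimal sketch below. It assumes that only a move to the next status in the lifecycle is valid; the real rule set (for example, whether a document can return to DRAFT) is not specified here, and the class and function names are hypothetical.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "DRAFT"
    IN_REVIEW = "IN_REVIEW"
    PENDING_APPROVAL = "PENDING_APPROVAL"
    APPROVED = "APPROVED"
    PUBLISHED = "PUBLISHED"
    ARCHIVED = "ARCHIVED"


# Enum members iterate in definition order, which matches the lifecycle above
LIFECYCLE = list(Status)


class InvalidTransition(ValueError):
    """Raised with a descriptive message when a status change is not allowed."""


def validate_transition(current: Status, target: Status) -> None:
    # ASSUMPTION: only the immediately following status is a valid target
    position = LIFECYCLE.index(current)
    allowed = LIFECYCLE[position + 1 : position + 2]  # empty for ARCHIVED
    if target not in allowed:
        expected = allowed[0].value if allowed else "none (terminal status)"
        raise InvalidTransition(
            f"Cannot move from {current.value} to {target.value}; "
            f"expected next status: {expected}"
        )
```

Calling validate_transition(Status.DRAFT, Status.PUBLISHED) raises InvalidTransition with a message naming both statuses, which an API layer could return as the error detail.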
Version History
Every upload to an existing document creates a new numbered version. All prior versions remain downloadable indefinitely — versions are never deleted. Version metadata (size, checksum, upload timestamp) is tracked per version.
Document Locking
Any user with edit permission can claim an exclusive write-lock via POST /documents/{id}/lock. The lock prevents concurrent edits by returning 409 Conflict to others. Locks auto-expire after one hour; administrators can release any lock manually.
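The locking rules above can be sketched with an in-memory table. This is an illustration only: the class names are hypothetical, a real deployment would keep lock state in shared storage, and letting the current holder refresh its own lock is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

LOCK_TTL = timedelta(hours=1)  # locks auto-expire after one hour


@dataclass
class Lock:
    user_id: int
    expires_at: datetime


class LockConflict(Exception):
    status_code = 409  # would be surfaced to other users as 409 Conflict


class LockTable:
    def __init__(self) -> None:
        self._locks: Dict[int, Lock] = {}

    def acquire(self, document_id: int, user_id: int,
                now: Optional[datetime] = None) -> Lock:
        now = now or datetime.now(timezone.utc)
        held = self._locks.get(document_id)
        # An unexpired lock held by someone else blocks the request
        if held and held.expires_at > now and held.user_id != user_id:
            raise LockConflict(f"document {document_id} is locked by user {held.user_id}")
        # ASSUMPTION: re-acquiring your own lock simply refreshes the expiry
        lock = Lock(user_id=user_id, expires_at=now + LOCK_TTL)
        self._locks[document_id] = lock
        return lock

    def release(self, document_id: int, user_id: int, is_admin: bool = False) -> bool:
        held = self._locks.get(document_id)
        if held is None:
            return False
        if held.user_id != user_id and not is_admin:
            raise PermissionError("only the lock holder or an administrator may release")
        del self._locks[document_id]
        return True
```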
Soft Delete & Recovery
DELETE /documents/{id} performs a soft delete — the document becomes invisible in listings and search results but remains in the database. Administrators can recover any soft-deleted document. No data is permanently destroyed without an explicit, intentional ADMIN action.
Retention Policies
Administrators set per-document retention_until dates via the API. A Celery Beat scheduled task enforces expiry automatically — when a document passes its retention date, it is flagged accordingly and no longer served in standard queries.
Folder Hierarchy
Arbitrary-depth folder trees scoped per organization, stored using materialized path notation (/1/5/12/) for efficient subtree queries. Move operations include cycle detection. Deleting a folder cascades soft-deletes to all children and their documents.
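To show why materialized paths make subtree queries and cycle checks cheap, here is a minimal sketch that works on path strings alone. The helper names are hypothetical; in a database the subtree filter would typically be a prefix match on the path column.

```python
from typing import Iterable, List


def child_path(parent_path: str, folder_id: int) -> str:
    """Build a child's path, e.g. '/1/5/' + 12 -> '/1/5/12/'."""
    return f"{parent_path}{folder_id}/"


def subtree(paths: Iterable[str], root_path: str) -> List[str]:
    """Return the root and every descendant using a plain prefix match."""
    # The trailing slash keeps '/1/5/' from matching '/1/50/'
    return [p for p in paths if p.startswith(root_path)]


def would_create_cycle(moving_path: str, new_parent_path: str) -> bool:
    """A move is cyclic when the target parent is the folder itself or one of its descendants."""
    return new_parent_path.startswith(moving_path)


all_paths = ["/1/", "/1/5/", "/1/5/12/", "/1/50/", "/2/"]
```

With these paths, subtree(all_paths, "/1/5/") selects "/1/5/" and "/1/5/12/" but not "/1/50/", and moving "/1/5/" under "/1/5/12/" is rejected as a cycle.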
Rich Metadata & Tags
Documents support custom JSONB key-value metadata for any domain-specific fields beyond the standard schema. A tag library (per-organization, with autocomplete API) allows multi-tag assignment and removal in a single operation. Filter by tag in both full-text and semantic search.
Starring & Bookmarks
Users can star any accessible document. The GET /documents/starred endpoint returns a paginated list of the current user's starred documents. Starred status is included in the document object response (is_starred field) for UI state management.
Share Links
Generate cryptographically signed, time-limited share tokens for external access. Each link supports optional password protection and an optional email restriction (only the specified address may use the link), and access analytics are recorded per link. No user account is required to access a share link (configurable per link). Token validation always queries the database directly and is never served from cache.
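One common way to build a signed, time-limited token is an HMAC over a small payload, sketched below. The token layout, field names, and secret are assumptions for illustration and are not VyXlo's actual format. The sketch covers only signature, expiry, and email checks; revocation and per-link analytics would still need the database lookup described above.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"replace-with-a-real-secret"  # hypothetical signing key


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(body: str) -> str:
    return _b64(hmac.new(SECRET, body.encode(), hashlib.sha256).digest())


def issue_token(document_id: int, ttl_seconds: int,
                now: Optional[float] = None, email: Optional[str] = None) -> str:
    issued = time.time() if now is None else now
    payload = {"doc": document_id, "exp": int(issued) + ttl_seconds, "email": email}
    body = _b64(json.dumps(payload, sort_keys=True).encode())
    return f"{body}.{_sign(body)}"


def verify_token(token: str, now: Optional[float] = None,
                 email: Optional[str] = None) -> Optional[dict]:
    """Return the payload when the token is authentic, unexpired, and email-matched."""
    try:
        body, signature = token.split(".")
    except ValueError:
        return None
    if not hmac.compare_digest(signature.encode(), _sign(body).encode()):
        return None  # signature mismatch: token was altered or forged
    payload = json.loads(_unb64(body))
    current = time.time() if now is None else now
    if current >= payload["exp"]:
        return None  # link has expired
    if payload["email"] is not None and payload["email"] != email:
        return None  # link is restricted to a different address
    return payload
```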
Dual Search Architecture
Full-Text Search
Powered by a PostgreSQL tsvector index. Full-text queries match on document title, description, extracted AI keywords, and body text simultaneously. Searches support folder scoping (restrict search to a subtree), status filtering (search only APPROVED documents), and document type filtering (show only FINANCIAL or LEGAL documents).
Results are ranked by relevance using PostgreSQL's built-in ts_rank function. Responses are paginated with the standard page and size parameters. Every result includes the full document object with AI fields, so no secondary fetch is required to display classification or summary in search results.
Semantic Vector Search
Powered by pgvector with HNSW (Hierarchical Navigable Small World) indexing. When a user submits a semantic search query, it is first embedded using the same model that processed documents — producing a 1536-dimension vector. A cosine-distance query against all document embeddings in the organization returns conceptually related results even when no exact keyword matches exist.
This means a query for "financial performance last quarter" can surface a document titled "Q4 Board Review" that never uses those exact words. The HNSW index provides approximate nearest neighbor retrieval with high recall at sub-linear query time. The recall/latency trade-off can be tuned at query time via pgvector's hnsw.ef_search setting: higher values improve recall at the cost of slower queries.
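The ranking idea behind semantic search can be shown with a small exact (brute-force) version of the cosine-distance query. The three-dimensional vectors and document names below are toy values for illustration; real embeddings have far more dimensions, and the HNSW index approximates this ordering rather than scanning every row.

```python
import math
from typing import Dict, List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity: 0 for identical direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def nearest(query: Sequence[float], embeddings: Dict[str, Sequence[float]],
            k: int = 3) -> List[str]:
    """Return the k document keys whose embeddings are closest to the query."""
    return sorted(embeddings, key=lambda doc: cosine_distance(query, embeddings[doc]))[:k]


# Toy embeddings (hypothetical values)
docs = {
    "q4_board_review": [0.9, 0.1, 0.0],
    "office_party": [0.0, 0.2, 0.9],
    "annual_budget": [0.7, 0.6, 0.1],
}
query = [1.0, 0.2, 0.0]
top_two = nearest(query, docs, k=2)
```

Here the query vector points in nearly the same direction as the "q4_board_review" embedding, so that document ranks first even though no keywords are compared.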
Choosing between modes: Full-text search is ideal for exact document retrieval (find the specific "Q4 2025 Financials" file), compliance audit queries (find all APPROVED documents in the Legal folder), and structured filtering workflows. Semantic search is ideal for knowledge discovery (what documents discuss supplier risk mitigation?), research across large document collections, and natural language queries from non-technical users.
Both search endpoints are paginated and both return the full document object, including AI classification, summary, and tag assignments, so search results can be rendered with full context without additional API calls.
Approval Workflow Engine
VyXlo's workflow engine transforms unstructured document review into a deterministic, auditable approval chain. Every step generates an immutable audit record so there is a complete, tamper-proof trail of who approved or rejected what, when, and why — meeting the evidentiary requirements of regulated industries.
Workflows are created against one or more documents simultaneously. Once created, each document's status transitions from its current state toward PENDING_APPROVAL. The workflow object exposes all steps with their current status: PENDING, APPROVED, or REJECTED. When all steps in a workflow are APPROVED, the document is automatically promoted to APPROVED status.
Assignees can be individual users or entire departments. When a department is assigned, all current and future members of that department can action the workflow step — making the workflow resilient to personnel changes. Reject actions require a mandatory written reason, creating a documented rationale that satisfies audit requirements. Any workflow can be fully cancelled via the DELETE endpoint, reverting the document to its prior status.
Sequential Chains
Steps execute one after another. Step 2 does not become active until Step 1 is approved. Ideal for hierarchical approval processes: reviewer → manager → legal → executive sign-off.
Parallel Nodes
Multiple steps become active simultaneously. All must resolve before the workflow advances. Ideal when multiple departments must independently approve before a document is published.
Per-Step Assignees
Each step can be assigned to a specific user or an entire department. Department assignments are dynamic — any member of the department can action the step regardless of when they joined.
Approve with Comment
Approvers can attach an optional comment to their approval decision. The comment is stored on the workflow step object for audit purposes and returned in workflow API responses.
Reject with Reason
Rejection requires a mandatory reason string. This creates a documented rationale for every rejection, providing the audit trail required for compliance in regulated document management.
Deadline Tracking
Each step tracks when it was created and when it was decided. Overdue steps are identifiable programmatically. Auto-escalation can be triggered by monitoring step age via the Celery Beat scheduler.
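The step rules described above (sequential chains, parallel nodes, mandatory rejection reasons) can be sketched as follows. Modelling parallel steps as steps that share a stage number is an assumption made for this example, as are all class and function names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class StepStatus(Enum):
    PENDING = "PENDING"
    APPROVED = "APPROVED"
    REJECTED = "REJECTED"


@dataclass
class Step:
    name: str
    stage: int  # ASSUMPTION: steps sharing a stage number are parallel nodes
    status: StepStatus = StepStatus.PENDING
    note: Optional[str] = None


def active_steps(steps: List[Step]) -> List[Step]:
    """Pending steps in the earliest unfinished stage; none once any step is rejected."""
    if any(s.status is StepStatus.REJECTED for s in steps):
        return []
    pending = [s for s in steps if s.status is StepStatus.PENDING]
    if not pending:
        return []
    current_stage = min(s.stage for s in pending)
    return [s for s in pending if s.stage == current_stage]


def approve(step: Step, comment: Optional[str] = None) -> None:
    step.status, step.note = StepStatus.APPROVED, comment  # comment is optional


def reject(step: Step, reason: str) -> None:
    if not reason or not reason.strip():
        raise ValueError("a rejection reason is required")
    step.status, step.note = StepStatus.REJECTED, reason


def all_approved(steps: List[Step]) -> bool:
    """True when the document could be promoted to APPROVED."""
    return bool(steps) and all(s.status is StepStatus.APPROVED for s in steps)
```

For a chain such as reviewer (stage 1), finance and legal (both stage 2), and executive (stage 3), only the reviewer step is active at first; finance and legal become active together once it is approved.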
Real-Time Collaboration
WebSocket — Live Presence & Locking
Each document has a dedicated WebSocket channel at ws://host/api/v1/ws/{document_id}?token=<jwt>. The connection is authenticated with the same OIDC JWT used for REST calls — no separate session management required. The server broadcasts typed events to all connected clients: presence_join when a user opens the document, presence_leave when they close it, cursor position updates for collaborative editing awareness, lock_acquired when any user claims the write-lock, and lock_released when it is freed. This means every connected client always has a live, accurate view of who is present and whether the document is editable — without polling.
Comments — Threaded & Resolved
Comments are attached directly to documents and support unlimited reply nesting. Any comment thread can be resolved by the document owner or an administrator, marking it as closed and visually collapsing it for other reviewers. Emoji reactions can be added and removed on any comment. Comment authors can edit their own comments; administrators can delete any comment. The comment list endpoint returns threaded structure (replies nested under their parent) so frontend rendering requires no secondary queries. Each comment and reply is part of the immutable audit record.
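A threaded response of this shape can be assembled from flat rows in a single pass, as the sketch below shows. The field names (id, parent_id, body, replies) are assumptions for illustration, not the documented response schema.

```python
from typing import Dict, List, Optional, TypedDict


class CommentRow(TypedDict):
    id: int
    parent_id: Optional[int]
    body: str


def build_threads(rows: List[CommentRow]) -> List[dict]:
    """Nest each reply under its parent; top-level comments become roots."""
    nodes: Dict[int, dict] = {row["id"]: {**row, "replies": []} for row in rows}
    roots: List[dict] = []
    for row in rows:
        node = nodes[row["id"]]
        parent = nodes.get(row["parent_id"])  # None for top-level comments
        if parent is None:
            roots.append(node)
        else:
            parent["replies"].append(node)
    return roots


rows: List[CommentRow] = [
    {"id": 1, "parent_id": None, "body": "Section 3 needs a source."},
    {"id": 2, "parent_id": 1, "body": "Added the citation."},
    {"id": 3, "parent_id": 2, "body": "Thanks, resolving."},
    {"id": 4, "parent_id": None, "body": "Typo on page 2."},
]
threads = build_threads(rows)
```

Because every node is looked up by id, nesting depth is unbounded and the input order of replies does not matter.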
Notifications — In-App & Email
Event-driven notifications are generated automatically for: new comments on documents you own or follow, workflow step assignments, approval and rejection decisions on workflows you created, share link accesses on your documents, and @mentions in comments. The notification center provides an unread count badge endpoint (GET /notifications/unread-count) ideal for real-time badge updates in navigation UI. Individual notifications can be marked as read. Email delivery and daily digest configuration are toggleable per organization.
Document Q&A — RAG via SSE
Once a document is chunk-indexed, users can have a conversational Q&A session against its content via POST /chat/document/{id}. The endpoint uses Server-Sent Events to stream the LLM response token-by-token, enabling progressive rendering in the UI without waiting for the full response. Sessions are persisted — users can resume prior conversations by passing the session_id. The final done event includes source citations pointing to the exact document chunks used to generate the answer.
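On the client side, consuming such a stream comes down to parsing Server-Sent Events frames. The sketch below parses the standard event: and data: fields from lines of text. The "token" event name and the JSON payload shapes are assumptions for illustration; only the final done event carrying citations is described above.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs; a blank line terminates each event."""
    event, data = "message", []  # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream as it might arrive over the wire
sample = [
    "event: token\n", 'data: {"text": "Revenue "}\n', "\n",
    ": keep-alive\n",
    "event: token\n", 'data: {"text": "rose 4%."}\n', "\n",
    "event: done\n", 'data: {"citations": [{"chunk_id": 7}]}\n', "\n",
]

answer_parts: List[str] = []
citations: List[dict] = []
for name, payload in parse_sse(sample):
    body = json.loads(payload)
    if name == "token":
        answer_parts.append(body["text"])  # render progressively in a real UI
    elif name == "done":
        citations = body["citations"]
answer = "".join(answer_parts)
```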
Security & Compliance
VyXlo stores no passwords. All identity is delegated to ZITADEL — an enterprise-grade open-source identity platform — using the OAuth 2.0 Authorization Code flow with PKCE. ZITADEL handles MFA (TOTP and SMS), passkeys, WebAuthn, social login, enterprise SSO via SAML 2.0, and LDAP/Active Directory federation. VyXlo validates OIDC JWTs against the ZITADEL JWKS endpoint on every request.
Authorization is layered. Every API endpoint enforces (1) valid authentication, (2) organization isolation — all queries are scoped to the authenticated user's organization and this filter is never optional, (3) role-based minimum role requirements per endpoint, and (4) resource-level permission checks before any data is returned or mutated.
The permission model has eight levels — NONE, READ, DOWNLOAD, COMMENT, CONTRIBUTOR, WRITE, EDITOR, ADMIN — applied independently per document and per folder. Permissions can target individual users or entire departments (all current and future members inherit). Optional expiry dates enable time-bounded access grants. Permission cache in Redis is invalidated immediately on any mutation.
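A resource-level check against these eight levels might look like the sketch below. It assumes the levels are cumulative in the order listed and that the highest unexpired grant wins; neither rule is stated explicitly here, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import IntEnum
from typing import Iterable, Optional


class Permission(IntEnum):
    # ASSUMPTION: each level includes everything the levels below it allow
    NONE = 0
    READ = 1
    DOWNLOAD = 2
    COMMENT = 3
    CONTRIBUTOR = 4
    WRITE = 5
    EDITOR = 6
    ADMIN = 7


@dataclass(frozen=True)
class Grant:
    level: Permission
    expires_at: Optional[datetime] = None  # None means the grant never expires


def effective_level(grants: Iterable[Grant], now: datetime) -> Permission:
    """Highest level among grants that have not expired (user and department alike)."""
    live = [g.level for g in grants if g.expires_at is None or g.expires_at > now]
    return max(live, default=Permission.NONE)


def is_allowed(grants: Iterable[Grant], required: Permission, now: datetime) -> bool:
    return effective_level(grants, now) >= required


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
grants = [
    Grant(Permission.READ),                                    # via department
    Grant(Permission.WRITE, expires_at=now + timedelta(days=7)),  # time-bounded
]
```

With these grants the user can write today but falls back to READ once the time-bounded grant expires.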
ZITADEL OIDC / PKCE
Zero passwords in VyXlo. MFA, passkeys, SAML, LDAP supported. JWT validated against ZITADEL JWKS endpoint on every request.
Multi-Tenant Isolation
Shared DB with org_id enforced at ORM level. Cross-org data leakage is architecturally impossible — the filter is embedded in every service layer query.
8-Level Permissions
NONE / READ / DOWNLOAD / COMMENT / CONTRIBUTOR / WRITE / EDITOR / ADMIN applied per document and per folder independently. Grants to users or departments with optional expiry.
7-Level Role Hierarchy
SUPER_ADMIN(100) → ADMIN(80) → MANAGER(60) → EDITOR(40) → USER(20) → VIEWER(10) → GUEST(5). Hierarchical — each role inherits capabilities of all roles below it.
Immutable Audit Trail
8 event types (CREATE, UPDATE, DELETE, ACCESS, DOWNLOAD, PERMISSION_CHANGE, WORKFLOW, EXPORT) with actor, resource, timestamp, IP address, and before/after diff.
MinIO Presigned Downloads
File bytes are never proxied through the API server. Downloads return presigned MinIO URLs that are valid for 15 minutes. SHA-256 checksums detect corruption or tampering.
Soft-Delete Architecture
No data permanently destroyed without explicit ADMIN action. All deletes are soft. Retention dates enforced by Celery Beat scheduler. Data recoverable by administrators.
Rate Limiting & CORS
Per-endpoint rate limits via slowapi. Strictest limits on authentication endpoints. CORS policy enforced with configurable allowed origins via ALLOWED_ORIGINS env var.
REST API — 64 Endpoints
The VyXlo API exposes 64 REST endpoints across 18 resource groups, covering every capability available in the product. Every operation that can be performed in the UI can be performed via API — enabling full automation, custom frontends, mobile applications, and system integrations without any undocumented surface area.
All endpoints are prefixed with /api/v1/ and return application/json. Pagination uses page (1-based) and size query parameters, returning a consistent envelope: { items, total, page, page_size, pages }. All timestamps are ISO 8601 UTC. All IDs are 64-bit integers. Errors return { "detail": "..." } with the appropriate HTTP status code.
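The pagination envelope can be illustrated with a small helper that slices an in-memory list. This is a sketch of the response shape only; the function name is hypothetical, and a real endpoint would apply the offset and limit in the database query.

```python
import math
from typing import Any, Dict, List, Sequence


def paginate(rows: Sequence[Any], page: int = 1, size: int = 20) -> Dict[str, Any]:
    """Build the { items, total, page, page_size, pages } envelope for one page."""
    if page < 1 or size < 1:
        raise ValueError("page and size must both be at least 1")
    total = len(rows)
    start = (page - 1) * size  # page is 1-based
    items: List[Any] = list(rows[start:start + size])
    return {
        "items": items,
        "total": total,
        "page": page,
        "page_size": size,
        "pages": math.ceil(total / size),
    }


envelope = paginate(list(range(45)), page=3, size=20)
```

For 45 rows and a size of 20, page 3 holds the last five items and pages is 3; requesting a page past the end returns an empty items list with the same totals.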
Interactive Swagger UI is available at /api/docs on any deployment. ReDoc documentation at /api/redoc. The full OpenAPI 3.0 JSON schema is available at /api/openapi.json for client generation in any language.
Built on production-proven infrastructure.
Ready to deploy?
The full stack runs with a single docker compose up -d. All 64 API endpoints, the complete AI pipeline, and real-time collaboration are available from day one.