Features & Capabilities
Technical reference for the VyXlo CSP document lifecycle, permission architecture, role hierarchy, AI processing pipeline, workflow engine, and real-time collaboration layer.
Document Lifecycle
Every document follows a controlled status progression. Transitions are validated server-side — invalid moves are rejected with a descriptive error. Every status change generates an immutable audit record.
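Server-side transition validation of this kind can be sketched as a lookup against an allowed-transition graph. This is a minimal illustration, not the platform's implementation: the status names, the transition graph, and the names `Status`, `ALLOWED`, `InvalidTransition`, and `transition` are all hypothetical, since this reference does not list the actual statuses.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical status names; the real set is not listed in this reference.
    DRAFT = "DRAFT"
    IN_REVIEW = "IN_REVIEW"
    APPROVED = "APPROVED"
    ARCHIVED = "ARCHIVED"


# Assumed transition graph: each status maps to the statuses it may move to.
ALLOWED = {
    Status.DRAFT: {Status.IN_REVIEW},
    Status.IN_REVIEW: {Status.DRAFT, Status.APPROVED},
    Status.APPROVED: {Status.ARCHIVED},
    Status.ARCHIVED: set(),
}


class InvalidTransition(ValueError):
    """Raised when a requested status change is not in the transition graph."""


def transition(current: Status, target: Status, audit_log: list) -> Status:
    """Validate a status change and append an audit entry on success."""
    if target not in ALLOWED[current]:
        raise InvalidTransition(
            f"Cannot move from {current.value} to {target.value}"
        )
    audit_log.append(
        {"action": "STATUS_CHANGE", "from": current.value, "to": target.value}
    )
    return target
```

A rejected move raises before anything is written, so the audit log only ever contains transitions that actually happened.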
Version History
Every upload creates a new version. Prior versions are always downloadable and never deleted.
Document Locking
An exclusive write lock prevents concurrent edits. The lock auto-expires after 1 hour. A lock request on a document that is already locked returns 409 Conflict.
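The locking behavior can be illustrated with a small in-memory registry. This is a sketch under assumptions: `LockTable` and `LockConflict` are hypothetical names, `LockConflict` stands in for the HTTP 409 response, and a real deployment would keep locks in shared storage rather than a Python dictionary.

```python
from datetime import datetime, timedelta, timezone

LOCK_TTL = timedelta(hours=1)  # auto-expiry stated in this reference


class LockConflict(Exception):
    """Hypothetical stand-in for an HTTP 409 Conflict response."""

    status_code = 409


class LockTable:
    """In-memory lock registry, for illustration only."""

    def __init__(self):
        self._locks = {}  # document_id -> (user_id, expires_at)

    def acquire(self, document_id, user_id, now=None):
        """Take the lock, or raise LockConflict if an unexpired lock exists."""
        now = now or datetime.now(timezone.utc)
        holder = self._locks.get(document_id)
        if holder is not None and holder[1] > now:
            raise LockConflict(f"Document {document_id} is locked by {holder[0]}")
        expires_at = now + LOCK_TTL
        self._locks[document_id] = (user_id, expires_at)
        return expires_at

    def release(self, document_id, user_id):
        """Drop the lock if the caller is the current holder."""
        holder = self._locks.get(document_id)
        if holder is not None and holder[0] == user_id:
            del self._locks[document_id]
```

Expired locks are not swept proactively here; they are simply ignored on the next `acquire`, which is enough to model the 1-hour expiry.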
Soft Delete
Deleted documents are invisible in listings but recoverable by an ADMIN. No data is permanently destroyed without explicit ADMIN action.
Retention Policies
Per-document retention dates are set by an ADMIN. Expiry is enforced automatically by Celery Beat scheduled tasks.
SHA-256 Checksums
A SHA-256 checksum is stored on every upload so that file corruption or tampering at rest can be detected.
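Computing and re-verifying such a checksum needs only the Python standard library. The sketch below is illustrative; the function names `sha256_of_stream` and `verify_checksum` are assumptions, not part of the VyXlo API.

```python
import hashlib
import hmac
import io


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a binary stream in fixed-size chunks to keep memory use flat."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(stream, stored_checksum):
    """Return True if the stream still matches the checksum saved at upload."""
    return hmac.compare_digest(sha256_of_stream(stream), stored_checksum)


# Usage: hash at upload time, then re-check the stored bytes later.
stored = sha256_of_stream(io.BytesIO(b"quarterly report"))
intact = verify_checksum(io.BytesIO(b"quarterly report"), stored)
tampered = verify_checksum(io.BytesIO(b"quarterly rep0rt"), stored)
```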
Usage Analytics
View count, download count, and comment count tracked per document and returned in every API response.
System Role Hierarchy
Roles are hierarchical: each role inherits all capabilities of the roles below it. Minimum role requirements are enforced per API endpoint. Roles are org-scoped, except SUPER_ADMIN, which spans all organizations.
| Role | Level | Typical Use |
|---|---|---|
| SUPER_ADMIN | 100 | Platform operator — cross-organization access and management |
| ADMIN | 80 | Organization administrator — full org control, user management |
| MANAGER | 60 | Department head — workflow approver, user oversight |
| EDITOR | 40 | Power content creator — broad document operations |
| USER | 20 | Standard knowledge worker — everyday document operations |
| VIEWER | 10 | Read-only stakeholder — view and download only |
| GUEST | 5 | External collaborator via invite or share link |
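Because each role carries a numeric level, a minimum-role check reduces to an integer comparison. The sketch below uses the levels from the table above; the helper names `has_min_role` and `can_access_org` are hypothetical.

```python
from enum import IntEnum


class Role(IntEnum):
    # Levels taken from the role table above.
    GUEST = 5
    VIEWER = 10
    USER = 20
    EDITOR = 40
    MANAGER = 60
    ADMIN = 80
    SUPER_ADMIN = 100


def has_min_role(user_role: Role, required: Role) -> bool:
    """A role satisfies a requirement if its level is at least the required level."""
    return user_role >= required


def can_access_org(user_role: Role, user_org: str, target_org: str) -> bool:
    """Roles are org-scoped; only SUPER_ADMIN crosses organization boundaries."""
    return user_role is Role.SUPER_ADMIN or user_org == target_org
```

For example, `has_min_role(Role.MANAGER, Role.EDITOR)` is true because 60 is at least 40, while `has_min_role(Role.VIEWER, Role.USER)` is false.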
8-Level Permission Model
Applied independently at document and folder level. Permissions can be granted to individual users or entire departments. Optional expiry dates allow time-bounded access. Permission changes take effect immediately with Redis cache invalidation.
Key rule: A VIEWER-role user can be granted WRITE permission on a specific folder — resource-level permissions override system role capabilities for that specific resource.
| Level | View | Download | Comment | Edit | Delete | Share | Manage |
|---|---|---|---|---|---|---|---|
| NONE | — | — | — | — | — | — | — |
| READ | ✓ | — | — | — | — | — | — |
| DOWNLOAD | ✓ | ✓ | — | — | — | — | — |
| COMMENT | ✓ | ✓ | ✓ | — | — | — | — |
| CONTRIBUTOR | ✓ | ✓ | ✓ | ✓ | — | — | — |
| WRITE | ✓ | ✓ | ✓ | ✓ | — | ✓ | — |
| EDITOR | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — |
| ADMIN | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
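The matrix above can be encoded as a mapping from permission level to the set of actions it allows. Note that WRITE includes Share but not Delete, so a plain numeric ordering would not capture it. This is a minimal sketch: the lowercase action names and the `is_allowed` helper are assumptions, and the optional expiry models the time-bounded grants described earlier.

```python
from datetime import datetime, timezone

# Action sets per permission level, transcribed from the matrix above.
CAPABILITIES = {
    "NONE": set(),
    "READ": {"view"},
    "DOWNLOAD": {"view", "download"},
    "COMMENT": {"view", "download", "comment"},
    "CONTRIBUTOR": {"view", "download", "comment", "edit"},
    "WRITE": {"view", "download", "comment", "edit", "share"},
    "EDITOR": {"view", "download", "comment", "edit", "delete", "share"},
    "ADMIN": {"view", "download", "comment", "edit", "delete", "share", "manage"},
}


def is_allowed(grant_level, action, expires_at=None, now=None):
    """Check an action against a resource-level grant, honoring optional expiry."""
    now = now or datetime.now(timezone.utc)
    if expires_at is not None and expires_at <= now:
        return False  # an expired grant confers nothing
    return action in CAPABILITIES[grant_level]
```

Under the key rule stated above, a VIEWER-role user holding a WRITE grant on a folder would be checked with `is_allowed("WRITE", "edit")` for that folder, which returns true.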
AI Processing Pipeline
Triggered automatically on document upload (configurable via AI_PROCESS_ON_UPLOAD) or on demand via POST /ai/reindex/{id}. Runs as an asynchronous Celery task. Classification and summarization complete within 5 minutes of upload.
Text Extraction
pdfplumber / python-docx: PDF via pdfplumber; DOCX via python-docx; raw text for plain-text files
Classification
LangChain + LLM: assign a document type classification with a confidence score (0–1)
Summarization
LangChain + LLM: generate a concise natural-language summary of the document
Keyword Extraction
LangChain + LLM: identify the most relevant terms in the document
Entity Extraction
LangChain + LLM: identify people, organizations, dates, and locations
Embedding Generation
OpenAI Embeddings: create a 1536-dimension vector embedding for semantic search indexing
Chunk & Index
pgvector + tiktoken: split into overlapping chunks, embed each chunk, store in pgvector for Q&A retrieval
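The overlapping-chunk step can be sketched as a sliding window over a token list. This illustration makes several assumptions: whitespace splitting stands in for tiktoken tokenization, the `chunk_size` and `overlap` defaults are arbitrary, and `chunk_tokens` is a hypothetical name. The actual chunk sizes used by the pipeline are not stated in this reference.

```python
def chunk_tokens(tokens, chunk_size=200, overlap=50):
    """Split a token list into windows that share `overlap` tokens with the previous one."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


# Usage: whitespace tokens stand in for tiktoken token ids (assumption).
text = "the quick brown fox jumps over the lazy dog again"
windows = chunk_tokens(text.split(), chunk_size=4, overlap=2)
```

Overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval for Q&A.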
AI Fields on Document Object
| Field | Type | Description |
|---|---|---|
| ai_processed | boolean | Whether AI processing has completed |
| ai_classification | string | Assigned document category (e.g., FINANCIAL_REPORT, CONTRACT, LEGAL) |
| ai_confidence | float | Classification confidence score (0–1) |
| ai_summary | string | Natural language summary of document content |
| ai_keywords | string[] | Extracted keyword list |
| ai_entities | object | Extracted named entities: people, organizations, dates, locations |
| chunk_index_status | string | One of NOT_INDEXED, QUEUED, INDEXED, FAILED |
| chunk_indexed_at | datetime | Timestamp of last successful chunk indexing |
AI Provider Support
Pluggable LLM backends via LangChain. Toggle AI features per organization via feature flag ENABLE_AI_FEATURES.
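One way to discover which backends are usable is to check the configuration variable associated with each provider entry in this section. The sketch below is an assumption about how such a check might look, not the platform's actual startup logic; `PROVIDER_ENV_VARS`, the short provider keys, and `configured_providers` are hypothetical names.

```python
import os

# Configuration variable per provider, as named in this section.
# The short provider keys on the left are hypothetical labels.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GOOGLE_API_KEY",
    "ollama": "OLLAMA_BASE_URL",
}


def configured_providers(env=None):
    """Return providers whose configuration variable is set to a non-empty value."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]
```

Passing an explicit mapping instead of reading `os.environ` keeps the check easy to test.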
| Provider | Capabilities | Configuration variable |
|---|---|---|
| OpenAI (GPT-4) | Classification, summarization, extraction, semantic embeddings | OPENAI_API_KEY |
| Anthropic (Claude) | Classification, summarization, document Q&A | ANTHROPIC_API_KEY |
| Google Gemini | Classification, summarization | GOOGLE_API_KEY |
| Ollama (local models) | On-premises / air-gapped deployments | OLLAMA_BASE_URL |
Workflow Engine
Sequential Chains
Linear A → B → C approval logic. Each step must be approved before the next activates.
Parallel Nodes
Simultaneous multi-department reviews. Steps proceed concurrently, and all must resolve before the document advances.
Per-Step Assignees
Assign individual users or entire departments to each step. A department assignment applies to all current and future members of that department.
Escalation & Overdue
Deadline tracking with automated escalation triggers when step deadlines are exceeded. Full workflow cancellation available at any step.
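Sequential chains and parallel nodes can be modeled together as an ordered list of stages, where each stage holds one or more steps that are reviewed concurrently. The sketch below is illustrative only: `Step`, `Workflow`, and the decision values are hypothetical, and escalation, rejection, and cancellation are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Step:
    assignee: str
    decision: str = "PENDING"  # assumed values: PENDING or APPROVED


@dataclass
class Workflow:
    # Stages run in order; the steps inside one stage run in parallel.
    stages: list
    current: int = 0

    def active_steps(self):
        """Pending steps in the stage that is currently open for review."""
        if self.current >= len(self.stages):
            return []
        return [s for s in self.stages[self.current] if s.decision == "PENDING"]

    def approve(self, assignee):
        """Record an approval; advance once every step in the stage is approved."""
        for step in self.active_steps():
            if step.assignee == assignee:
                step.decision = "APPROVED"
                break
        else:
            raise ValueError(f"{assignee} has no pending step in the active stage")
        if all(s.decision == "APPROVED" for s in self.stages[self.current]):
            self.current += 1

    @property
    def completed(self):
        return self.current >= len(self.stages)
```

A stage with a single step behaves like a link in a sequential chain, and a stage with several steps behaves like a parallel node, so one structure covers both cases.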
Dual Search Architecture
Full-text and semantic search operate in parallel — choose the right mode per query.
Full-Text Search
PostgreSQL tsvector index. Supports keyword queries, folder scoping, status filtering, and document type filtering.
Semantic Search
pgvector cosine distance over 1536-dimension embeddings. Finds conceptually related documents even when they share no exact keywords with the query.
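Cosine distance is one minus the cosine similarity of two vectors, so smaller values mean closer meaning. The pure-Python sketch below shows the ranking idea on tiny vectors; in production the comparison runs inside PostgreSQL via pgvector, and the names `cosine_distance` and `rank_by_distance` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b); 0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_distance(query, embeddings):
    """Order document ids from most to least similar to the query vector."""
    return sorted(embeddings, key=lambda doc_id: cosine_distance(query, embeddings[doc_id]))


# Usage with toy 3-dimension vectors (real embeddings have 1536 dimensions).
docs = {"invoice": [1.0, 0.1, 0.0], "contract": [0.0, 1.0, 0.2]}
ordered = rank_by_distance([0.9, 0.2, 0.0], docs)
```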
Real-Time Collaboration
Live Presence
See who is currently viewing or editing a document in real time. WebSocket broadcasts join/leave events with display names.
Cursor Tracking
Share cursor position updates between concurrent viewers and editors via WebSocket cursor events.
Lock Notifications
Real-time broadcast when a document lock is acquired or released. Prevents concurrent edit conflicts.
Threaded Comments
Nested reply chains on documents. Comment resolution by document owner or ADMIN. Emoji reactions on comments.
Share Links
Cryptographically signed, time-limited share tokens. Optional password protection and email restriction. No user account required.
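A signed, time-limited token can be built from an HMAC over a payload that carries the document id and an expiry timestamp. The sketch below is one common construction, offered as an assumption about how such tokens might work rather than the platform's actual format; `make_token`, `verify_token`, and the payload fields are hypothetical, and password protection and email restriction are omitted.

```python
import base64
import hashlib
import hmac
import json
import time


def make_token(secret: bytes, document_id: str, ttl_seconds: int, now=None) -> str:
    """Build `payload.signature`, both base64url-encoded."""
    now = time.time() if now is None else now
    payload = json.dumps(
        {"doc": document_id, "exp": int(now) + ttl_seconds}, sort_keys=True
    ).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def verify_token(secret: bytes, token: str, now=None):
    """Return the document id if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # payload was altered or signed with another key
    claims = json.loads(payload)
    if claims["exp"] <= now:
        return None  # token has expired
    return claims["doc"]
```

The signature is checked before the payload is parsed, so untrusted JSON is never interpreted unless it was produced by a holder of the secret.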
Notifications
In-app notification center with unread count badge. Events: document comments, workflow assignments, approval decisions, @mentions, share accesses.
Immutable Audit Trail
Every significant action generates an immutable audit record capturing: actor, action, target resource, timestamp, before/after diff (for UPDATE events), and IP address.
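An audit record with a before/after diff can be sketched as a frozen dataclass, which rejects attribute reassignment after creation. This is an in-process illustration only: real immutability would be enforced at the storage layer, and `AuditRecord`, `field_diff`, and `record_update` are hypothetical names. The field list follows the one given above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from types import MappingProxyType


def field_diff(before: dict, after: dict) -> dict:
    """Return only the fields whose values changed, with both values."""
    keys = before.keys() | after.keys()
    return {
        key: {"before": before.get(key), "after": after.get(key)}
        for key in sorted(keys)
        if before.get(key) != after.get(key)
    }


@dataclass(frozen=True)
class AuditRecord:
    actor: str
    action: str
    target: str
    ip_address: str
    timestamp: datetime
    changes: MappingProxyType  # read-only view of the diff


def record_update(actor, target, ip_address, before, after, now=None):
    """Create an UPDATE audit record carrying the before/after diff."""
    return AuditRecord(
        actor=actor,
        action="UPDATE",
        target=target,
        ip_address=ip_address,
        timestamp=now or datetime.now(timezone.utc),
        changes=MappingProxyType(field_diff(before, after)),
    )
```

Storing only the changed fields keeps records small while still showing exactly what an UPDATE altered.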