System Diagnostics

Troubleshooting & Support

Resolution protocols for common infrastructure anomalies, HTTP error codes, performance characteristics, and direct engineering support.

HTTP Error Reference

401 Unauthorized

OIDC JWT is missing, expired, or has an invalid signature. All non-PUBLIC endpoints require Authorization: Bearer <token>. Validate that ZITADEL_ISSUER and ZITADEL_AUDIENCE are correctly set and that the token has not expired.

Authorization: Bearer <valid_oidc_jwt_from_zitadel>
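When a 401 persists, it can help to inspect the token's claims locally before looking at server configuration. The following is a minimal sketch in Python using only the standard library. It decodes the payload without verifying the signature, so it is a debugging aid and not a substitute for server-side validation. The function names (`jwt_claims`, `diagnose_token`) are hypothetical, and the `issuer` and `audience` arguments are assumed to hold the same values as ZITADEL_ISSUER and ZITADEL_AUDIENCE.

```python
import base64
import json
import time
from typing import List, Optional


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    # base64url segments omit padding; restore it before decoding.
    payload = parts[1] + "=" * (-len(parts[1]) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def diagnose_token(token: str, issuer: str, audience: str,
                   now: Optional[float] = None) -> List[str]:
    """Return a list of likely reasons the token would be rejected."""
    claims = jwt_claims(token)
    current = time.time() if now is None else now
    problems = []

    exp = claims.get("exp")
    if exp is not None and exp <= current:
        problems.append("token expired")

    if claims.get("iss") != issuer:
        problems.append("issuer mismatch")

    # The `aud` claim may be a single string or a list of strings.
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    if audience not in audiences:
        problems.append("audience mismatch")

    return problems
```

An empty list means the expiry, issuer, and audience claims look consistent, which points toward a signature problem or a missing header instead.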

403 Forbidden: Insufficient Permission

The authenticated user does not have the required resource-level permission or system role for this action. Check the user's effective permission on the resource via GET /api/v1/permissions/my/document/{id}.

GET /api/v1/permissions/my/document/{id}
# → {"permission_type": "READ"} (the user needs WRITE or higher)
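The "WRITE or higher" comparison can be sketched as a simple rank lookup. This is an illustration only: the ordering below is an assumption, since the text names READ and WRITE as permission types and ADMIN as a role, and the real hierarchy may contain other levels. `PERMISSION_RANK` and `can_perform` are hypothetical names.

```python
# Assumed ordering, lowest to highest. Not confirmed by the API reference.
PERMISSION_RANK = {"READ": 0, "WRITE": 1, "ADMIN": 2}


def can_perform(permission_response: dict, required: str) -> bool:
    """Check a /permissions/my/... response body against a required level."""
    effective = permission_response.get("permission_type")
    if effective not in PERMISSION_RANK or required not in PERMISSION_RANK:
        # Unknown levels are treated as insufficient rather than guessed at.
        return False
    return PERMISSION_RANK[effective] >= PERMISSION_RANK[required]
```

With the response shown above, `can_perform({"permission_type": "READ"}, "WRITE")` is false, which matches the 403.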

404 Resource Not Found

The requested resource does not exist, has been soft-deleted, or belongs to a different organization. Soft-deleted resources are not returned in standard queries — only an ADMIN can recover them.

409 Document Lock Conflict

Another user holds the exclusive write lock on this document. Locks auto-expire 1 hour after locked_at. Check the locked_by_id and locked_at fields on the document object to see who holds the lock and since when. An ADMIN can forcibly release the lock via POST /documents/{id}/unlock.

{
  "is_locked": true,
  "locked_by_id": 12,
  "locked_at": "2026-04-21T08:15:00Z"  // auto-expires at 09:15:00Z
}
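A client can work out when a lock will lapse from the fields above. The sketch below assumes the document object is available as a parsed dict and that locked_at is an ISO 8601 UTC timestamp as in the example. `lock_expires_at` and `lock_is_active` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

LOCK_TTL = timedelta(hours=1)  # from the text: locks auto-expire after 1 hour


def lock_expires_at(document: dict) -> Optional[datetime]:
    """Return the expiry time of the write lock, or None if unlocked."""
    if not document.get("is_locked"):
        return None
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    locked_at = datetime.fromisoformat(document["locked_at"].replace("Z", "+00:00"))
    return locked_at + LOCK_TTL


def lock_is_active(document: dict, now: Optional[datetime] = None) -> bool:
    """True while the lock is held and has not yet auto-expired."""
    expires = lock_expires_at(document)
    if expires is None:
        return False
    current = now if now is not None else datetime.now(timezone.utc)
    return current < expires
```

A client that receives a 409 could use `lock_expires_at` to decide whether to wait for the lock to lapse or to ask an ADMIN to release it.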

413 Payload Too Large

The file exceeds the organization's configured upload size limit. File bytes are streamed directly to MinIO — the API server never buffers file content in memory. Increase the org storage quota via PUT /organizations/me or split large files.

422 Validation Error

The request body or query parameters failed Pydantic v2 validation. The response detail array lists each failing field with the specific constraint violated.

{
  "detail": [
    { "loc": ["body", "title"], "msg": "Field required", "type": "missing" }
  ]
}
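For logs or user-facing messages, the detail array can be flattened into one line per failing field. This is a minimal sketch that assumes the response body has already been parsed into a dict. `summarize_validation_errors` is a hypothetical helper, not part of the API.

```python
from typing import List


def summarize_validation_errors(body: dict) -> List[str]:
    """Turn a 422 response body into 'path: message [type]' lines."""
    lines = []
    for error in body.get("detail", []):
        # `loc` is a path such as ["body", "title"]; it may contain list indexes.
        path = ".".join(str(part) for part in error.get("loc", []))
        lines.append(f"{path}: {error.get('msg', '')} [{error.get('type', '')}]")
    return lines
```

Each returned line names the field path first, so errors on nested or repeated fields stay distinguishable.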

429 Rate Limited

The request was rejected by slowapi rate limiting. Authentication and sensitive endpoints have stricter limits. The Retry-After header indicates when the rate limit window resets.

Retry-After: 30 // seconds until next allowed request
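A client can honor this header before retrying. The sketch below handles both forms that HTTP allows for Retry-After (a number of seconds or an HTTP date) and falls back to a default when the header is absent or unparseable. `retry_delay_seconds` is a hypothetical helper, and the plain dict lookup is case-sensitive, which real HTTP clients usually handle for you.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_delay_seconds(headers: Mapping[str, str], default: float = 1.0,
                        now: Optional[datetime] = None) -> float:
    """Return how long to wait before retrying a 429 response."""
    value = headers.get("Retry-After")
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        # Delta-seconds form, e.g. "30".
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    current = now if now is not None else datetime.now(timezone.utc)
    return max(0.0, (when - current).total_seconds())
```

Sleeping for the returned value (plus a little jitter when many clients retry at once) avoids hitting the limiter again before the window resets.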

504 Vector Search Timeout

Semantic search exceeded the maximum allowed latency. This typically indicates missing or sub-optimal HNSW indexing on large vector collections. Run EXPLAIN ANALYZE against the embedding similarity query and rebuild the index.

SET hnsw.ef_search = 100;
CREATE INDEX CONCURRENTLY ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

Performance Characteristics

Non-functional requirements and system guarantees under normal production load.

Characteristic	Detail
API Response Time	< 200 ms p95 for all non-AI endpoints under normal load
File Upload	Streamed directly to MinIO — API server does not buffer file bytes in memory
File Download	Presigned URLs (15-min expiry) — zero API server bandwidth for file content
AI Processing	Classification and summarization complete within 5 minutes of upload (async Celery)
WebSocket	Persistent connection per document session; heartbeat-based keep-alive
Permission Cache	Redis cache for permission lookups; invalidated immediately on mutation
Rate Limiting	Configurable per-endpoint rate limits via slowapi; strictest on auth endpoints
Test Coverage	327+ tests; ≥ 72% line coverage across the backend service layer
DB Migrations	Alembic-managed; zero-downtime via expand/contract migration pattern
Scalability	API and Celery worker tiers are stateless and horizontally scalable
Observability	Structured JSON logs (structlog); Prometheus metrics; Celery monitoring via Flower (:5555)

Key figures: < 200 ms p95 API latency, 327+ test cases, ≥ 72% line coverage.

Performance Tuning

PostgreSQL / pgvector and infrastructure optimization recommendations.

HNSW vs IVFFlat Index

Use HNSW for production (better recall on semantic queries). Use IVFFlat for faster initial index builds on large datasets during migration.

Worker Parallelism

Set max_parallel_workers_per_gather to 50% of CPU cores for vector similarity operations. Monitor via EXPLAIN ANALYZE.
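The 50% rule can be turned into a concrete setting with a small helper. This is only a sketch: `recommended_parallel_workers` and `setting_statement` are hypothetical names, and `os.cpu_count()` reports the cores of the machine running the script, so pass the database host's core count explicitly when the two differ.

```python
import os
from typing import Optional


def recommended_parallel_workers(cpu_cores: Optional[int] = None) -> int:
    """Half of the available cores, rounded down, but never less than 1."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return max(1, cores // 2)


def setting_statement(cpu_cores: Optional[int] = None) -> str:
    """Build the session-level SET statement for the computed value."""
    workers = recommended_parallel_workers(cpu_cores)
    return f"SET max_parallel_workers_per_gather = {workers};"
```

For a 16-core database host, `setting_statement(16)` yields a statement that sets the value to 8.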

Redis Eviction Policy

Set maxmemory-policy to allkeys-lru. Permission cache entries are re-populated on the next request — brief cache misses are non-fatal.

Celery Concurrency

Scale celery_worker replicas independently from the API tier. AI processing is CPU-bound; allocate dedicated worker pods for large document volumes.

-- Optimize pgvector semantic search
SET hnsw.ef_search = 100;

-- Build the HNSW index without blocking writes (drop any superseded index afterwards)
CREATE INDEX CONCURRENTLY ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Verify query plan
EXPLAIN ANALYZE SELECT id, title
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'
LIMIT 10;

Health Endpoints

Use for load balancer health checks and Kubernetes liveness/readiness probes.

GET /api/v1/health (liveness probe)

{"status": "ok"}

GET /api/v1/health/ready (readiness probe)

{ "database": "ok", "redis": "ok", "minio": "ok" }
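A monitoring script can reduce the readiness payload to a list of failing dependencies. The sketch below assumes every dependency reports the string "ok" when healthy, as in the example response. The helper names are hypothetical, and fetching the payload over HTTP is left to the caller.

```python
from typing import List


def failing_dependencies(readiness: dict) -> List[str]:
    """Return the names of dependencies whose status is anything but 'ok'."""
    return sorted(name for name, status in readiness.items() if status != "ok")


def is_ready(readiness: dict) -> bool:
    """True only when at least one dependency is reported and all are 'ok'."""
    return bool(readiness) and not failing_dependencies(readiness)
```

Treating an empty payload as not ready is a deliberate choice here, so that a malformed or truncated response does not pass as healthy.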

Direct Support

Cannot resolve through documentation? Open a high-priority ticket with our core engineers.

Open Support Ticket

Average response time: 14 minutes. Tier 1 support: active.

Technical Resources

Observability Endpoints

/api/v1/health (Liveness)
/api/v1/health/ready (Readiness)
/metrics (Prometheus)
:5555 (Flower, Celery monitoring)
/api/docs (Swagger UI)
