# Backend Services & Pipelines (BKP)
Module Purpose: The engine room of the Pebble Orchestrator. This module defines the technical "work items" required to build a secure, multi-tenant backend and highly reliable data processing pipelines.
## Use Case Quick Reference
| ID | Title | Priority |
|---|---|---|
| **Backend API** | | |
| US-BKP-001 | Secure JWT Auth Handshake | P1 |
| US-BKP-002 | Multi-tenant Data Isolation | P1 |
| US-BKP-003 | Standardized Ingestion Endpoint | P1 |
| **Data Pipelines** | | |
| US-BKP-004 | Async Email Processing Queue | P1 |
| US-BKP-005 | Sync Failure Dead Letter Queue | P1 |
| US-BKP-006 | Idempotent Card Creation | P1 |
| US-BKP-007 | Unified Semantic Indexing | P2 |
| US-BKP-008 | Automated Insight Synthesis | P2 |
## US-BKP-001: Secure JWT Auth Handshake
Description: As a Backend Developer, I want to implement a JWT-based authentication handshake so that all API requests are securely authorized and tied to a specific user session.
Acceptance Criteria:
- [ ] Implement POST /api/v1/token/ to exchange credentials for Access and Refresh tokens.
- [ ] Implement POST /api/v1/token/refresh/ for session extension.
- [ ] All v1 endpoints (except login) must return 401 Unauthorized if no valid token is present.
- [ ] Tokens must contain user_id, tenant_id, and role claims.
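The claim requirements above can be sketched as follows. This is a minimal, stdlib-only HS256 illustration of issuing and verifying a token carrying `user_id`, `tenant_id`, and `role`; the secret, TTL, and function names are hypothetical, and a production service would use a maintained library (e.g. PyJWT or djangorestframework-simplejwt) rather than hand-rolled signing:

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"dev-only-secret"  # hypothetical; load from a secret manager in production


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def issue_token(user_id: int, tenant_id: int, role: str, ttl: int = 900) -> str:
    """Build a signed token carrying the required user_id/tenant_id/role claims."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64(json.dumps({
        "user_id": user_id,
        "tenant_id": tenant_id,
        "role": role,
        "exp": int(time.time()) + ttl,
    }).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"


def decode_token(token: str) -> dict:
    """Verify the signature and expiry; the API layer would map failures to 401."""
    header, payload, sig = token.split(".")
    signing_input = f"{header}.{payload}".encode()
    expected = _b64(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("invalid signature")
    padded = payload + "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    if claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims
```

The refresh endpoint would follow the same pattern with a longer TTL and a separate token type claim.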
## US-BKP-002: Multi-tenant Data Isolation
Description: As a Lead Architect, I want a middleware-level data isolation layer so that no user can ever access data belonging to a different legal entity (tenant) by accident.
Acceptance Criteria:
- [ ] Implement a Django middleware that extracts tenant_id from the JWT.
- [ ] Force every database query to include a `WHERE tenant_id = :tenant_id` filter automatically.
- [ ] Alpha Pebble Playbook: Inject Context Layer metadata (source_system, correlation_id) into every database record for cross-silo traceability.
- [ ] Unit tests must verify that a query for Company records returns zero results if the tenant_id is mismatched.
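A minimal sketch of the isolation and Context Layer requirements, using an in-memory list as a stand-in for the Company table. In Django this would live in the middleware plus a custom model manager; the names `tenant_scoped` and `with_context_layer` are illustrative, not part of any existing API:

```python
import uuid

# Hypothetical stand-in for the Company table.
COMPANIES = [
    {"id": 1, "name": "Acme GmbH", "tenant_id": "t-alpha"},
    {"id": 2, "name": "Globex AG", "tenant_id": "t-beta"},
]


def tenant_scoped(records: list[dict], tenant_id: str) -> list[dict]:
    """Equivalent of forcing WHERE tenant_id = :tenant_id onto every query."""
    return [r for r in records if r["tenant_id"] == tenant_id]


def with_context_layer(record: dict, source_system: str) -> dict:
    """Alpha Pebble Playbook: stamp Context Layer metadata onto a record."""
    return {
        **record,
        "source_system": source_system,
        "correlation_id": str(uuid.uuid4()),  # for cross-silo traceability
    }
```

The last acceptance criterion falls out directly: querying with a mismatched tenant yields an empty result set.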
## US-BKP-003: Standardized Ingestion Endpoint
Description: As an Integration Developer, I want a single, hardened endpoint to receive raw email data so that external listeners can hand off data without knowing about internal processing logic.
Acceptance Criteria:
- [ ] POST /api/v1/ingestion/email/ accepts JSON with subject, from, body, and a list of attachment_urls.
- [ ] Endpoint validates that the sender email is not on the global blacklist.
- [ ] On success, returns 202 Accepted with a tracking_id.
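The validation rules above can be sketched framework-independently as a function returning an HTTP status and body; the blacklist source and field names beyond those listed in the criteria are assumptions:

```python
import uuid

BLACKLIST = {"spam@example.com"}  # hypothetical global blacklist store
REQUIRED_FIELDS = ("subject", "from", "body", "attachment_urls")


def ingest_email(payload: dict) -> tuple[int, dict]:
    """Validate an ingestion request; return (HTTP status, response body)."""
    missing = [f for f in REQUIRED_FIELDS if f not in payload]
    if missing:
        return 400, {"error": f"missing fields: {missing}"}
    if payload["from"].lower() in BLACKLIST:
        return 403, {"error": "sender is blacklisted"}
    # Accepted for async processing; actual work happens off the request path.
    return 202, {"tracking_id": str(uuid.uuid4())}
```

Returning 202 rather than 200 signals that processing is deferred to the queue described in US-BKP-004.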
## US-BKP-004: Async Email Processing Queue
Description: As a DevOps Engineer, I want to process incoming emails asynchronously using a Redis-backed queue so that the ingestion API remains responsive even during high-traffic bursts.
Acceptance Criteria:
- [ ] On successful ingestion, the API must push a task to the email_processing queue in Redis.
- [ ] A Celery worker picks up the task and triggers AI Classification.
- [ ] The system retries failed tasks up to 3 times with exponential backoff.
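The retry schedule can be made explicit as a small helper; the base delay and cap below are illustrative assumptions, not values from the spec. In Celery itself the same behaviour is typically configured declaratively via the task options `autoretry_for`, `retry_backoff`, and `max_retries=3`:

```python
def backoff_delay(retry: int, base: int = 60, cap: int = 600) -> int:
    """Exponential backoff: delay doubles per retry (60s, 120s, 240s), capped."""
    return min(base * 2 ** retry, cap)
```

Capping the delay keeps a burst of transient failures from pushing redelivery arbitrarily far into the future.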
## US-BKP-005: Sync Failure Dead Letter Queue
Description: As a DevOps Engineer, I want emails that fail processing multiple times to be moved to a Dead Letter Queue so that they can be manually inspected without blocking the main pipeline.
Acceptance Criteria:
- [ ] After 3 failed retries, the task is moved to email_dlq.
- [ ] A record is created in the SyncError table with the full traceback and raw payload.
- [ ] IT dashboard (OPS-006) displays a count of items in the DLQ.
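A minimal sketch of the dead-letter flow, with plain lists standing in for the Redis email_dlq queue and the SyncError table; in the real pipeline this logic would hang off Celery's failure hooks rather than an inline loop:

```python
import traceback

MAX_RETRIES = 3
email_dlq: list = []   # stand-in for the Redis DLQ
sync_errors: list = []  # stand-in for the SyncError table


def process_with_dlq(task_payload: dict, handler):
    """Run handler up to MAX_RETRIES times; on exhaustion, dead-letter the task."""
    last_exc = None
    for attempt in range(MAX_RETRIES):
        try:
            return handler(task_payload)
        except Exception as exc:
            last_exc = exc
    # All retries failed: park the item for manual inspection.
    email_dlq.append(task_payload)
    sync_errors.append({
        "payload": task_payload,
        "traceback": "".join(
            traceback.format_exception(type(last_exc), last_exc, last_exc.__traceback__)
        ),
    })
    return None
```

The DLQ count surfaced on the IT dashboard (OPS-006) is then simply `len(email_dlq)` per tenant.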
## US-BKP-006: Idempotent Card Creation
Description: As a Data Engineer, I want to ensure that every unique email results in exactly one Kanban card so that users are not overwhelmed by duplicate leads from retry logic.
Acceptance Criteria:
- [ ] Hash the Message-ID header of the incoming email.
- [ ] Check if hash exists in the ProcessedEmail table before creating a new card.
- [ ] If duplicate detected, log a "Duplicate Ignored" event and terminate the pipeline for that item.
## US-BKP-007: Unified Semantic Indexing
Description: As a Lead Data Engineer, I want a background service that creates a unified semantic index of all business objects (Emails, Cards, CRM/ERP records) so that users can perform Contextual Search.
Acceptance Criteria:
- [ ] Implement a background worker that generates Vector Embeddings for new ActivityStream events.
- [ ] Index unstructured email bodies and structured CRM/ERP metadata into a common vector store.
- [ ] Expose GET /api/v1/search/unified?q=query for fuzzy/semantic retrieval across the entire enterprise context.
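The index-and-rank shape of the unified search can be sketched with a toy bag-of-words "embedding" and cosine similarity; a production worker would use a real embedding model and vector store, so everything below (`embed`, `INDEX`, `unified_search`) is illustrative:

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real pipeline would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


INDEX: list = []  # (object_id, vector) pairs — stand-in for the vector store


def index_object(object_id: str, text: str) -> None:
    """What the background worker does for each new ActivityStream event."""
    INDEX.append((object_id, embed(text)))


def unified_search(query: str, top_k: int = 3) -> list[str]:
    """Backs GET /api/v1/search/unified?q=...: rank all indexed objects."""
    qv = embed(query)
    ranked = sorted(INDEX, key=lambda e: cosine(qv, e[1]), reverse=True)
    return [oid for oid, _ in ranked[:top_k]]
```

Because emails, cards, and CRM/ERP records all land in the same index, one query ranks across all object types at once.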
## US-BKP-008: Automated Insight Synthesis
Description: As a Business Analyst, I want a background service that periodically scans the ActivityStream to synthesize high-level insights (e.g., "Customer X sentiment is declining due to repeated stock-outs") so that I can take proactive action.
Acceptance Criteria:
- [ ] Implement a weekly worker that aggregates events per tenant_id and entity_id.
- [ ] Use an LLM (Ollama/OpenAI) to generate periodic summaries and "Health Scores".
- [ ] Store insights in the InsightSynthesis table for display on the Manager Dashboard.
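The aggregation step of the weekly worker can be sketched as below. The event schema (a `sentiment` field) and the naive health-score formula are assumptions for illustration; the LLM summary call is deliberately stubbed out, since the real worker would hand the grouped events to Ollama/OpenAI:

```python
from collections import defaultdict


def synthesize_insights(events: list[dict]) -> list[dict]:
    """Aggregate ActivityStream events per (tenant_id, entity_id) and score them.

    Health score here is a placeholder: share of non-negative events, 0-100.
    """
    grouped = defaultdict(list)
    for ev in events:
        grouped[(ev["tenant_id"], ev["entity_id"])].append(ev)

    insights = []
    for (tenant, entity), evs in grouped.items():
        negatives = sum(1 for e in evs if e["sentiment"] < 0)
        health = round(100 * (1 - negatives / len(evs)))
        insights.append({
            "tenant_id": tenant,
            "entity_id": entity,
            "health_score": health,
            "event_count": len(evs),
            # An LLM summary (e.g. "sentiment declining due to stock-outs")
            # would be generated here and stored alongside the score.
        })
    return insights  # rows destined for the InsightSynthesis table
```

Grouping by `(tenant_id, entity_id)` keeps the synthesis tenant-isolated, consistent with US-BKP-002.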