Data Pipelines & Reliability¶
This document describes the flow of critical data across the Pebble Orchestrator, highlighting asynchronous processing steps and reliability strategies.
Core Pipelines¶
1. Email Ingestion Pipeline (Sync → Async)¶
The journey from an Outlook Mailbox to a Kanban Card.
graph LR
A[O365 Mailbox] -->|Webhook| B[Pebble Listener]
B -->|Push| C[(Redis Queue)]
C -->|Worker 1| D[Metadata Extraction]
D -->|Store| E[(PostgreSQL)]
E -->|Worker 2| F[AI Classifier]
F -->|API| G[Plane.so Card Creation]
2. CRM Master Sync Pipeline (Event-Driven)¶
Bi-directional synchronization between the Kanban board and CRM Master.
- Trigger: User moves card from Qualify to Quote in Plane.so.
- Action:
1. Plane fires Webhook to Pebble Integration Bus.
2. Integration Bus validates CRM state (e.g., Check if GSTIN exists).
3. Integration Bus updates Company Status in CRM Master to "Qualified".
4. Triggers AI Assistant to suggest "Landed Cost" for this quote.
3. AI Enrichment Pipeline¶
Enhancing cards with deep intelligence. - Steps: 1. OCR Stage: Scans attachments for CIN/PAN/GST. 2. Sentiment Stage: Grades lead urgency (P0-P3). 3. Intent Stage: Maps enquiry to specific Product DNA synonyms.
Reliability & Fault Tolerance¶
Dead Letter Queues (DLQ)¶
Any message failing > 3 times is moved to a DLQ.
- Alert: IT team notified via OPS-006.
- Action: Manual inspection or "Re-ingest" after fixing the upstream issue (e.g., ERP API downtime).
Idempotency¶
To prevent "Duplicate Lead Burnout", every email message ID is hashed and cached in Redis. If the same ID is seen again, the pipeline drops it before processing.
Sync Retries¶
When syncing with external ERPs (Focus RT/Business Central):
- Strategy: Exponential backoff (1m, 5m, 15m, 1h).
- Hard Limit: After 24 hours of failure, the card status in Kanban is updated to SYNC_ERROR.
Pipeline User Stories (DevOps)¶
US-DVO-005: Pipeline Pulse Monitoring¶
As a DevOps Engineer, I want to see real-time queue depths so that I can scale worker pods before ingestion lag impacts sales. - AC: Grafana dashboard showing Redis queue length and worker processing time.
US-DVO-006: Bulk Re-classification¶
As an Admin, I want to re-run the classification engine on a batch of cards so that I can update the board after a model retraining. - AC: Management command/API to trigger batch re-classification for a specific date range.