
AI & ML Overview

This document describes the AI and machine learning components used in the Pebble Orchestrator (Pebble IQ) for intent recognition, entity extraction, and process intelligence.

[!WARNING] Dynamic Implementation Scope: The specific AI models, parameters, and architectural decisions outlined in this document may evolve during the execution phase. Final implementation choices will be based on real-world content matching, validation against architectural constraints, and actual effort estimates, rather than strict adherence to the initial specifications listed here.


Design Philosophy: Augment, Not Replace

[!IMPORTANT] Pebble IQ is designed to empower staff, not eliminate them. The system provides AI-powered decision support while keeping humans in control.

Why Augmentation Over Automation?

Business Rationale:

  • Domain Expertise: Chemical manufacturing requires deep technical knowledge (batch specs, compliance, pricing nuances) that AI cannot fully replicate
  • Customer Relationships: B2B sales rely on personal trust and context that only long-term staff relationships provide
  • Risk Management: High-value transactions (₹lakhs) require human judgment for edge cases and exceptions
  • Staff Morale: Teams want to be empowered with better tools, not replaced by them

Technical Rationale:

  • Confidence Limits: Even SOTA LLMs have 10-30% uncertainty on complex B2B queries
  • Context Gaps: AI lacks access to unwritten business rules, internal politics, and customer history beyond data
  • Accountability: Customers expect human accountability for pricing, commitments, and compliance

Implementation Impact:

This philosophy increases system complexity and effort because we must build:

  • Confidence scoring & thresholding logic
  • Human review/approval workflows
  • Override & undo mechanisms
  • Audit trails for human decisions

A fully automated system (no human approval) would be less work but deliver less value in this B2B context.


AI Capabilities Summary

| Capability | Model Type | Purpose |
| --- | --- | --- |
| Intent Recognition | Transformer (BERT/LLM) | Routing emails to Sales, Ops, or Tenders |
| Business Entity Extraction | NER / LLM | Extracting GST, PAN, CIN, and Order IDs |
| Tender IQ (OCR) | Deep Learning | Extracting NIT details and dates from PDFs |
| Sentiment & Saliency | NLU | Identifying high-priority/urgent leads |
| Pebble IQ (RAG) | Embeddings + Vector DB | Semantic search across Emails, Docs, and CRM |

Execution Context: Cloud vs. Local

Pebble supports both cloud-native and private, on-premise AI execution.

| Context | Recommended Stack | Best For |
| --- | --- | --- |
| Local (Primary) | Ollama (Llama 3 / Mistral) | Default for POC. High privacy, zero per-token cost, offline operation. |
| Cloud (Fallback) | Azure OpenAI / Google Gemini | Optional fallback if local hardware is insufficient. |
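For the local path, a minimal sketch of calling an Ollama server over its default REST endpoint (`http://localhost:11434/api/generate`) might look like the following. The model name `llama3` and the helper names are illustrative assumptions, not final choices.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(prompt: str, model: str = "llama3") -> dict:
    """Build a non-streaming generation payload for the local Ollama API."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, model: str = "llama3") -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    payload = json.dumps(build_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because the endpoint is local, swapping to the cloud fallback only means replacing `generate()` with the equivalent Azure OpenAI or Gemini client call.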

1. Agentic Triage & Routing

Purpose

Automatically analyze and route incoming emails into specific operational streams (Sales, Logistics, Quality, Tenders) using an intelligent agent loop.

Model Hierarchy

```mermaid
graph TD
    A["Incoming Email"] --> B{"Agentic Router"}
    B --> C["Sales"]
    B --> D["Logistics/Dispatch"]
    B --> E["Tender/NIT"]
    B --> F["Unclassified"]

    C --> C1["New Quote"]
    C --> C2["Follow-up"]

    D --> D1["Tracking Query"]
    D --> D2["Dispatch Proof"]
```

Model Options

| Model | Type | Best For |
| --- | --- | --- |
| LLM Agent | Generative (Llama 3 / GPT-4) | Few-shot reasoning with RAG context (POC default). Critical for accuracy. |
| RoBERTa-base | Transformer | High accuracy, low latency (optimization phase) |
| SetFit | Few-Shot | Training on very small datasets (<50 samples) |
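For the LLM-agent default, one plausible sketch is a few-shot prompt plus a defensive parser that maps free-form model output back onto the fixed set of streams (falling through to "Unclassified" when nothing matches). All names and example emails here are hypothetical.

```python
STREAMS = ("Sales", "Logistics", "Tender", "Unclassified")

def triage_prompt(email_body: str) -> str:
    """Few-shot prompt asking the LLM for a single stream label."""
    return (
        "Classify the email into one of: Sales, Logistics, Tender.\n"
        "Email: 'Please share your best price for 500 kg caustic soda.' -> Sales\n"
        "Email: 'Where is my order PO-1042? It was due Monday.' -> Logistics\n"
        f"Email: '{email_body}' ->"
    )

def parse_stream(raw: str) -> str:
    """Map free-form LLM output to a known stream, else Unclassified."""
    text = raw.strip().lower()
    for label in STREAMS[:3]:
        if label.lower() in text:
            return label
    return "Unclassified"
```

The parser is what keeps the agent loop safe: a rambling or malformed model response degrades to the manual-triage queue instead of a wrong routing.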

2. Business Entity Extraction (NER)

Purpose

Extract structured business identifiers to enable one-click synchronization with CRM and Focus RT.

Critical Entities

| Entity | Regex/Model | Purpose |
| --- | --- | --- |
| GST Number | Pattern Match | Validating tax compliance (Stage 3) |
| CIN/PAN | Pattern Match | Master data verification |
| Order ID | Contextual NER | Linking emails to ERP orders |
| Dates (NIT) | DateParser | Tracking tender bidding deadlines |
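The pattern-matched entities can be sketched with standard regexes for the public GSTIN, PAN, and CIN formats (the exact validation rules used in production may be stricter, e.g. checksum verification):

```python
import re

# Standard Indian business-identifier layouts
PATTERNS = {
    # GSTIN: 2-digit state code + 10-char PAN + entity code + 'Z' + check character
    "gst": re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]\b"),
    # PAN: 5 letters + 4 digits + 1 letter
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),
    # CIN: listing status + industry code + state + year + ownership + registration no.
    "cin": re.compile(r"\b[LU]\d{5}[A-Z]{2}\d{4}[A-Z]{3}\d{6}\b"),
}

def extract_entities(text: str) -> dict:
    """Return all pattern-matched identifiers found in the email text."""
    return {name: rx.findall(text) for name, rx in PATTERNS.items()}
```

Word boundaries (`\b`) keep the PAN pattern from firing on the PAN substring embedded inside every GSTIN.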

3. Tender IQ (OCR & Vision)

Purpose

While the core of Pebble is email orchestration, Phase 1 (POC) includes basic local OCR for extracting text from PDF attachments (e.g., POs, GST documents). Phase 3 adds deep vision-language model (VLM) capabilities.

Pipeline

  1. OCR Engine: Local Ensemble (Tesseract / PaddleOCR / Llama Vision). We dynamically select the best model per document type to ensure high fidelity beyond standard OCR.
  2. Layout Analysis: Identifying tables of "Item Codes" and "Quantities".
  3. Similarity Engine: Comparing current NIT specs against historical "Closed Won" tender results.
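The per-document engine selection in step 1 could be as simple as a heuristic dispatcher; the document-profile flags and routing rules below are illustrative assumptions, not the final selection logic:

```python
def select_ocr_engine(doc: dict) -> str:
    """Pick an OCR engine per document profile (illustrative heuristics only)."""
    if doc.get("has_text_layer"):    # digital PDF: plain text extraction suffices
        return "tesseract"
    if doc.get("has_tables"):        # table-heavy scans: layout-aware OCR
        return "paddleocr"
    if doc.get("low_quality_scan"):  # degraded scans: vision-language model
        return "llama-vision"
    return "tesseract"               # safe default for everything else
```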

4. Confidence & Human-in-the-Loop

Pebble follows an "Augment, Don't Automate" philosophy. Every AI decision includes a confidence score that determines UI behavior.

| Confidence | UI Behavior |
| --- | --- |
| High (>90%) | Card auto-placed; label applied with green check. |
| Medium (70-90%) | Card placed; "Suggested Intent" shown for confirmation. |
| Low (<70%) | Card sent to "Unclassified" queue for manual triage. |
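The thresholding can be expressed as a small routing function (the return labels are hypothetical UI event names):

```python
def route_by_confidence(score: float) -> str:
    """Map a model confidence score (0-1) to a UI behavior tier."""
    if score > 0.90:
        return "auto_place"      # card placed and labeled automatically
    if score >= 0.70:
        return "suggest_intent"  # staff confirms with one click
    return "manual_triage"       # sent to the Unclassified queue
```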


5. Pebble IQ: Retrieval-Augmented Generation (RAG)

RAG is the "Intelligence Core" that connects the Ingestion Layer to the User Interface. It ensures that every action is informed by the organization's entire historical knowledge.

Where RAG sits in the Flow

RAG operates after Aggregation and before Triage. It indexes every incoming email, document, and CRM interaction into a high-dimensional vector space.
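The retrieval step reduces to nearest-neighbor search over those vectors. As a toy sketch, the bag-of-words "embedding" below stands in for a real embedding model; only the cosine-similarity ranking mirrors what the production vector DB would do:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def search(query: str, corpus: list[str]) -> str:
    """Return the corpus document most similar to the query."""
    q = embed(query)
    return max(corpus, key=lambda d: cosine(q, embed(d)))
```

In production, `embed()` would call a local embedding model and `search()` would query the vector store, but the ranking principle is identical.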

User Journeys & Interfaces

A. The Global AI Assistant (Search-to-Action)

  • Interface: A persistent search bar at the top of the app (Cmd+K).
  • Interaction: User asks: "Who was the last person to handle the Tata Chemicals quote in 2025?"
  • RAG Action: AI retrieves relevant emails and CRM notes, summarizes them, and provides a direct link to the record.

B. The Sidecar AI (Kanban Context)

  • Interface: A slide-out panel that appears when a Kanban card is opened.
  • Interaction: As the user views a lead, the AI highlights: "Similar enquiry handled by John last month (Link). Recommended pricing: ₹125/kg."
  • RAG Action: AI performs a semantic search for similar intent patterns and surfaces historical "Precedent" data.

C. Unified Timeline Synthesis (CRM Master)

  • Interface: The "Engagement Timeline" tab in the CRM.
  • Interaction: Instead of scrolling 100 emails, the user clicks "Summarize Thread".
  • RAG Action: AI retrieves all context (attachments, PDF bids, email bodies) and generates a 3-bullet executive summary.

D. Decision Support (Reducing Owner Bottleneck)

  • Interface: Draft Response Sandbox in the Kanban side-panel.
  • Interaction: For a "Price Enquiry", AI generates a draft: "Subject to stock, the price is ₹125/kg. Last year's batch was ₹120/kg."
  • RAG Action: AI cross-references the current enquiry with Tally/ERP live stock and historical transaction records.
  • Impact: Staff can reply confidently in seconds without waiting for owner approval, resolving the "Staff Knowledge Gap".

6. Technical Foundation: Agentic Orchestration

Pebble IQ is built on the principles of Agentic Orchestration (leveraging state-of-the-art patterns like LangGraph). This allows us to move beyond simple automation to autonomous decision-making.
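The orchestration pattern can be sketched as a tiny state graph in plain Python. Note this is not the actual LangGraph API, just the underlying idea: each node updates shared state and names the next node to run.

```python
from typing import Callable

# Each node: state -> (updated state, name of next node)
Node = Callable[[dict], tuple[dict, str]]

def run_graph(nodes: dict[str, Node], start: str, state: dict) -> dict:
    """Run nodes until one hands control to the END sentinel."""
    current = start
    while current != "END":
        state, current = nodes[current](state)
    return state

def classify(state: dict) -> tuple[dict, str]:
    """Hypothetical intent node (an LLM call in the real system)."""
    intent = "sales" if "price" in state["email"].lower() else "other"
    return {**state, "intent": intent}, "route"

def route(state: dict) -> tuple[dict, str]:
    """Hypothetical routing node mapping intent to a queue."""
    queue = "Sales" if state["intent"] == "sales" else "Unclassified"
    return {**state, "queue": queue}, "END"
```

Real LangGraph adds persistence, branching, and retries on top of this loop, which is what makes human-in-the-loop interrupts practical.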

Open-Source Lineage

We leverage and extend several frontier patterns as our technological base:

  • LangGraph patterns: Multi-agent flows for grounded RAG responses
  • Semantic mail systems: Local embeddings for entity-centric inbox threading
  • Agent-based CRM: Autonomous lead qualification and sentiment-based routing

Decision-Ready Guardrails

Every AI action is governed by a Confidence & Compliance layer:

  • Low Confidence (<70%): Auto-escalate to the human owner with a summarized technical brief.
  • Medium Confidence (70-90%): Present a "Suggested Decision" to staff for one-click approval.
  • High Confidence (>90%): Auto-process routine tasks (e.g., "Where is my order?" via a Tally lookup).


7. Model Training & Evaluation

Continuous Learning Loop

```mermaid
flowchart LR
    A["Production"] --> B["Log Predictions"]
    B --> C["User Moves Card"]
    C --> D{"Correction?"}
    D -->|Yes| E["Log as Negative Sample"]
    D -->|No| F["Log as Positive Sample"]
    E & F --> G["Retrain Intent Model"]
    G --> H["Deploy Updated Weights"]
```
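The feedback-capture step of this loop can be sketched as a function that turns a user's Kanban move into a labeled sample (field names are hypothetical):

```python
def log_feedback(prediction: str, final_column: str) -> dict:
    """Turn a user's Kanban move into a labeled training sample."""
    corrected = prediction != final_column
    return {
        "label": final_column,  # the human's final column is ground truth
        "sample_type": "negative" if corrected else "positive",
        "retrain": corrected,   # corrections are queued for the next retrain
    }
```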

8. Universal Knowledge Transfer

Problem: Institutional knowledge is often siloed in people's heads. If a manager leaves, the logic for "why" an email was routed a certain way is lost.

Pebble Solution: By capturing the "Why?" for every AI correction, we turn the system into a Portable SOP Engine.

  1. The Knowledge Asset: The fine-tuning isn't just "updating weights"—it's building a structured library of business logic.
  2. Universality: Because the logic is stored in a Dual-Database (Rules + Vector Memory), it can be exported and applied to other business applications (Tenders, Finance, Quality) seamlessly. We aren't just training a classifier; we are building an Organizational Brain.

9. The "Master Sheet" Strategy (The AI's Brain)

We use a central Decision Matrix (Intent → Action → Steps) as the deterministic source of truth.

  • Direct Ingestion: The AI scans this sheet to find matches. It doesn't "guess" if a match exists; it follows the Series of Actions exactly as defined.
  • n8n Orchestration: We use n8n.io to bridge the logic between Outlook, Plane.so, and the Master Sheet.
  • Self-Expanding Logic: If an email arrives that isn't in the sheet, the system asks to add a new row based on the human's manual move.
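The deterministic lookup plus self-expansion can be sketched as follows; the sheet rows and action names are hypothetical placeholders for the real Decision Matrix:

```python
# Hypothetical rows of the Master Sheet: intent -> ordered series of actions
MASTER_SHEET = {
    "price_enquiry": ["lookup_stock", "draft_quote", "await_approval"],
    "tracking_query": ["lookup_order", "send_status"],
}

def resolve(intent: str) -> dict:
    """Deterministic lookup: follow the sheet exactly, or propose a new row."""
    steps = MASTER_SHEET.get(intent)
    if steps is None:
        # No match: ask the human to confirm a new row for this intent
        return {"action": "propose_new_row", "intent": intent}
    return {"action": "execute", "steps": steps}
```

Because the lookup is exact, the AI never improvises an action series; unmatched intents always surface to a human, which is what keeps the sheet the single source of truth.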
