
AI & ML Overview

This document describes the AI and machine learning components used in the Pebble Orchestrator (Pebble IQ) for intent recognition, entity extraction, and process intelligence.

[!WARNING] Dynamic Implementation Scope: The specific AI models, parameters, and architectural decisions outlined in this document may evolve during the execution phase. Final implementation choices will be based on real-world content matching, validation against architectural constraints, and actual effort estimates, rather than strict adherence to the initial specifications listed here.


Design Philosophy: Augment, Not Replace

[!IMPORTANT] Pebble IQ is designed to empower staff, not eliminate them. The system provides AI-powered decision support while keeping humans in control.

Why Augmentation Over Automation?

Business Rationale:

  • Domain Expertise: Chemical manufacturing requires deep technical knowledge (batch specs, compliance, pricing nuances) that AI cannot fully replicate
  • Customer Relationships: B2B sales rely on personal trust and context that only long-term staff relationships provide
  • Risk Management: High-value transactions (₹lakhs) require human judgment for edge cases and exceptions
  • Staff Morale: Teams want to be empowered with better tools, not replaced by them

Technical Rationale:

  • Confidence Limits: Even SOTA LLMs have 10-30% uncertainty on complex B2B queries
  • Context Gaps: AI lacks access to unwritten business rules, internal politics, and customer history beyond data
  • Accountability: Customers expect human accountability for pricing, commitments, and compliance

Implementation Impact:

This philosophy increases system complexity and effort because we must build:

  • Confidence scoring & thresholding logic
  • Human review/approval workflows
  • Override & undo mechanisms
  • Audit trails for human decisions

A fully automated system (no human approval) would be less work but deliver less value in this B2B context.


AI Capabilities Summary

| Capability | Model Type | Purpose |
| --- | --- | --- |
| Intent Recognition | Transformer (BERT/LLM) | Routing emails to Sales, Ops, or Tenders |
| Business Entity Extraction | NER / LLM | Extracting GST, PAN, CIN, and Order IDs |
| Tender IQ (OCR) | Deep Learning | Extracting NIT details and dates from PDFs |
| Sentiment & Saliency | NLU | Identifying high-priority/urgent leads |
| Pebble IQ (RAG) | Embeddings + Vector DB | Semantic search across Emails, Docs, and CRM |

Execution Context: Cloud vs. Local

Pebble supports both cloud-native and private, on-premise AI execution.

| Context | Recommended Stack | Best For |
| --- | --- | --- |
| Local (Primary) | Ollama (Llama 3 / Mistral) | Default for POC. High privacy, zero per-token cost, offline operation. |
| Cloud (Fallback) | Azure OpenAI / Google Gemini | Optional fallback if local hardware is insufficient. |
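For the local path, a minimal sketch of calling an Ollama server over its default REST endpoint (`http://localhost:11434/api/generate`) might look like the following. The model name `llama3` and the helper names are illustrative assumptions, not final choices.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(prompt: str, model: str = "llama3") -> dict:
    """Build a non-streaming generation payload for the local Ollama API."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, model: str = "llama3") -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    payload = json.dumps(build_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because the endpoint is local, swapping to the cloud fallback only means replacing `generate()` with the equivalent Azure OpenAI or Gemini client call.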

1. Agentic Triage & Routing

Purpose

Automatically analyze and route incoming emails into specific operational streams (Sales, Logistics, Quality, Tenders) using an intelligent agent loop.

Model Hierarchy

```mermaid
graph TD
    A["Incoming Email"] --> B{"Agentic Router"}
    B --> C["Sales"]
    B --> D["Logistics/Dispatch"]
    B --> E["Tender/NIT"]
    B --> F["Unclassified"]

    C --> C1["New Quote"]
    C --> C2["Follow-up"]

    D --> D1["Tracking Query"]
    D --> D2["Dispatch Proof"]
```

Model Options

| Model | Type | Best For |
| --- | --- | --- |
| LLM Agent | Generative (Llama 3 / GPT-4) | Few-shot reasoning with RAG context (POC default). Critical for accuracy. |
| RoBERTa-base | Transformer | High accuracy, low latency (optimization phase) |
| SetFit | Few-Shot | Training on very small datasets (<50 samples) |
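For the LLM-agent default, one plausible sketch is a few-shot prompt plus a defensive parser that maps free-form model output back onto the fixed set of streams (falling through to "Unclassified" when nothing matches). All names and example emails here are hypothetical.

```python
STREAMS = ("Sales", "Logistics", "Tender", "Unclassified")

def triage_prompt(email_body: str) -> str:
    """Few-shot prompt asking the LLM for a single stream label."""
    return (
        "Classify the email into one of: Sales, Logistics, Tender.\n"
        "Email: 'Please share your best price for 500 kg caustic soda.' -> Sales\n"
        "Email: 'Where is my order PO-1042? It was due Monday.' -> Logistics\n"
        f"Email: '{email_body}' ->"
    )

def parse_stream(raw: str) -> str:
    """Map free-form LLM output to a known stream, else Unclassified."""
    text = raw.strip().lower()
    for label in STREAMS[:3]:
        if label.lower() in text:
            return label
    return "Unclassified"
```

The parser is what keeps the agent loop safe: a rambling or malformed model response degrades to the manual-triage queue instead of a wrong routing.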

2. Business Entity Extraction (NER)

Purpose

Extract structured business identifiers to enable one-click synchronization with CRM and Focus RT.

Critical Entities

| Entity | Regex/Model | Purpose |
| --- | --- | --- |
| GST Number | Pattern Match | Validating tax compliance (Stage 3) |
| CIN/PAN | Pattern Match | Master data verification |
| Order ID | Contextual NER | Linking emails to ERP orders |
| Dates (NIT) | DateParser | Tracking tender bidding deadlines |
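The pattern-matched entities can be sketched with standard regexes for the public GSTIN, PAN, and CIN formats (the exact validation rules used in production may be stricter, e.g. checksum verification):

```python
import re

# Standard Indian business-identifier layouts
PATTERNS = {
    # GSTIN: 2-digit state code + 10-char PAN + entity code + 'Z' + check character
    "gst": re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]\b"),
    # PAN: 5 letters + 4 digits + 1 letter
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),
    # CIN: listing status + industry code + state + year + ownership + registration no.
    "cin": re.compile(r"\b[LU]\d{5}[A-Z]{2}\d{4}[A-Z]{3}\d{6}\b"),
}

def extract_entities(text: str) -> dict:
    """Return all pattern-matched identifiers found in the email text."""
    return {name: rx.findall(text) for name, rx in PATTERNS.items()}
```

Word boundaries (`\b`) keep the PAN pattern from firing on the PAN substring embedded inside every GSTIN.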

3. Tender IQ (OCR & Vision)

Purpose

While the core of Pebble is email orchestration, Phase 1 (POC) includes basic local OCR for extracting text from PDF attachments (e.g., POs, GST documents). Phase 3 adds deep vision-language model (VLM) capabilities.

Pipeline

  1. OCR Engine: Local Ensemble (Tesseract / PaddleOCR / Llama Vision). We dynamically select the best model per document type to ensure high fidelity beyond standard OCR.
  2. Layout Analysis: Identifying tables of "Item Codes" and "Quantities".
  3. Similarity Engine: Comparing current NIT specs against historical "Closed Won" tender results.
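The per-document engine selection in step 1 could be as simple as a heuristic dispatcher; the document-profile flags and routing rules below are illustrative assumptions, not the final selection logic:

```python
def select_ocr_engine(doc: dict) -> str:
    """Pick an OCR engine per document profile (illustrative heuristics only)."""
    if doc.get("has_text_layer"):    # digital PDF: plain text extraction suffices
        return "tesseract"
    if doc.get("has_tables"):        # table-heavy scans: layout-aware OCR
        return "paddleocr"
    if doc.get("low_quality_scan"):  # degraded scans: vision-language model
        return "llama-vision"
    return "tesseract"               # safe default for everything else
```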

4. Confidence & Human-in-the-Loop

Pebble follows an "Augment, Don't Automate" philosophy. Every AI decision includes a confidence score that determines UI behavior.

| Confidence | UI Behavior |
| --- | --- |
| High (>90%) | Card auto-placed; label applied with green check. |
| Medium (70-90%) | Card placed; "Suggested Intent" shown for confirmation. |
| Low (<70%) | Card sent to "Unclassified" queue for manual triage. |
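The thresholding can be expressed as a small routing function (the return labels are hypothetical UI event names):

```python
def route_by_confidence(score: float) -> str:
    """Map a model confidence score (0-1) to a UI behavior tier."""
    if score > 0.90:
        return "auto_place"      # card placed and labeled automatically
    if score >= 0.70:
        return "suggest_intent"  # staff confirms with one click
    return "manual_triage"       # sent to the Unclassified queue
```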


5. Pebble IQ: Retrieval-Augmented Generation (RAG)

RAG is the "Intelligence Core" that connects the Ingestion Layer to the User Interface. It ensures that every action is informed by the organization's entire historical knowledge.

Where RAG sits in the Flow

RAG operates after Aggregation and before Triage. It indexes every incoming email, document, and CRM interaction into a high-dimensional vector space.
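The retrieval step reduces to nearest-neighbor search over those vectors. As a toy sketch, the bag-of-words "embedding" below stands in for a real embedding model; only the cosine-similarity ranking mirrors what the production vector DB would do:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def search(query: str, corpus: list[str]) -> str:
    """Return the corpus document most similar to the query."""
    q = embed(query)
    return max(corpus, key=lambda d: cosine(q, embed(d)))
```

In production, `embed()` would call a local embedding model and `search()` would query the vector store, but the ranking principle is identical.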

User Journeys & Interfaces

A. The Global AI Assistant (Search-to-Action)

  • Interface: A persistent search bar at the top of the app (Cmd+K).
  • Interaction: User asks: "Who was the last person to handle the Tata Chemicals quote in 2025?"
  • RAG Action: AI retrieves relevant emails and CRM notes, summarizes them, and provides a direct link to the record.

B. The Sidecar AI (Kanban Context)

  • Interface: A slide-out panel that appears when a Kanban card is opened.
  • Interaction: As the user views a lead, the AI highlights: "Similar enquiry handled by John last month (Link). Recommended pricing: ₹125/kg."
  • RAG Action: AI performs a semantic search for similar intent patterns and surfaces historical "Precedent" data.

C. Unified Timeline Synthesis (CRM Master)

  • Interface: The "Engagement Timeline" tab in the CRM.
  • Interaction: Instead of scrolling 100 emails, the user clicks "Summarize Thread".
  • RAG Action: AI retrieves all context (attachments, PDF bids, email bodies) and generates a 3-bullet executive summary.

D. Decision Support (Reducing Owner Bottleneck)

  • Interface: Draft Response Sandbox in the Kanban side-panel.
  • Interaction: For a "Price Enquiry", AI generates a draft: "Subject to stock, the price is ₹125/kg. Last year's batch was ₹120/kg."
  • RAG Action: AI cross-references the current enquiry with Tally/ERP live stock and historical transaction records.
  • Impact: Staff can reply confidently in seconds without waiting for owner approval, resolving the "Staff Knowledge Gap".

6. Technical Foundation: Agentic Orchestration

Pebble IQ is built on the principles of Agentic Orchestration (leveraging state-of-the-art patterns like LangGraph). This allows us to move beyond simple automation to autonomous decision-making.
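The orchestration pattern can be sketched as a tiny state graph in plain Python. Note this is not the actual LangGraph API, just the underlying idea: each node updates shared state and names the next node to run.

```python
from typing import Callable

# Each node: state -> (updated state, name of next node)
Node = Callable[[dict], tuple[dict, str]]

def run_graph(nodes: dict[str, Node], start: str, state: dict) -> dict:
    """Run nodes until one hands control to the END sentinel."""
    current = start
    while current != "END":
        state, current = nodes[current](state)
    return state

def classify(state: dict) -> tuple[dict, str]:
    """Hypothetical intent node (an LLM call in the real system)."""
    intent = "sales" if "price" in state["email"].lower() else "other"
    return {**state, "intent": intent}, "route"

def route(state: dict) -> tuple[dict, str]:
    """Hypothetical routing node mapping intent to a queue."""
    queue = "Sales" if state["intent"] == "sales" else "Unclassified"
    return {**state, "queue": queue}, "END"
```

Real LangGraph adds persistence, branching, and retries on top of this loop, which is what makes human-in-the-loop interrupts practical.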

Open-Source Lineage

We leverage and extend several frontier patterns as our technological base:

  • LangGraph patterns: Multi-agent flows for grounded RAG responses
  • Semantic mail systems: Local embeddings for entity-centric inbox threading
  • Agent-based CRM: Autonomous lead qualification and sentiment-based routing

Decision-Ready Guardrails

Every AI action is governed by a Confidence & Compliance layer:

  • Low Confidence (<70%): Auto-escalate to the human owner with a summarized technical brief.
  • Medium Confidence (70-90%): Present a "Suggested Decision" to staff for one-click approval.
  • High Confidence (>90%): Auto-process routine tasks (e.g., "Where is my order?" via a Tally lookup).


7. Model Training & Evaluation

Continuous Learning Loop

```mermaid
flowchart LR
    A["Production"] --> B["Log Predictions"]
    B --> C["User Moves Card"]
    C --> D{"Correction?"}
    D -->|Yes| E["Log as Negative Sample"]
    D -->|No| F["Log as Positive Sample"]
    E & F --> G["Retrain Intent Model"]
    G --> H["Deploy Updated Weights"]
```
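The feedback-capture step of this loop can be sketched as a function that turns a user's Kanban move into a labeled sample (field names are hypothetical):

```python
def log_feedback(prediction: str, final_column: str) -> dict:
    """Turn a user's Kanban move into a labeled training sample."""
    corrected = prediction != final_column
    return {
        "label": final_column,  # the human's final column is ground truth
        "sample_type": "negative" if corrected else "positive",
        "retrain": corrected,   # corrections are queued for the next retrain
    }
```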

8. Universal Knowledge Transfer

Problem: Institutional knowledge is often siloed in people's heads. If a manager leaves, the logic for "why" an email was routed a certain way is lost.

Pebble Solution: By capturing the "Why?" for every AI correction, we turn the system into a Portable SOP Engine.

  1. The Knowledge Asset: The fine-tuning isn't just "updating weights"—it's building a structured library of business logic.
  2. Universality: Because the logic is stored in a Dual-Database (Rules + Vector Memory), it can be exported and applied to other business applications (Tenders, Finance, Quality) seamlessly. We aren't just training a classifier; we are building an Organizational Brain.

9. The "Master Sheet" Strategy (The AI's Brain)

We use a central Decision Matrix (Intent → Action → Steps) as the deterministic source of truth.

  • Direct Ingestion: The AI scans this sheet to find matches. It doesn't "guess" if a match exists; it follows the Series of Actions exactly as defined.
  • n8n Orchestration: We use n8n.io to bridge the logic between Outlook, Plane.so, and the Master Sheet.
  • Self-Expanding Logic: If an email arrives that isn't in the sheet, the system asks to add a new row based on the human's manual move.
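The deterministic lookup plus self-expansion can be sketched as follows; the sheet rows and action names are hypothetical placeholders for the real Decision Matrix:

```python
# Hypothetical rows of the Master Sheet: intent -> ordered series of actions
MASTER_SHEET = {
    "price_enquiry": ["lookup_stock", "draft_quote", "await_approval"],
    "tracking_query": ["lookup_order", "send_status"],
}

def resolve(intent: str) -> dict:
    """Deterministic lookup: follow the sheet exactly, or propose a new row."""
    steps = MASTER_SHEET.get(intent)
    if steps is None:
        # No match: ask the human to confirm a new row for this intent
        return {"action": "propose_new_row", "intent": intent}
    return {"action": "execute", "steps": steps}
```

Because the lookup is exact, the AI never improvises an action series; unmatched intents always surface to a human, which is what keeps the sheet the single source of truth.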
