multi-agent email categorization with conditional routing
Implements a LangGraph StateGraph-based workflow that routes incoming emails through specialized AI agents for intelligent classification into product_inquiry, complaint, feedback, or unrelated categories. Uses conditional routing nodes that branch the workflow based on categorization results, enabling different processing paths for each email type. The categorization agent leverages LangChain with Groq/Google APIs to analyze email content and metadata, with routing decisions persisted in a custom GraphState object that maintains context across workflow steps.
Unique: Uses LangGraph's StateGraph with explicit conditional routing nodes rather than simple if-then logic, enabling complex multi-path workflows where each category branch can have different processing logic, agent chains, and quality gates. The custom GraphState maintains full context across routing decisions, allowing downstream nodes to access categorization confidence and reasoning.
vs alternatives: More flexible than rule-based email routers (Zapier, Make) because routing logic is LLM-driven and can understand semantic intent; more maintainable than custom regex-based categorization because agent prompts can be updated without code changes.
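The category-based branching described above can be sketched as follows. This is a minimal, self-contained illustration of the conditional-routing pattern: the LLM categorization agent is stubbed out as a keyword heuristic, and the node names, category handlers, and `run_workflow` helper are all hypothetical stand-ins, not the project's actual code.

```python
# Sketch of LLM-driven conditional routing. The categorize() stub stands in
# for the real LLM categorization agent; handler names are illustrative.

CATEGORIES = ("product_inquiry", "complaint", "feedback", "unrelated")

def categorize(state: dict) -> dict:
    """Stand-in for the LLM categorization agent: tags the email."""
    body = state["email_body"].lower()
    if "refund" in body or "broken" in body:
        category = "complaint"
    elif "how do i" in body or "price" in body:
        category = "product_inquiry"
    elif "love" in body or "suggestion" in body:
        category = "feedback"
    else:
        category = "unrelated"
    return {**state, "category": category}

def route(state: dict) -> str:
    """Conditional edge: picks the next node from the categorization result."""
    return state["category"]

# Each category branch gets its own processing path.
handlers = {
    "product_inquiry": lambda s: {**s, "path": "rag_response"},
    "complaint":       lambda s: {**s, "path": "escalation"},
    "feedback":        lambda s: {**s, "path": "acknowledge"},
    "unrelated":       lambda s: {**s, "path": "archive"},
}

def run_workflow(email_body: str) -> dict:
    state = categorize({"email_body": email_body})
    return handlers[route(state)](state)
```

In LangGraph itself this dispatch table corresponds to an `add_conditional_edges` call mapping the router's return value to downstream nodes.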
rag-powered contextual email response generation
Generates customer support responses by combining retrieval-augmented generation (RAG) with a ChromaDB vector store and Google Embeddings. For product_inquiry emails, the system retrieves relevant product documentation from the vector store using semantic similarity search, then passes the retrieved context to a writing agent that generates contextually appropriate responses. Uses a two-stage pipeline: (1) embedding-based retrieval of top-k relevant documents from ChromaDB, (2) LLM-based response generation conditioned on the retrieved context. The vector store is pre-populated via create_index.py, which chunks and embeds product documentation.
Unique: Implements a two-stage RAG pipeline where retrieval is decoupled from generation through explicit ChromaDB queries, allowing fine-grained control over chunk size, retrieval strategy, and context window management. The writing agent receives retrieved context as structured input rather than concatenated strings, enabling more sophisticated prompt engineering and context ranking.
vs alternatives: More accurate than non-RAG response generation because responses are grounded in actual product documentation; more maintainable than hardcoded response templates because documentation updates automatically propagate to responses without code changes.
standalone batch email processing mode
Provides a standalone execution mode (main.py) that runs the email processing workflow as a continuous background process without requiring API deployment. The standalone mode fetches emails from Gmail in a loop (configurable polling interval), processes each email through the workflow, and sends responses. Useful for development, testing, and simple deployments where API infrastructure is not needed. Includes console logging for monitoring and debugging. Can be run as a systemd service or Docker container for production use.
Unique: Implements standalone execution as a simple polling loop in main.py rather than requiring external orchestration tools, making it easy to run locally or in simple environments. Integrates directly with the LangGraph workflow without API abstraction, reducing complexity.
vs alternatives: Simpler to set up than API-based deployment because it requires no web server or load balancer; easier to debug because all execution happens in a single process with full console visibility.
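The standalone loop amounts to a fetch–process–sleep cycle. The sketch below is a hypothetical shape for it: `fetch_unread` and `process_email` stand in for the Gmail fetch and the compiled workflow invocation, and the `max_cycles` cap is added only so the example terminates.

```python
# Sketch of the main.py-style polling loop. fetch_unread and process_email
# are injected stand-ins; interval and cycle cap are illustrative.
import time

def run_standalone(fetch_unread, process_email, interval_s=1.0, max_cycles=None):
    """Continuously fetch unread emails and run each through the workflow."""
    processed = 0
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        for email in fetch_unread():
            process_email(email)       # invoke the compiled workflow
            processed += 1
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            time.sleep(interval_s)     # configurable polling interval
    return processed
```

Run under systemd or Docker, the same loop simply omits `max_cycles` and logs each cycle to the console.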
automated email quality assurance and proofreading
Implements a quality assurance node that validates generated responses before sending using a specialized proofreading agent. The QA agent checks for grammatical errors, tone consistency, factual accuracy (by comparing against retrieved context), and compliance with support guidelines. Uses LangChain agents with Groq/Google APIs to perform multi-dimensional quality checks, returning a quality score and a list of issues. If the quality score falls below a threshold, the response is flagged for human review rather than auto-sent. The QA node is integrated into the workflow graph as a post-generation step before email sending.
Unique: Integrates QA as an explicit workflow node in the LangGraph StateGraph rather than a post-processing step, enabling conditional routing based on quality scores (e.g., high-quality responses auto-send, low-quality responses route to human review queue). Uses multi-dimensional quality checks (grammar, tone, factuality, compliance) rather than single-metric scoring.
vs alternatives: More comprehensive than simple spell-checking because it validates factual accuracy against retrieved context and checks tone/compliance; more maintainable than hardcoded validation rules because quality criteria can be updated via agent prompts without code changes.
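The score-and-gate behavior can be sketched as follows. Each dimension here is a cheap heuristic standing in for an LLM proofreading check; the dimension names, heuristics, and threshold value are illustrative assumptions, not the project's actual criteria.

```python
# Sketch of the multi-dimensional QA gate: per-dimension checks, an
# aggregate score, and a threshold that routes to send or human review.

def qa_check(response: str, context: str, threshold: float = 0.75) -> dict:
    """Score a draft response on several dimensions and gate it for sending."""
    checks = {
        # grammar: crude proxy -- sentences start with a capital letter
        "grammar": all(s.strip()[:1].isupper()
                       for s in response.split(".") if s.strip()),
        # tone: no all-caps shouting
        "tone": not response.isupper(),
        # factuality: proxy -- response shares vocabulary with the context
        "factuality": bool(set(response.lower().split())
                           & set(context.lower().split())),
        # compliance: no forbidden promises
        "compliance": "guaranteed" not in response.lower(),
    }
    score = sum(checks.values()) / len(checks)
    return {
        "score": score,
        "issues": [dim for dim, ok in checks.items() if not ok],
        "action": "send" if score >= threshold else "human_review",
    }
```

In the graph, the returned `action` value is exactly what a conditional edge would branch on: auto-send versus the human review queue.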
continuous gmail inbox monitoring and polling
Implements a polling-based email monitoring system that continuously fetches new emails from Gmail inbox using the Gmail API with authenticated access. The monitoring node runs in a loop (configurable polling interval) and retrieves unread emails, parses email metadata (sender, subject, timestamp, body), and feeds them into the processing workflow. Uses Gmail API's label-based filtering to identify new emails and marks processed emails as read to avoid reprocessing. The polling mechanism is integrated into the main.py entry point for standalone deployment or exposed as an API endpoint in deploy_api.py for service-based deployment.
Unique: Implements polling as a first-class workflow component integrated into the LangGraph StateGraph rather than a separate background job, allowing the monitoring loop to be paused, resumed, or modified based on workflow state. Uses Gmail API label-based filtering and read/unread status to maintain idempotency without requiring external state tracking.
vs alternatives: More reliable than webhook-based approaches because polling doesn't depend on firewall rules or public IP addresses; more maintainable than custom email parsing because it uses the official Gmail API rather than IMAP/POP3, which are more fragile.
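The idempotency mechanism described above can be sketched like this. The `Mailbox` class is a hypothetical stand-in for the Gmail API client (the real system would list messages filtered to the UNREAD label and modify labels via the API); the point shown is that read/unread status alone prevents reprocessing, with no external state store.

```python
# Sketch of idempotent polling via label filtering. Mailbox is a stub for
# the authenticated Gmail client; email dicts and label names are illustrative.

class Mailbox:
    def __init__(self, emails):
        # each email: {"id": ..., "labels": set of label names}
        self.emails = emails

    def list_unread(self):
        """Stand-in for listing messages filtered to the UNREAD label."""
        return [e for e in self.emails if "UNREAD" in e["labels"]]

    def mark_read(self, email_id):
        """Stand-in for removing the UNREAD label after processing."""
        for e in self.emails:
            if e["id"] == email_id:
                e["labels"].discard("UNREAD")

def poll_once(mailbox, handle):
    """One polling cycle: fetch unread, process, mark read (idempotent)."""
    for email in mailbox.list_unread():
        handle(email)
        mailbox.mark_read(email["id"])  # prevents reprocessing next cycle
```

Because the processed/unprocessed distinction lives in the mailbox itself, a crashed or restarted poller picks up exactly the emails it has not yet handled.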
stateful workflow orchestration with langgraph stategraph
Implements the core workflow orchestration using LangGraph's StateGraph primitive, which manages the entire email processing pipeline as a directed acyclic graph (DAG) of nodes and edges. Each node represents a processing step (categorization, retrieval, generation, QA, sending), and edges define the control flow between nodes. The custom GraphState object maintains workflow state across all steps, including email content, categorization results, retrieved context, generated response, and QA decisions. Conditional edges enable branching logic (e.g., route to different nodes based on email category). The StateGraph is compiled into an executable workflow that can be invoked synchronously or asynchronously.
Unique: Uses LangGraph's StateGraph as the primary orchestration primitive rather than building custom workflow logic, providing native support for conditional routing, node composition, and state management. The custom GraphState object is explicitly defined and typed, enabling IDE autocomplete and type checking across all workflow steps.
vs alternatives: More transparent than orchestration frameworks like Airflow or Prefect because the entire workflow is defined in Python code and can be inspected/debugged at runtime; more flexible than simple function chaining because conditional edges enable complex branching logic based on intermediate results.
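A typed state object threaded through the pipeline can be sketched as below. The `TypedDict` fields mirror those listed above; the `run` function is a toy stand-in for LangGraph's compiled graph, reduced to a linear pipeline with one conditional edge, and every node body is a hypothetical stub.

```python
# Sketch of a typed GraphState accumulating results across workflow nodes.
# Node internals are stubs; in LangGraph the same nodes would be registered
# on a StateGraph and wired with (conditional) edges, then compiled.
from typing import TypedDict

class GraphState(TypedDict, total=False):
    email_body: str
    category: str
    context: list[str]
    response: str
    qa_passed: bool

def categorize(state: GraphState) -> GraphState:
    state["category"] = ("product_inquiry"
                         if "?" in state["email_body"] else "unrelated")
    return state

def retrieve(state: GraphState) -> GraphState:
    state["context"] = ["doc snippet"]          # ChromaDB lookup stand-in
    return state

def generate(state: GraphState) -> GraphState:
    state["response"] = f"Based on {state['context'][0]}: ..."
    return state

def qa(state: GraphState) -> GraphState:
    state["qa_passed"] = len(state["response"]) > 0
    return state

def run(state: GraphState) -> GraphState:
    state = categorize(state)
    if state["category"] == "product_inquiry":  # conditional edge
        for node in (retrieve, generate, qa):
            state = node(state)
    return state
```

Typing the state this way is what gives every downstream node (and the IDE) visibility into fields written by earlier nodes, such as categorization results and retrieved context.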
multi-llm provider abstraction with groq and google ai
Provides a unified interface for invoking multiple LLM providers (Groq and Google AI) through LangChain's abstraction layer, enabling agent implementations to be agnostic to the underlying LLM provider. The system uses LangChain's ChatGroq and ChatGoogleGenerativeAI integrations to instantiate LLM instances, which are then passed to agent definitions. Agents can be configured to use different providers for different tasks (e.g., Groq for fast categorization, Google for higher-quality response generation). The provider selection is configurable via environment variables, allowing deployment-time switching without code changes.
Unique: Abstracts provider differences through LangChain's unified ChatModel interface rather than building custom provider adapters, enabling agents to be written once and deployed with different providers. Configuration is environment-variable driven, allowing provider switching at deployment time without code changes.
vs alternatives: More maintainable than hardcoding provider-specific API calls because LangChain handles API differences; more flexible than single-provider systems because different tasks can use different providers optimized for their specific requirements.
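The per-task, environment-driven selection can be sketched as a small factory. The env-var naming scheme, default model names, and returned spec dict are all illustrative assumptions; the real code would return instantiated LangChain chat models rather than a dict.

```python
# Sketch of environment-driven provider selection per task. The commented
# lines mark where real code would instantiate LangChain's ChatGroq /
# ChatGoogleGenerativeAI classes; model names here are only examples.
import os

DEFAULTS = {
    "groq": "llama-3.1-8b-instant",
    "google": "gemini-1.5-flash",
}

def make_llm(task: str) -> dict:
    """Pick a provider per task via env vars, e.g. CATEGORIZER_PROVIDER=groq."""
    provider = os.environ.get(f"{task.upper()}_PROVIDER", "groq")
    if provider not in DEFAULTS:
        raise ValueError(f"unknown provider: {provider}")
    model = os.environ.get(f"{task.upper()}_MODEL", DEFAULTS[provider])
    # Real code would return, e.g.:
    #   ChatGroq(model=model)  or  ChatGoogleGenerativeAI(model=model)
    return {"provider": provider, "model": model}
```

Because every agent receives its model through this factory, switching the writer from Groq to Google is a one-line deployment change (`WRITER_PROVIDER=google`) with no code edits.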
vector store indexing and semantic search with chromadb
Implements a document indexing pipeline (create_index.py) that chunks product documentation, generates embeddings using the Google Embeddings API, and stores them in a ChromaDB vector store for later semantic retrieval. The indexing process: (1) reads product documentation files, (2) chunks documents into overlapping segments (configurable chunk size/overlap), (3) generates embeddings for each chunk using Google Embeddings, (4) stores chunks and embeddings in ChromaDB with metadata. During response generation, the RAG pipeline queries ChromaDB using semantic similarity search to retrieve the top-k relevant chunks, which are then passed to the writing agent. ChromaDB provides in-memory or persistent storage options.
Unique: Implements indexing as a separate, explicit pipeline (create_index.py) rather than embedding documents on-demand during retrieval, enabling pre-computation of embeddings and offline optimization. Uses Google Embeddings API for consistency with the response generation pipeline, ensuring embedding model alignment.
vs alternatives: More efficient than on-demand embedding because embeddings are pre-computed; more flexible than hardcoded knowledge bases because documentation can be updated by re-running the indexing pipeline without code changes.
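The chunking step of the indexing pipeline can be sketched as fixed-size segments with configurable overlap. The chunk/overlap sizes are illustrative, and the embedding and ChromaDB insertion are indicated only as comments since they require API access.

```python
# Sketch of the create_index.py chunking step: overlapping fixed-size
# character chunks, ready for embedding and insertion into ChromaDB.

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping segments for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# Indexing would then be roughly (requires API credentials, so shown as comments):
#   for i, chunk in enumerate(chunk_text(doc)):
#       vec = embeddings.embed_query(chunk)          # Google Embeddings
#       collection.add(ids=[f"doc-{i}"], documents=[chunk], embeddings=[vec])
```

The overlap means a sentence split across a chunk boundary still appears whole in at least one chunk, which keeps retrieval from missing boundary-straddling facts.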
+3 more capabilities