LLM Orchestration Service

The LLM Orchestration Service is the central intelligence of the CRM Product. It receives user messages, determines the intent, and routes them to the appropriate specialized agent.

Responsibilities

LLM Orchestration Service

The LLM Orchestration Service is the brain of the platform. It assembles prompts, manages context, and interfaces with OpenAI to generate intelligent responses.

Overview

It combines system prompts (tone/persona), conversation history, and RAG context to create a rich prompt for the LLM. It also handles token counting and cost tracking.

Key Features

Prompt Assembly: Combines Tone + History + RAG.
Streaming: Server-Sent Events (SSE) for real-time typing.
Cost Tracking: Calculates cost per request (Input/Output tokens).
Multi-Tone: Supports Professional, Friendly, Emphatic, etc.

Architecture

Component	Tech Stack	Description
Language	Python 3.11	Logic & Orchestration
Framework	FastAPI	Async API
Model	GPT-4o-mini	OpenAI API
Tokenizer	tiktoken	Accurate counting

API Reference

Generate Response

Standard request-response generation.

Endpoint: POST /api/v1/llm/generate
Body:

json

{
  "conversation_id": "uuid",
  "user_message": "What are your hours?",
  "knowledge_base_ids": ["uuid"],
  "stream": false
}

Response:

json

{
  "response": "We are open 9-5.",
  "tokens_used": { "total": 150 },
  "cost": { "total": 0.00005 }
}

Stream Response

Real-time generation via SSE.

Endpoint: POST /api/v1/llm/stream
Body: Same as Generate
Response: Server-Sent Events (data: word)

Prompt Pipeline

Fetch Context: Parallel calls to Tenant (Tone), Conversation (History), and Knowledge (RAG) services.
Assemble Prompt: Construct system message and user context.
Call LLM: Send to OpenAI.
Track Usage: Log tokens and cost.

LLM Orchestration Service ​

Responsibilities ​

LLM Orchestration Service ​

Overview ​

Key Features ​

Architecture ​

API Reference ​

Generate Response ​

Stream Response ​

Prompt Pipeline ​

LLM Orchestration Service

Responsibilities

LLM Orchestration Service

Overview

Key Features

Architecture

API Reference

Generate Response

Stream Response

Prompt Pipeline