Metrics Reference

Comprehensive reference for all CloudWatch metrics emitted by the NorthBuilt RAG System.

Overview

The system emits custom metrics to two CloudWatch namespaces:

Namespace | Purpose
RAG/Retrieval | Query performance, LLM usage, vector search
RAG/Ingestion | Document ingestion, sync jobs, webhooks

Metrics are emitted via Python classes in lambda/shared/utils/metrics.py and are consumed by the Dashboard Lambda for display in the React dashboard.
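As a rough illustration of what emitting one of these metrics involves, the sketch below builds a single entry in the shape that CloudWatch's `PutMetricData` call expects. The helper `build_metric_datum` is hypothetical and is not taken from lambda/shared/utils/metrics.py; how the real classes encode boolean dimension values is also an assumption here.

```python
from typing import Optional

# Units used by the metrics in this reference (all are valid CloudWatch units).
VALID_UNITS = {"Milliseconds", "Seconds", "Count", "Count/Second", "None"}


def build_metric_datum(name: str, value: float, unit: str = "Count",
                       dimensions: Optional[dict] = None) -> dict:
    """Build one MetricData entry (hypothetical helper, for illustration only)."""
    if unit not in VALID_UNITS:
        raise ValueError(f"unsupported unit: {unit}")
    datum = {"MetricName": name, "Value": float(value), "Unit": unit}
    if dimensions:
        # CloudWatch dimension values are strings; str(True) -> "True" is an assumption.
        datum["Dimensions"] = [
            {"Name": key, "Value": str(val)} for key, val in dimensions.items()
        ]
    return datum


datum = build_metric_datum(
    "RetrievalLatencyMs", 182.5, "Milliseconds",
    {"HasFilter": True, "RerankingEnabled": False},
)

# Publishing requires AWS credentials, so it is left commented out:
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(
#       Namespace="RAG/Retrieval", MetricData=[datum])
```

Keeping payload construction separate from the network call makes the shape easy to unit-test without AWS access.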


RAG/Retrieval Namespace

Tracks chat handler performance including vector search, LLM generation, and token usage.

RAGMetrics Class

Query and Retrieval Metrics

Metric | Unit | Dimensions | Description | Dashboard Card
QueryLatencyMs | Milliseconds | - | Total query processing time | Latency Trends
RetrievalLatencyMs | Milliseconds | HasFilter, RerankingEnabled | Time for Bedrock KB vector search | Latency Trends
LLMGenerationLatencyMs | Milliseconds | - | Time for Bedrock LLM response generation | Latency Trends
CandidatesRetrieved | Count | HasFilter | Raw results from vector search | -
ResultsAfterFilter | Count | HasFilter | Results after post-filtering | Avg Documents
FilterEffectiveness | None (0-1) | HasFilter | Ratio of filtered results | Filter Effectiveness
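The table does not spell out how FilterEffectiveness is derived. One plausible reading, given its 0-1 range and the two count metrics above it, is the fraction of retrieved candidates that survive post-filtering. The sketch below assumes that definition; the function name and the zero-candidate behavior are illustrative choices, not confirmed by the source.

```python
def filter_effectiveness(candidates_retrieved: int, results_after_filter: int) -> float:
    """Assumed definition: ResultsAfterFilter / CandidatesRetrieved, in [0, 1]."""
    if candidates_retrieved <= 0:
        # No candidates means there is nothing to compute a ratio over.
        return 0.0
    return results_after_filter / candidates_retrieved


ratio = filter_effectiveness(candidates_retrieved=20, results_after_filter=5)
```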

Token Usage Metrics

Metric | Unit | Dimensions | Description | Dashboard Card
LLMInputTokens | Count | - | Input tokens consumed | Token Usage Chart
LLMOutputTokens | Count | - | Output tokens generated | Token Usage Chart

Error Metrics

Metric | Unit | Dimensions | Description | Dashboard Card
Errors | Count | ErrorType | Error counts by type | Error Distribution

ErrorType Dimension Values:

  • RetrievalError - Vector search failures
  • LLMError - Bedrock LLM invocation failures
  • FilterError - Post-filtering failures
  • ValidationError - Input validation failures
  • UnknownError - Uncategorized errors
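The ErrorType values above imply some mapping from a caught exception to a dimension value, with UnknownError as the fallback. A minimal sketch of such a mapping follows; the exception classes are hypothetical stand-ins, and the choice to treat `ValueError` as a validation failure is an assumption.

```python
# Hypothetical exception classes, used only for this illustration.
class RetrievalFailure(Exception):
    pass


class LLMFailure(Exception):
    pass


class FilterFailure(Exception):
    pass


# Ordered (exception class, ErrorType dimension value) pairs.
ERROR_TYPES = [
    (RetrievalFailure, "RetrievalError"),
    (LLMFailure, "LLMError"),
    (FilterFailure, "FilterError"),
    (ValueError, "ValidationError"),  # assumption: validation raises ValueError
]


def error_type_for(exc: BaseException) -> str:
    """Return the ErrorType dimension value for an exception."""
    for exc_class, label in ERROR_TYPES:
        if isinstance(exc, exc_class):
            return label
    return "UnknownError"  # anything uncategorized
```

For example, `error_type_for(LLMFailure("timeout"))` yields `"LLMError"`, while an unexpected `KeyError` falls through to `"UnknownError"`.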

RAG/Ingestion Namespace

Tracks all document ingestion activities including webhooks, sync jobs, and classification.

IngestionMetrics Class

Webhook Metrics

Metric | Unit | Dimensions | Description | Dashboard Card
WebhooksReceived | Count | Source, Success | Webhook events received | Webhooks Received
WebhookProcessingLatencyMs | Milliseconds | Source | Time to process webhook event | -

Document Ingestion Metrics

Metric | Unit | Dimensions | Description | Dashboard Card
DocumentsIngested | Count | Source, Category, SourceType | Documents saved to S3 | Documents Ingested
IngestionErrors | Count | Source, ErrorType, SourceType | Ingestion failures | Ingestion Errors

Sync Job Metrics

Metric | Unit | Dimensions | Description | Dashboard Card
SyncJobsCompleted | Count | Source, Completed | Sync operations completed | Sync Jobs
SyncDurationSeconds | Seconds | Source | Total sync job duration | -
ItemsSynced | Count | Source | New items added | Sync Outcomes
ItemsSkipped | Count | Source | Already existing items | Sync Outcomes
ItemsFailed | Count | Source | Items that failed to sync | Sync Outcomes
SyncAPICallsTotal | Count | Source | External API calls made | API Calls
SyncS3SavesTotal | Count | Source | S3 save operations | S3 Saves
SyncProcessingRate | Count/Second | Source | Items processed per second | Processing Rate

Source Dimension Values: fathom, helpscout

Category Dimension Values: meeting-transcript, customer-conversation, issue

SourceType Dimension Values: webhook, polling
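SyncProcessingRate is described as items processed per second, which suggests it can be related to the item counts and SyncDurationSeconds. The sketch below assumes that "processed" covers synced, skipped, and failed items alike; the source does not confirm which counts are included, so treat the formula as one possible interpretation.

```python
def sync_processing_rate(items_synced: int, items_skipped: int,
                         items_failed: int, duration_seconds: float) -> float:
    """Assumed definition: all items touched by the job, divided by job duration."""
    if duration_seconds <= 0:
        # Avoid division by zero for jobs that report no elapsed time.
        return 0.0
    total_items = items_synced + items_skipped + items_failed
    return total_items / duration_seconds


rate = sync_processing_rate(items_synced=90, items_skipped=8,
                            items_failed=2, duration_seconds=50.0)
```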

ClassificationMetrics Class

Classification Metrics

Metric | Unit | Dimensions | Description | Dashboard Card
ClassificationsTotal | Count | Source, Success | Classification attempts | Classifications
ClassificationLatencyMs | Milliseconds | Source | DynamoDB lookup time | Avg Latency
ClassificationMatched | Count | Source | Whether match was found (1/0) | Match Rate
ClassificationErrors | Count | Source, ErrorType | Classification failures | Classification Errors

ErrorType Dimension Values: ConfigurationError, ValidationError, StrategyError
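Because ClassificationMatched is emitted as 1 or 0 per attempt, the mean of its samples is the fraction of attempts that matched, which is what CloudWatch's Average statistic would report for the metric. The sketch below shows that arithmetic on a plain list; the function name is hypothetical, and whether the dashboard computes its Match Rate card this way is an assumption.

```python
from typing import Optional, Sequence


def match_rate(matched_samples: Sequence[int]) -> Optional[float]:
    """Mean of 1/0 samples; None when there were no classification attempts."""
    if not matched_samples:
        return None
    return sum(matched_samples) / len(matched_samples)


rate = match_rate([1, 1, 0, 1])  # three matches out of four attempts
```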

OrchestratorMetrics Class

Sync Handler Metrics

Metric | Unit | Dimensions | Description | Dashboard Card
SyncHandlerInvocations | Count | Source, Success | Handler invocations | Handler Invocations
SyncHandlerErrors | Count | Source, ErrorType | Handler failures | Handler Errors

ErrorType Dimension Values: ConfigurationError, LambdaInvokeError, HandlerError

KBIngestionMetrics Class

Knowledge Base Ingestion Metrics

Metric | Unit | Dimensions | Description | Dashboard Card
IngestionJobStarted | Count | - | KB ingestion job started | Jobs Started
IngestionJobAlreadyRunning | Count | - | Job skipped (already running) | Jobs Skipped
IngestionJobErrors | Count | ErrorType | Ingestion job failures | Ingestion Errors

ErrorType Dimension Values: BedrockAPIError, ConfigurationError, HandlerError


Dashboard API Response Structure

The Dashboard Lambda at /lambda/node/dashboard/index.js queries these metrics and returns them in this structure:

interface MetricsResponse {
  timeRange: '24h' | '7d' | '30d';
  period: number;
  generatedAt: string;

  // Summary objects
  summary: MetricsSummary;
  tokenSummary: TokenSummary;
  ingestionSummary: IngestionSummary;
  classificationSummary: ClassificationSummary;
  syncHandlerSummary: SyncHandlerSummary;
  syncWorkerSummary: SyncWorkerSummary;

  // Time series
  timeSeries: {
    queryVolume: DataPoint[];
    retrievalLatency: DataPoint[];
    llmLatency: DataPoint[];
    errors: DataPoint[];
    documentsIngested: DataPoint[];
    webhooksReceived: DataPoint[];
    classifications: DataPoint[];
    inputTokens: DataPoint[];
    outputTokens: DataPoint[];
  };

  // Dimensional breakdowns
  errorsByType: ErrorByType[];
  ingestionBySource: IngestionBySourcePoint[];
  syncOutcomesBySource: SyncOutcomeBySource[];
  latencyPercentiles: LatencyPercentiles;

  logGroups: string[];
}
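To make the relationship between `timeRange`, `period`, and a time series concrete, the sketch below builds the request arguments for CloudWatch's `GetMetricData` call for QueryLatencyMs. The Dashboard Lambda itself is a Node.js function, so this Python version is purely illustrative; the mapping from time range to period, and the choice of the Average statistic, are assumptions rather than values taken from the source.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed mapping: time range -> (window length, period in seconds).
RANGES = {
    "24h": (timedelta(hours=24), 300),
    "7d": (timedelta(days=7), 3600),
    "30d": (timedelta(days=30), 86400),
}


def build_query(time_range: str, now: Optional[datetime] = None) -> dict:
    """Build keyword arguments for a GetMetricData request (illustrative only)."""
    window, period = RANGES[time_range]
    end = now or datetime.now(timezone.utc)
    return {
        "StartTime": end - window,
        "EndTime": end,
        "MetricDataQueries": [{
            "Id": "queryLatency",  # query IDs must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": "RAG/Retrieval",
                    "MetricName": "QueryLatencyMs",  # emitted without dimensions
                },
                "Period": period,
                "Stat": "Average",
            },
            "ReturnData": True,
        }],
    }


query = build_query("7d", now=datetime(2026, 1, 16, tzinfo=timezone.utc))
# With credentials: boto3.client("cloudwatch").get_metric_data(**query)
```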

Metric Thresholds Reference

Query Metrics

Metric | Warning | Critical | Lower is Better
Query Latency | >= 2000ms | >= 5000ms | Yes
Error Rate | >= 1% | >= 5% | Yes
Filter Effectiveness | <= 30% | <= 10% | No
Classification Match Rate | <= 70% | <= 50% | No
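The thresholds above can be read as a small decision rule: for "lower is better" metrics a value at or above a threshold trips it, and for the others a value at or below it does. The sketch below encodes the table that way. The dictionary keys, the `Threshold` class, and the `"ok"`/`"warning"`/`"critical"` labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warning: float
    critical: float
    lower_is_better: bool


# Values copied from the thresholds table; percentages are expressed as 0-100.
THRESHOLDS = {
    "QueryLatencyMs": Threshold(2000, 5000, True),
    "ErrorRatePct": Threshold(1, 5, True),
    "FilterEffectivenessPct": Threshold(30, 10, False),
    "ClassificationMatchRatePct": Threshold(70, 50, False),
}


def status_for(key: str, value: float) -> str:
    """Classify a value as 'ok', 'warning', or 'critical' for the given metric."""
    t = THRESHOLDS[key]
    if t.lower_is_better:
        if value >= t.critical:
            return "critical"
        if value >= t.warning:
            return "warning"
    else:
        if value <= t.critical:
            return "critical"
        if value <= t.warning:
            return "warning"
    return "ok"
```

Checking the critical bound first matters, since any value that trips the critical threshold also trips the warning one.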

Metric Emission Points

Lambda Function | Metrics Classes Used | Key Metrics
nb-rag-sys-chat | RAGMetrics | Query, LLM, Token metrics
nb-rag-sys | ClassificationMetrics | Classification metrics
nb-rag-sys-fathom-webhook | IngestionMetrics | Webhook metrics
nb-rag-sys-fathom-sync | OrchestratorMetrics, KBIngestionMetrics | Handler metrics
nb-rag-sys-fathom-sync-worker | IngestionMetrics | Sync metrics

Last updated: 2026-01-16