Monocle provides a dedicated AI Agents dashboard for monitoring your application’s AI activity. Track LLM calls, tool usage, token consumption, estimated costs, and full conversation flows.

What Monocle tracks

The dashboard has four tabs:
  • Overview: total calls, token usage, estimated costs, and error rates at a glance
  • Models: per-model breakdown with latency, token consumption, and cost
  • Tools: tool call stats with success/error rates and duration
  • Conversations: multi-turn conversation flows with drill-down into individual runs

Token tracking

Monocle captures detailed token usage per LLM call, including provider-specific breakdowns when available:
  • Input and output tokens
  • Cached input tokens (cache hits)
  • Cache write tokens
  • Reasoning tokens (for models like o1 and o3)
These metrics are used to estimate costs based on published model pricing.
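To illustrate how such an estimate can be derived from these counters, here is a minimal sketch. The per-million-token prices and the cached-input discount below are placeholders for illustration, not Monocle's actual pricing table:

```typescript
// Hypothetical per-million-token prices; a real dashboard would use
// published model pricing.
const PRICING: Record<string, { input: number; output: number; cachedInput: number }> = {
  "gpt-4o": { input: 2.5, output: 10, cachedInput: 1.25 },
};

interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  cachedInputTokens: number; // cache hits, billed at a discounted rate
}

function estimateCostUsd(model: string, usage: TokenUsage): number {
  const price = PRICING[model];
  if (!price) return 0; // unknown model: no estimate
  const uncachedInput = usage.inputTokens - usage.cachedInputTokens;
  return (
    (uncachedInput * price.input +
      usage.cachedInputTokens * price.cachedInput +
      usage.outputTokens * price.output) /
    1_000_000
  );
}
```

The key point is that cached input tokens are priced separately from uncached ones, which is why the dashboard tracks them as a distinct counter.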

Conversation tracking

You can group multiple LLM calls into a single conversation by setting a conversation ID. This enables the Conversations tab, where you can follow the full flow of an agentic loop: which models were called, which tools were invoked, and how many tokens were consumed at each step.
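The conversation ID travels on each span as the gen_ai.conversation.id attribute, so grouping boils down to bucketing spans by that attribute. A minimal sketch of that grouping (attribute names from the tables below; the span shape is simplified):

```typescript
// Simplified view of a span as a bag of attributes.
type SpanAttributes = Record<string, string | number | undefined>;

// Group spans into conversations by their gen_ai.conversation.id attribute.
// Spans without a conversation ID are left out of the Conversations view.
function groupByConversation(spans: SpanAttributes[]): Map<string, SpanAttributes[]> {
  const conversations = new Map<string, SpanAttributes[]>();
  for (const span of spans) {
    const id = span["gen_ai.conversation.id"];
    if (typeof id !== "string") continue;
    const bucket = conversations.get(id) ?? [];
    bucket.push(span);
    conversations.set(id, bucket);
  }
  return conversations;
}
```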

Trace viewer integration

When viewing a trace that contains AI spans, an AI tab appears in the span details sidebar. It shows token breakdowns, input/output previews, and model information for that specific call.

Span attributes

All AI instrumentations must emit these attributes for the AI dashboard to work correctly.

Standard GenAI semconv attributes

These follow the official OpenTelemetry GenAI Semantic Conventions:
| Attribute | Description |
| --- | --- |
| gen_ai.operation.name | Operation type (generate_text, stream_text, invoke_agent, execute_tool, etc.) |
| gen_ai.system | LLM provider name |
| gen_ai.request.model | Requested model ID |
| gen_ai.response.model | Model ID used in response |
| gen_ai.response.id | Provider response ID |
| gen_ai.response.finish_reasons | Finish reasons array |
| gen_ai.request.temperature | Temperature setting |
| gen_ai.request.max_tokens | Max tokens setting |
| gen_ai.request.top_p | Top-p setting |
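For illustration, the attribute payload of a single text-generation span under these conventions might look like the following. All values are hypothetical:

```typescript
// Hypothetical attributes for one generate_text span; every value here
// is an example, not output captured from a real provider.
const spanAttributes: Record<string, string | number | string[]> = {
  "gen_ai.operation.name": "generate_text",
  "gen_ai.system": "openai",
  "gen_ai.request.model": "gpt-4o-mini",
  "gen_ai.response.model": "gpt-4o-mini-2024-07-18",
  "gen_ai.response.id": "chatcmpl-abc123",
  "gen_ai.response.finish_reasons": ["stop"],
  "gen_ai.request.temperature": 0.7,
  "gen_ai.request.max_tokens": 1024,
  "gen_ai.request.top_p": 1,
};
```

Note that the requested model and the response model can differ: providers often resolve an alias like gpt-4o-mini to a dated snapshot.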

Token usage attributes

| Attribute | Description |
| --- | --- |
| gen_ai.usage.input_tokens | Input tokens consumed |
| gen_ai.usage.output_tokens | Output tokens generated |
| gen_ai.usage.total_tokens | Computed total (input + output) |
| gen_ai.usage.input_tokens.cached | Cached/cache-hit input tokens |
| gen_ai.usage.input_tokens.cache_write | Cache write tokens |
| gen_ai.usage.output_tokens.reasoning | Reasoning tokens |
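Since gen_ai.usage.total_tokens is described above as a computed value, the derivation is simply the sum of the two base counters. A sketch, treating missing counters as zero:

```typescript
// Derive gen_ai.usage.total_tokens from the base counters
// (total = input + output), defaulting absent counters to 0.
function totalTokens(attrs: Record<string, number | undefined>): number {
  const input = attrs["gen_ai.usage.input_tokens"] ?? 0;
  const output = attrs["gen_ai.usage.output_tokens"] ?? 0;
  return input + output;
}
```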

Tool call attributes

| Attribute | Description |
| --- | --- |
| gen_ai.tool.name | Tool name |
| gen_ai.tool.call.id | Tool call ID |
| gen_ai.tool.type | Always "function" |
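Given spans carrying these attributes, the per-tool success/error rates and durations shown in the Tools tab come down to an aggregation over execute_tool spans. A sketch with an illustrative, simplified span shape (the field names here are not Monocle's internal model):

```typescript
// Simplified execute_tool span: name from gen_ai.tool.name, plus the
// span's error status and duration.
interface ToolSpan {
  name: string;
  error: boolean;
  durationMs: number;
}

interface ToolStats {
  calls: number;
  errors: number;
  avgDurationMs: number;
}

function aggregateToolStats(spans: ToolSpan[]): Map<string, ToolStats> {
  const stats = new Map<string, ToolStats>();
  for (const span of spans) {
    const s = stats.get(span.name) ?? { calls: 0, errors: 0, avgDurationMs: 0 };
    // Running mean of duration across calls to this tool.
    s.avgDurationMs = (s.avgDurationMs * s.calls + span.durationMs) / (s.calls + 1);
    s.calls += 1;
    if (span.error) s.errors += 1;
    stats.set(span.name, s);
  }
  return stats;
}
```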

Monocle attributes

These attributes power specific dashboard features (several of them are the standard attributes listed above, repeated here to show what they drive):

| Attribute | Dashboard feature |
| --- | --- |
| gen_ai.conversation.id | Conversations tab; grouping multi-turn interactions |
| gen_ai.response.model | Models tab; per-model cost and usage breakdown |
| gen_ai.tool.name | Tools tab; per-tool call stats and error rates |
| gen_ai.usage.*_tokens | Token consumption charts and cost estimation |
| gen_ai.usage.input_tokens.cached | Cache hit rate display and cost savings calculation |
| gen_ai.function_id | Pipeline name display |
| ai.streaming | Distinguishing streaming from non-streaming calls |
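To make the cache metrics in this table concrete, a hit rate and a rough savings figure can be computed from the token counters. The 50% cached-token discount below is a placeholder assumption, not a real provider rate:

```typescript
// Fraction of input tokens served from cache.
function cacheHitRate(inputTokens: number, cachedTokens: number): number {
  return inputTokens === 0 ? 0 : cachedTokens / inputTokens;
}

// Rough savings estimate, assuming (hypothetically) that cached input
// tokens are billed at half the normal per-million-token input rate.
function cacheSavingsUsd(cachedTokens: number, inputPricePerMTok: number): number {
  return (cachedTokens * inputPricePerMTok * 0.5) / 1_000_000;
}
```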

Supported SDKs

Vercel AI SDK

Auto-instruments generateText, streamText, generateObject, streamObject, embed, embedMany, and rerank calls.
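In the Vercel AI SDK, telemetry is opt-in per call via the experimental_telemetry setting. A minimal sketch of that settings object; the functionId value is a hypothetical name, surfaced by the dashboard as gen_ai.function_id (pipeline name):

```typescript
// Per-call telemetry settings for the Vercel AI SDK. Telemetry is off by
// default; isEnabled opts the call in, and functionId (hypothetical value
// here) labels the pipeline the call belongs to.
const telemetrySettings = {
  isEnabled: true,
  functionId: "support-agent",
};

// Usage (requires the `ai` package and a model provider):
//   const { text } = await generateText({
//     model,
//     prompt,
//     experimental_telemetry: telemetrySettings,
//   });
```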

Getting started

See the instrumentation page for setup instructions, or the AdonisJS guide if you are using AdonisJS.