Analytics Dashboard
Track performance metrics across your traces. Monitor cost, latency, errors, and token usage with filters and trends.
You've deployed a new agent prompt. Did it get faster? Did it blow through your LLM budget? Did error rates climb? The Analytics Dashboard answers these questions in seconds by aggregating metrics across all your traces.
Quick Start
Open Analytics from the main sidebar. You'll immediately see:
- Trend chart - Your primary metric over time (cost, latency, error rate, etc.)
- Top metrics - Summary cards showing totals: $X spent, Y.Zs average latency, N% error rate
- Breakdown table - The same metric split by framework, agent, model, or tag
By default, the dashboard shows the last 7 days of data, which you can query interactively. Metrics refresh every 5 minutes (see Performance).
Metrics Reference
Cost: total spend across LLM calls, computed per call as (input tokens × input price) + (output tokens × output price) using each model's pricing.
Breakdown options:
- Input cost vs. output cost
- By model
- By LLM provider
Use cases: Budget tracking, identifying expensive models, cost per agent
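To make the cost arithmetic concrete, here is a minimal sketch that prices input and output tokens separately and sums the result per model. The price table, model names, and tuple layout are hypothetical placeholders, not real provider pricing or the dashboard's schema.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1M tokens: (input, output). Not real pricing.
PRICING = {
    "model-a": (2.00, 8.00),
    "model-b": (0.50, 1.50),
}

def call_cost(model, input_tokens, output_tokens):
    """Cost of one LLM call; input and output tokens are priced separately."""
    in_price, out_price = PRICING[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

def cost_by_model(calls):
    """Sum cost per model over (model, input_tokens, output_tokens) tuples."""
    totals = defaultdict(float)
    for model, input_tokens, output_tokens in calls:
        totals[model] += call_cost(model, input_tokens, output_tokens)
    return dict(totals)

calls = [
    ("model-a", 1_000_000, 500_000),
    ("model-b", 2_000_000, 1_000_000),
]
totals = cost_by_model(calls)
print(totals)
```

Keeping the two prices separate is what makes an "input cost vs. output cost" breakdown possible.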
Latency: time from trace start to completion.
Statistics:
- Average
- P50, P95, P99 percentiles
- Min / Max
Breakdown by: Span type (LLM, TOOL, RETRIEVER), agent, model
Use cases: Performance monitoring, identifying bottlenecks, SLA compliance
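The sketch below shows how these statistics relate to each other on a small sample. It uses the nearest-rank percentile definition, which is an assumption: the dashboard may compute percentiles with a different (for example, interpolating or approximate) method.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of data at or below it."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical trace latencies in seconds; two slow outliers.
latencies_s = [0.8, 1.1, 0.9, 4.2, 1.0, 1.3, 0.7, 1.2, 9.5, 1.0]

summary = {
    "avg": sum(latencies_s) / len(latencies_s),
    "p50": percentile(latencies_s, 50),
    "p95": percentile(latencies_s, 95),
    "p99": percentile(latencies_s, 99),
    "min": min(latencies_s),
    "max": max(latencies_s),
}
print(summary)
```

In this sample the average is more than twice the P50 because two slow traces pull it up, which is why the tail percentiles are usually the better signal for SLA checks.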
Error rate: percentage of traces with errors.
Metrics:
- Number of errors
- Error types breakdown
- Trending up / down
Breakdown by: Error type, span kind, agent, tag
Use cases: Reliability monitoring, error spike detection, pattern analysis
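As an illustration of how the rate and the error-type breakdown fit together, the following sketch computes both from a list of traces. The `error_type` field and its values are hypothetical; a trace with `None` is treated as successful.

```python
from collections import Counter

# Hypothetical trace records; error_type is None for successful traces.
traces = [
    {"id": "t1", "error_type": None},
    {"id": "t2", "error_type": "timeout"},
    {"id": "t3", "error_type": None},
    {"id": "t4", "error_type": "rate_limit"},
    {"id": "t5", "error_type": "timeout"},
]

def error_rate(traces):
    """Percentage of traces that recorded an error (0.0 for an empty list)."""
    if not traces:
        return 0.0
    failed = sum(1 for t in traces if t["error_type"] is not None)
    return 100.0 * failed / len(traces)

def errors_by_type(traces):
    """Count of failed traces per error type."""
    return Counter(t["error_type"] for t in traces if t["error_type"] is not None)

rate = error_rate(traces)
by_type = errors_by_type(traces)
print(rate, by_type.most_common())
```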
Token usage: input + output tokens across all LLM calls.
Breakdown options:
- Total tokens
- Input tokens
- Output tokens
- Average per trace
Use cases: Token budget forecasting, model efficiency comparison, cost per token
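A minimal sketch of the token breakdowns listed above, assuming each trace record already carries token counts summed over its LLM calls. The field names are hypothetical.

```python
def token_summary(traces):
    """Total, input, output, and average-per-trace token counts."""
    input_total = sum(t["input_tokens"] for t in traces)
    output_total = sum(t["output_tokens"] for t in traces)
    total = input_total + output_total
    return {
        "input": input_total,
        "output": output_total,
        "total": total,
        # Guard against division by zero when no traces match the filters.
        "avg_per_trace": total / len(traces) if traces else 0.0,
    }

# Hypothetical per-trace token counts.
traces = [
    {"input_tokens": 1200, "output_tokens": 300},
    {"input_tokens": 800, "output_tokens": 200},
    {"input_tokens": 2000, "output_tokens": 500},
]
tokens = token_summary(traces)
print(tokens)
```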
Trace volume: count of traces ingested.
Use for:
- Detecting ingestion drops
- Comparing load across periods
- Identifying seasonal patterns
Use cases: Load tracking, capacity planning, incident detection
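One simple way to reason about ingestion drops is to bucket trace timestamps by hour and flag hours that fall well below the typical count. The sketch below does this against the median; the 50% threshold and the function names are assumptions for illustration, not how the dashboard detects drops.

```python
from datetime import datetime, timedelta
from statistics import median

def hourly_counts(timestamps, start, hours):
    """Count traces per hour, including hours with zero traces."""
    buckets = {start + timedelta(hours=i): 0 for i in range(hours)}
    for ts in timestamps:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        if hour in buckets:
            buckets[hour] += 1
    return buckets

def find_drops(counts, threshold=0.5):
    """Hours whose count is below threshold x the median hourly count."""
    baseline = median(counts.values())
    return sorted(h for h, n in counts.items() if n < threshold * baseline)

# Hypothetical data: 10 traces per hour, except hour 02:00 with only 2.
start = datetime(2024, 1, 1, 0)
per_hour = [10, 10, 2, 10]
timestamps = [
    start + timedelta(hours=h, minutes=m)
    for h, n in enumerate(per_hour)
    for m in range(n)
]

counts = hourly_counts(timestamps, start, hours=len(per_hour))
drops = find_drops(counts)
print(drops)
```

Pre-filling every hour with zero matters: a complete ingestion outage produces no timestamps at all, so it would otherwise never show up as a bucket.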
Tool error rate: percentage of TOOL spans that returned errors.
Helps identify:
- Unreliable integrations
- API changes breaking your tools
- Network or rate-limit issues
Use cases: Tool reliability monitoring, integration health checks
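To show what this metric measures, the sketch below computes the error percentage per tool while ignoring non-TOOL spans. The span fields (`type`, `name`, `error`) are hypothetical stand-ins for whatever your trace schema provides.

```python
from collections import defaultdict

# Hypothetical span records. Only TOOL spans count toward this metric.
spans = [
    {"type": "TOOL", "name": "search", "error": False},
    {"type": "TOOL", "name": "search", "error": True},
    {"type": "LLM", "name": "chat", "error": True},  # ignored: not a TOOL span
    {"type": "TOOL", "name": "calculator", "error": False},
    {"type": "TOOL", "name": "search", "error": True},
]

def tool_error_rates(spans):
    """Percentage of failed TOOL spans, keyed by tool name."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for span in spans:
        if span["type"] != "TOOL":
            continue
        totals[span["name"]] += 1
        if span["error"]:
            failures[span["name"]] += 1
    return {name: 100.0 * failures[name] / totals[name] for name in totals}

rates = tool_error_rates(spans)
print(rates)
```

Grouping by tool name is what separates a single unreliable integration from a broad network or rate-limit problem that affects every tool.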
Filters
Apply these filters to analyze a subset of traces:
| Filter | Options |
|---|---|
| Project | Single project or compare multiple |
| Tag | Custom tags (e.g., environment:prod, model:gpt-4) |
| Framework | LangChain, CrewAI, LangGraph, etc. |
| Agent | Specific agent or workflow name |
| Import Source | Langfuse, LangSmith, Braintrust, Raindrop, or native |
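Conceptually, each filter narrows the set of traces that feed the metrics. The sketch below assumes filters combine with AND semantics and that tags are `key:value` strings; both are assumptions, and the record fields are hypothetical.

```python
def matches(trace, framework=None, agent=None, tags=()):
    """Return True if the trace passes every filter that was supplied."""
    if framework is not None and trace["framework"] != framework:
        return False
    if agent is not None and trace["agent"] != agent:
        return False
    # Every requested tag must be present on the trace.
    return set(tags) <= set(trace["tags"])

# Hypothetical trace records.
traces = [
    {"id": "t1", "framework": "LangChain", "agent": "support-bot",
     "tags": ["environment:prod", "model:gpt-4"]},
    {"id": "t2", "framework": "CrewAI", "agent": "researcher",
     "tags": ["environment:prod"]},
    {"id": "t3", "framework": "LangChain", "agent": "support-bot",
     "tags": ["environment:staging"]},
]

selected = [
    t["id"] for t in traces
    if matches(t, framework="LangChain", tags=["environment:prod"])
]
print(selected)
```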
Control what time period you're analyzing:
| Option | Details |
|---|---|
| Presets | Last 24h, 7d, 30d, 90d |
| Custom Range | Pick any start/end date |
| Granularity | Auto (hour, day, week) — affects trend chart bucketing |
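To illustrate how automatic granularity could affect bucketing, the sketch below picks hour, day, or week from the length of the selected range and then maps a timestamp to the start of its bucket. The cut-off values (2 days and 60 days) and the Monday week start are assumptions chosen for the example; the dashboard's actual thresholds are not documented here.

```python
from datetime import datetime, timedelta

def auto_granularity(start, end):
    """Pick a bucket size from the range length (thresholds are assumed)."""
    span = end - start
    if span <= timedelta(days=2):
        return "hour"
    if span <= timedelta(days=60):
        return "day"
    return "week"

def bucket_start(ts, granularity):
    """Start of the hour, day, or (Monday-based) week containing ts."""
    if granularity == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if granularity == "day":
        return day
    return day - timedelta(days=day.weekday())

now = datetime(2024, 1, 31)
print(auto_granularity(now - timedelta(days=1), now))
print(auto_granularity(now - timedelta(days=7), now))
print(auto_granularity(now - timedelta(days=90), now))
```

Coarser buckets smooth out short spikes, so an error burst visible at hourly granularity can disappear in a weekly view of the same data.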
Some filters only apply to specific metrics:
| Metric | Available Filters |
|---|---|
| Error Rate | Error type, span kind, detection type |
| Cost | LLM provider, model, region |
| Latency | Span type (LLM, TOOL, RETRIEVER) |
Dashboard Actions
Export the current view to CSV for reports and sharing.
How:
- Click Export (top right)
- Choose a metric
- The CSV file downloads with that metric broken down by dimension
Use cases: Reports, RCA documents, sharing with stakeholders
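Once exported, the file can be processed with any CSV tooling. The sketch below loads a breakdown and finds the most expensive entry; the column names (`dimension`, `cost_usd`) and the sample rows are hypothetical, so check the header row of your own export before reusing it.

```python
import csv
import io

# Hypothetical export contents; real column names may differ.
SAMPLE = """dimension,cost_usd
model-a,6.00
model-b,2.50
"""

def load_breakdown(text):
    """Parse CSV text into {dimension: value}."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["dimension"]: float(row["cost_usd"]) for row in reader}

breakdown = load_breakdown(SAMPLE)
top = max(breakdown, key=breakdown.get)
print(top, breakdown[top])
```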
Drill down: click any row in the breakdown table to see more detailed metrics for that row.
Hierarchy:
- Framework → specific agents
- Agent → individual traces
- Traces → span details
Use cases: Investigating metrics, finding root cause, identifying outliers
Create side-by-side comparisons between two time periods.
How:
- Select Compare mode
- Choose two date ranges
- See % change in each metric
Examples: "Yesterday vs. Today" or "This week vs. last week"
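The percentage change shown in Compare mode follows the usual relative-change formula. A minimal sketch, with hypothetical metric names and values, and with the zero-baseline case returned as `None` because the change is undefined there:

```python
def pct_change(previous, current):
    """Relative change from previous to current, in percent."""
    if previous == 0:
        return None  # undefined when the baseline is zero
    return 100.0 * (current - previous) / previous

# Hypothetical metric values for two periods.
last_week = {"cost_usd": 120.0, "avg_latency_s": 2.0, "error_rate_pct": 0.0}
this_week = {"cost_usd": 150.0, "avg_latency_s": 1.5, "error_rate_pct": 1.2}

changes = {name: pct_change(last_week[name], this_week[name]) for name in last_week}
print(changes)
```

For metrics that are already percentages, such as error rate, it helps to state whether a reported change is relative (percent) or absolute (percentage points), since the two can differ widely.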
Performance
Queries run against a columnar database optimized for analytics. Even 90-day ranges with filters return in under 1 second.
Metrics update every 5 minutes. Raw traces are retained for 90 days (configurable). Aggregated analytics are retained for 2 years.
Limitations
Only numeric metrics are supported. For categorical analysis like "top error messages", use Traces list filters and AI Search instead.