The Story of Olly
Autonomous Observability Agent

Powering Quality KPIs with AI

The observability paradox

0 dashboards

0 alerts/week

0 TB logs/day

More data. More dashboards to learn. More queries to master.
Same question: "What's broken and why?"

We looked around...

💬 AI query builders

📊 Prompt-to-dashboard

🔔 Smart alert tuning

Faster dashboards. Easier queries.
But still no answer to: "Why is it broken?"

We asked ourselves:

What can LLMs actually change
for observability?

The metrics that matter:

MTTR · MTTD · Change Failure Rate

Not faster dashboards — faster root cause

The change we need to see

Reliability reimagined with LLMs

🧠

Knowledge relevant to
your environment

No more generic answers. Context that understands your stack, your history, your patterns.

💬

Just use
natural language

Ask questions like you talk. No query languages. No dashboard hunting. Just answers.

∞

Break the
human scale problem

When systems grow beyond human comprehension, LLMs bridge the gap between complexity and clarity.

Three hard problems

Context

What services exist? What broke before?

embeddings · knowledge graphs · incident memory

Scale

Petabytes of logs → limited token window

tool use · RAG · summarization

Correctness

Did it actually find the root cause?

evals · task scoring · rubrics
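
To make the Scale problem concrete, here is a minimal sketch of one way summarization can squeeze many log lines into a small token window: collapse lines into templates, count them, and keep the most frequent rows that fit. This is an illustration only, not how Olly works. The function names, the sample log lines, and the word-count token estimate are all assumptions.

```python
import re
from collections import Counter


def log_template(line: str) -> str:
    """Mask variable parts (hex ids, numbers) so similar lines group together."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<hex>", line)
    return re.sub(r"\d+", "<n>", line)


def summarize_logs(lines, token_budget):
    """Return 'count x template' rows, most frequent first, within the budget.

    Tokens are approximated by whitespace-separated words (an assumption;
    a real tokenizer would count differently).
    """
    counts = Counter(log_template(line) for line in lines)
    summary, used = [], 0
    for template, count in counts.most_common():
        row = f"{count}x {template}"
        cost = len(row.split())
        if used + cost > token_budget:
            break  # stop at the first row that would exceed the budget
        summary.append(row)
        used += cost
    return summary


# Hypothetical sample data for illustration.
logs = [
    "timeout calling payments after 3000 ms",
    "timeout calling payments after 3012 ms",
    "user 42 logged in",
    "timeout calling payments after 2999 ms",
]
for row in summarize_logs(logs, token_budget=12):
    print(row)
```

Four raw lines become two rows, and the repeated timeout surfaces first. Tool use and RAG attack the same limit from the other side by fetching only the slices of data a question needs.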

Olly Architecture

Observability Data

📊 APM

🖥️ Infrastructure

🔔 Alerting

📈 Metrics

🔍 Traces & Logs

→

Knowledge

Context of your environment

→

Specialized Agents

🔬 Logs Expert

🕸️ Trace Explorer

📊 Metrics Analyzer

🔐 Security Researcher

🔗 Correlation Agent

💡 Hypothesis Generator

...and more

→

✨

Invaluable
Production
Insights
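
The flow on this slide (data, then knowledge, then specialized agents, then insights) can be sketched as a small fan-out: each agent receives the question plus environment knowledge and returns a finding, and the findings are ranked. This is a hypothetical outline under stated assumptions, not Olly's actual implementation. The `Finding` type, the agent signature, the confidence values, and the sample knowledge are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Finding:
    agent: str
    summary: str
    confidence: float  # 0.0 to 1.0, self-reported by the agent (assumption)


# Hypothetical agent signature: (question, environment knowledge) -> Finding
Agent = Callable[[str, dict], Finding]


def logs_expert(question: str, knowledge: dict) -> Finding:
    # Stand-in logic: a real agent would query log storage through tools.
    service = knowledge.get("suspect_service", "unknown")
    return Finding("logs_expert", f"error burst in {service} logs", 0.6)


def correlation_agent(question: str, knowledge: dict) -> Finding:
    # Stand-in logic: a real agent would compare error onset with change events.
    deploy = knowledge.get("last_deploy")
    if deploy is None:
        return Finding("correlation_agent", "no recent deployment found", 0.2)
    return Finding("correlation_agent", f"errors began after deploy {deploy}", 0.8)


def investigate(question: str, knowledge: dict, agents: list[Agent]) -> list[Finding]:
    """Run every agent and return findings, highest confidence first."""
    findings = [agent(question, knowledge) for agent in agents]
    return sorted(findings, key=lambda f: f.confidence, reverse=True)


# Hypothetical environment knowledge for illustration.
knowledge = {"suspect_service": "checkout", "last_deploy": "v2.3.1"}
question = "What caused the checkout spike last night?"
for finding in investigate(question, knowledge, [logs_expert, correlation_agent]):
    print(f"{finding.confidence:.1f} {finding.agent}: {finding.summary}")
```

Keeping each agent behind the same signature means new specialists can be added without touching the orchestration step.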

Measuring correctness: Evals

"What caused the checkout spike last night?"

+100 Retrieved relevant logs

+200 Identified correct service

+500 Correlated to deployment

+1000 Correct root cause

−50 Irrelevant metrics

−200 Wrong service blamed

Not "did it sound smart?" — "did it find the bug?"
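
The rubric above can be written as a small scoring function. This is a sketch only: the point values come from the slide, but the outcome keys, the function name, and the choice to count each outcome at most once per investigation are assumptions.

```python
# Point values taken from the slide; the key names are hypothetical.
RUBRIC = {
    "retrieved_relevant_logs": 100,
    "identified_correct_service": 200,
    "correlated_to_deployment": 500,
    "correct_root_cause": 1000,
    "irrelevant_metrics": -50,
    "wrong_service_blamed": -200,
}


def score_investigation(observed):
    """Sum rubric points for the outcomes observed in one investigation.

    Each outcome counts at most once (an assumption; the slide does not
    say whether penalties repeat per occurrence).
    """
    outcomes = set(observed)
    unknown = outcomes - set(RUBRIC)
    if unknown:
        raise ValueError(f"unknown outcomes: {sorted(unknown)}")
    return sum(RUBRIC[outcome] for outcome in outcomes)


# An investigation that found the right service but padded its answer.
partial = ["retrieved_relevant_logs", "identified_correct_service", "irrelevant_metrics"]
print(score_investigation(partial))
```

Weighting the root cause far above the intermediate steps rewards finding the bug rather than producing a plausible-sounding report.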

Demo

"We had checkout errors spike last night.
What changed and what's the likely cause?"

What we're seeing

60 min → 5 min

Daily health checks

"Not just faster — smarter. Complex investigations become clear answers."

MTTR ↓ MTTD ↓ Alert fatigue ↓ Fewer handoffs

Where this goes

Today

You ask, Olly investigates

Olly triages before you wake

Vision

Prevent before customers notice

Reactive firefighting → Proactive reliability

Questions?
