the product

condense, trace, debug

agentis takes the raw firehose of agent logs and turns it into something you can work with: a snapshot per run, the full execution path, and an llm that reads it all and tells you why the run failed. one wedge, done well.

agentis.ogbuilds.ai/agents/support-triage

support-triage

snapshot timeline

  1. 11:58:00 AM7 steps

    classified ticket #4821 as billing, drafted a refund reply, escalated to a human for approval.

  2. 11:54:00 AMerror

    tool call to the crm timed out after 3 retries — could not load the customer record.

  3. 11:49:00 AM4 steps

    answered a password-reset question from the knowledge base, no escalation needed.

  4. 11:40:00 AMerror

    looped 6 times re-asking the same clarifying question before giving up — possible prompt issue.

  5. 11:33:00 AM6 steps

    routed an enterprise complaint to the priority queue and notified the on-call lead.

errorjun 19, 11:54

agent classified ticket #4825 as a billing dispute and loaded the customer, but the billing.getInvoices tool timed out after 3 retries. it fell back to a holding reply and escalated to a human.

agentVersion
1.4.2
model
claude-haiku-4-5
environment
production
ticketId
#4825
user
customer 88213

execution path

  1. action12ms

    received ticket #4825 from webhook

  2. llm call840ms

    classify intent → "billing dispute"

  3. tool call1.9s

    crm.getCustomer(id=88213)

  4. tool callerror30.0s

    billing.getInvoices(customer=88213)

    ETIMEDOUT after 3 retries — billing service unreachable

  5. decision

    retry policy exhausted → branch to fallback

  6. llm call760ms

    draft holding reply to customer

  7. action40ms

    escalate to human queue (priority: high)

llm debug

high confidence

the run failed because billing.getInvoices timed out (ETIMEDOUT) after 3 retries — the billing service was unreachable, not a bug in the agent's reasoning. the agent handled it correctly by falling back and escalating.

root cause: billing.getInvoices → ETIMEDOUT (30s, 3 retries) against https://billing.internal/invoices

  1. 1check whether the billing service was down or slow at 11:54 — this is an upstream dependency, not the agent
  2. 2lower the per-call timeout so the agent fails fast instead of burning 30s on a dead endpoint
  3. 3add a circuit breaker so repeated billing failures skip straight to the human queue

generated by claude-opus-4-8

raw logs

11:54:00 AM INFO action received ticket #4825 from webhook {"source":"zendesk","priority":"normal"}
11:54:01 AM INFO llm_call classify intent {"model":"claude-haiku","result":"billing dispute","tokens":412}
11:54:01 AM INFO tool_call crm.getCustomer(id=88213) {"ok":true,"ms":1900}
11:54:33 AM ERROR tool_call billing.getInvoices failed: ETIMEDOUT {"retries":3,"ms":30000,"endpoint":"https://billing.internal/invoices"}
11:54:33 AM WARN decision retry policy exhausted, switching to fallback path
snapshots

a run, condensed to a snapshot

instead of a stream of log lines, each run shows up as one snapshot: a summary of what the agent did, its status, and the entry point into the detail. your agents list becomes a clean health view you can scan in seconds.

  • one-line summary of every run
  • healthy vs failing agents at a glance
  • snapshot history kept per agent
agentis.ogbuilds.ai/agents

agents

3 agents · 2,105 snapshots this week

add agent
  • support-triage37 errors
    active 4d ago
    1,284
    snapshots
  • research-deepdive4 errors
    active 4d ago
    612
    snapshots
  • code-migrator19 errors
    active 4d ago
    209
    snapshots
raw logs + context

the full detail, one click away

snapshots stay readable but never hide the truth. open one and the raw logs and the context the agent was working with are right there, so you can verify the summary and dig into the exact payloads when you need to.

  • raw logs preserved verbatim, casing intact
  • the context and inputs each step saw
  • drop from summary to detail without losing your place
agentis.ogbuilds.ai/agents/support-triage

support-triage

snapshot timeline

  1. 11:58:00 AM7 steps

    classified ticket #4821 as billing, drafted a refund reply, escalated to a human for approval.

  2. 11:54:00 AMerror

    tool call to the crm timed out after 3 retries — could not load the customer record.

  3. 11:49:00 AM4 steps

    answered a password-reset question from the knowledge base, no escalation needed.

  4. 11:40:00 AMerror

    looped 6 times re-asking the same clarifying question before giving up — possible prompt issue.

  5. 11:33:00 AM6 steps

    routed an enterprise complaint to the priority queue and notified the on-call lead.

errorjun 19, 11:54

agent classified ticket #4825 as a billing dispute and loaded the customer, but the billing.getInvoices tool timed out after 3 retries. it fell back to a holding reply and escalated to a human.

agentVersion
1.4.2
model
claude-haiku-4-5
environment
production
ticketId
#4825
user
customer 88213

execution path

  1. action12ms

    received ticket #4825 from webhook

  2. llm call840ms

    classify intent → "billing dispute"

  3. tool call1.9s

    crm.getCustomer(id=88213)

  4. tool callerror30.0s

    billing.getInvoices(customer=88213)

    ETIMEDOUT after 3 retries — billing service unreachable

  5. decision

    retry policy exhausted → branch to fallback

  6. llm call760ms

    draft holding reply to customer

  7. action40ms

    escalate to human queue (priority: high)

llm debug

high confidence

the run failed because billing.getInvoices timed out (ETIMEDOUT) after 3 retries — the billing service was unreachable, not a bug in the agent's reasoning. the agent handled it correctly by falling back and escalating.

root cause: billing.getInvoices → ETIMEDOUT (30s, 3 retries) against https://billing.internal/invoices

  1. 1check whether the billing service was down or slow at 11:54 — this is an upstream dependency, not the agent
  2. 2lower the per-call timeout so the agent fails fast instead of burning 30s on a dead endpoint
  3. 3add a circuit breaker so repeated billing failures skip straight to the human queue

generated by claude-opus-4-8

raw logs

11:54:00 AM INFO action received ticket #4825 from webhook {"source":"zendesk","priority":"normal"}
11:54:01 AM INFO llm_call classify intent {"model":"claude-haiku","result":"billing dispute","tokens":412}
11:54:01 AM INFO tool_call crm.getCustomer(id=88213) {"ok":true,"ms":1900}
11:54:33 AM ERROR tool_call billing.getInvoices failed: ETIMEDOUT {"retries":3,"ms":30000,"endpoint":"https://billing.internal/invoices"}
11:54:33 AM WARN decision retry policy exhausted, switching to fallback path
execution path

visualize how the run unfolded

agentis traces the path through your agent: every tool call, llm call, branch and retry, in order, with the failing step marked. it's the difference between knowing a run failed and knowing where and why.

  • step-by-step execution path in order
  • the failure point flagged
  • timings so you can spot slow steps
agentis.ogbuilds.ai/agents/support-triage

support-triage

snapshot timeline

  1. 11:58:00 AM7 steps

    classified ticket #4821 as billing, drafted a refund reply, escalated to a human for approval.

  2. 11:54:00 AMerror

    tool call to the crm timed out after 3 retries — could not load the customer record.

  3. 11:49:00 AM4 steps

    answered a password-reset question from the knowledge base, no escalation needed.

  4. 11:40:00 AMerror

    looped 6 times re-asking the same clarifying question before giving up — possible prompt issue.

  5. 11:33:00 AM6 steps

    routed an enterprise complaint to the priority queue and notified the on-call lead.

errorjun 19, 11:54

agent classified ticket #4825 as a billing dispute and loaded the customer, but the billing.getInvoices tool timed out after 3 retries. it fell back to a holding reply and escalated to a human.

agentVersion
1.4.2
model
claude-haiku-4-5
environment
production
ticketId
#4825
user
customer 88213

execution path

  1. action12ms

    received ticket #4825 from webhook

  2. llm call840ms

    classify intent → "billing dispute"

  3. tool call1.9s

    crm.getCustomer(id=88213)

  4. tool callerror30.0s

    billing.getInvoices(customer=88213)

    ETIMEDOUT after 3 retries — billing service unreachable

  5. decision

    retry policy exhausted → branch to fallback

  6. llm call760ms

    draft holding reply to customer

  7. action40ms

    escalate to human queue (priority: high)

llm debug

high confidence

the run failed because billing.getInvoices timed out (ETIMEDOUT) after 3 retries — the billing service was unreachable, not a bug in the agent's reasoning. the agent handled it correctly by falling back and escalating.

root cause: billing.getInvoices → ETIMEDOUT (30s, 3 retries) against https://billing.internal/invoices

  1. 1check whether the billing service was down or slow at 11:54 — this is an upstream dependency, not the agent
  2. 2lower the per-call timeout so the agent fails fast instead of burning 30s on a dead endpoint
  3. 3add a circuit breaker so repeated billing failures skip straight to the human queue

generated by claude-opus-4-8

raw logs

11:54:00 AM INFO action received ticket #4825 from webhook {"source":"zendesk","priority":"normal"}
11:54:01 AM INFO llm_call classify intent {"model":"claude-haiku","result":"billing dispute","tokens":412}
11:54:01 AM INFO tool_call crm.getCustomer(id=88213) {"ok":true,"ms":1900}
11:54:33 AM ERROR tool_call billing.getInvoices failed: ETIMEDOUT {"retries":3,"ms":30000,"endpoint":"https://billing.internal/invoices"}
11:54:33 AM WARN decision retry policy exhausted, switching to fallback path
more in the box

built for how agents actually run

the rest of agentis is about getting your logs in and getting answers out fast.

llm debug

ask the built-in llm to read a failed snapshot and explain the likely cause, grounded in your real path and logs — not a generic guess.

one-url ingestion

post your agent's logs to a single endpoint over plain http. no sdk lock-in, no per-framework adapters to maintain.

agent registration

register each agent and get an api key. snapshots are grouped per agent so every service has its own clean history.

path tracing

the execution path is reconstructed for you from the logs you send — tool calls, llm calls, branches and retries, in order.

your logs stay yours

we use your logs to build your snapshots and to run the debug you ask for. never to train models.

health at a glance

the agents view is a status board: which agents are healthy, which are erroring, and how much each is running.

how it works

from logs to answers in three steps

1

register an agent

create an agent in agentis and grab its api key. takes a minute, no card needed.

2

point your logs at one endpoint

send your agent's logs to a single ingest url over http. works with any framework or a hand-rolled loop.

3

read, trace, debug

open the snapshot, follow the execution path, and ask the llm to debug when a run goes wrong.

who it's for

built for the people shipping agents

ai developers

you're building agents and tired of reconstructing failed runs from log dumps. agentis gives you the path and the answer.

mlops engineers

you keep agents healthy in production. agentis is the observability layer that tells you which agent broke and where.

agentic-startup founders

your product is an agent. when it misbehaves for a customer you need to see exactly what it did — fast — and fix it.

use cases

what you reach for agentis to do

debug a production failure

a run failed for a real user. open the snapshot, find the failing step in the path, and ask the llm why — without grepping logs.

understand a flaky agent

an agent works most of the time and fails sometimes. compare snapshots across runs to see what changes when it breaks.

find slow steps

the traced path carries timings, so the step that's quietly eating your latency is easy to spot.

onboard a teammate

instead of explaining your agent's tangled control flow, hand them a snapshot — the path tells the story.

watch your agents work

register an agent, send logs to one endpoint, and read your first traced snapshot in minutes. free while we're in mvp.