Show HN: WatchLLM – Step-through debugging for AI agents with cost attribution

Hacker News

WatchLLM is a new tool launched on Hacker News that helps developers debug AI agents by providing a step-by-step timeline of decisions, tool calls, and responses, with cost attribution for each step. It also includes anomaly detection and semantic caching to reduce LLM spend.


  1. Debugging agents is painful - When your agent makes 20 tool calls and fails, good luck figuring out which decision was wrong. WatchLLM gives you a step-by-step timeline showing every decision, tool call, and model response with explanations for why the agent did what it did.

  2. Agent costs spiral fast - Agents love getting stuck in loops or calling expensive tools repeatedly. WatchLLM tracks cost per step and flags anomalies like "loop detected - same action repeated 3x, wasted $0.012" or "high cost step - $0.08 exceeds threshold".
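The loop and high-cost flags described above can be sketched in a few lines. This is an illustrative reimplementation, not WatchLLM's actual code: the field names, the 3x repeat rule, and the cost threshold are all assumptions.

```python
# Sketch of step-level anomaly detection: flag repeated (tool, args) pairs
# and any single step whose cost exceeds a threshold. All names and the
# threshold value are illustrative, not WatchLLM's real API.
from collections import Counter

COST_THRESHOLD = 0.05  # assumed: flag any single step above $0.05

def detect_anomalies(steps):
    """steps: list of dicts with 'tool', 'args', and 'cost_usd' keys."""
    anomalies = []
    seen = Counter((s["tool"], s["args"]) for s in steps)
    for (tool, args), n in seen.items():
        if n >= 3:
            # Everything after the first identical call counts as waste.
            total = sum(s["cost_usd"] for s in steps
                        if (s["tool"], s["args"]) == (tool, args))
            wasted = total * (n - 1) / n
            anomalies.append(
                f"loop detected - {tool} repeated {n}x, wasted ${wasted:.3f}")
    for s in steps:
        if s["cost_usd"] > COST_THRESHOLD:
            anomalies.append(
                f"high cost step - ${s['cost_usd']:.2f} exceeds threshold")
    return anomalies
```

Running this over a trace where an agent calls the same search tool three times and then makes one expensive model call would produce both flag types.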

The core features:

Timeline view of every agent decision with cost breakdown
Anomaly detection (loops, repeated tools, high-cost steps)
Semantic caching that cuts 40-70% off your LLM bill as a bonus
Works with OpenAI, Anthropic, Groq - just change your baseURL
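The "just change your baseURL" integration presumably means pointing an OpenAI-compatible SDK at a WatchLLM proxy. This is a hedged configuration sketch: the proxy hostname and path are assumptions, not a documented endpoint, so check https://watchllm.dev for the real value.

```python
# Illustrative config only: the proxy URL below is an assumption.
from openai import OpenAI

client = OpenAI(
    base_url="https://proxy.watchllm.dev/v1",  # assumed WatchLLM proxy endpoint
    api_key="sk-...",                          # your existing provider key
)
# Calls go through unchanged; the proxy records the telemetry.
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
)
```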

It's built on ClickHouse for real-time telemetry and uses vector similarity for the caching layer. The agent debugger explains decisions using LLM-generated summaries of why each step happened.
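A semantic cache backed by vector similarity can be sketched as follows. This is a minimal stand-in, assuming cosine similarity over prompt embeddings and an in-memory store; a production system would use an embedding model and a real vector store rather than the hand-rolled pieces here, and the similarity cutoff is an assumed value.

```python
# Minimal semantic-cache sketch: reuse a cached response when a new prompt's
# embedding is close enough (cosine similarity) to a cached one.
import math

SIM_THRESHOLD = 0.95  # assumed cutoff for "semantically the same question"

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class SemanticCache:
    def __init__(self):
        self.entries = []  # list of (embedding, response) pairs

    def get(self, embedding):
        best = max(self.entries,
                   key=lambda e: cosine(e[0], embedding), default=None)
        if best and cosine(best[0], embedding) >= SIM_THRESHOLD:
            return best[1]  # cache hit: the LLM call is skipped entirely
        return None         # cache miss: call the model, then put()

    def put(self, embedding, response):
        self.entries.append((embedding, response))
```

Every hit avoids a paid model call, which is where the claimed 40-70% bill reduction would come from on workloads with many near-duplicate prompts.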
Right now it's free for up to 50K requests/month. I'm looking for early users who are building agents and want better observability into what's actually happening (and what it's costing).
Try it: https://watchllm.dev
Would love feedback on what other debugging features would be useful. What do you wish you had when your agents misbehave?



Related articles

  1. Show HN: AgentWatch – a terminal dashboard for monitoring AI agent costs

    3 months ago

  2. Show HN: ReachLLM, for tracking, analyzing, and improving AI search visibility

    8 months ago

  3. Show HN: Hawkeye - a log-monitoring CLI that explains issues with AI

    4 months ago

  4. Show HN: Chaos engineering for AI agents

    4 months ago

  5. AI agents are essentially CI pipelines with an LLM built in

    4 months ago