Ask HN: How do you do integration testing for AI / LLMs?

Hacker News

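A Hacker News user is asking for advice on effective integration testing strategies for AI and large language models (LLMs), and acknowledges the challenges posed by their inherent nondeterminism.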


In hindsight, a simple integration test that sends a real (non-mocked) request to the LLM provider would probably have caught this.

One idea is to have a test suite which sends each prompt to the LLM provider and checks whether the response matches the expected schema. This has its own issues since LLMs are inherently nondeterministic and these tests might be flaky, but I’m currently lacking better ideas.
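A minimal Python sketch of that idea, for illustration only. The names `matches_schema` and `schema_pass_rate` are hypothetical, and `call_llm` stands in for whatever function wraps the real provider request; none of them come from the original post. Asserting on a pass rate over several attempts, rather than on a single response, is one assumed way to soften the flakiness concern, not something the poster proposed.

```python
import json
from typing import Callable, Dict


def matches_schema(raw: str, required: Dict[str, type]) -> bool:
    """True if raw parses to a JSON object holding every required key with the expected type."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return False
    if not isinstance(data, dict):
        return False
    # A missing key yields None, which fails the isinstance check.
    return all(isinstance(data.get(key), typ) for key, typ in required.items())


def schema_pass_rate(
    call_llm: Callable[[str], str],  # hypothetical wrapper around the real provider call
    prompt: str,
    required: Dict[str, type],
    attempts: int = 5,
) -> float:
    """Send the same prompt several times and return the fraction of schema-valid responses."""
    passed = sum(matches_schema(call_llm(prompt), required) for _ in range(attempts))
    return passed / attempts
```

In a test suite this might be used as `assert schema_pass_rate(call_llm, prompt, {"title": str, "score": int}) >= 0.8`, where the field names and the 0.8 threshold are placeholders to tune per prompt. A lower threshold makes the test less flaky but also less likely to catch a partial regression.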

Curious to hear how others approach this.
