Ask HN: How do you do integration testing for AI / LLMs?
Hacker News
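A Hacker News user is asking for advice on effective integration testing strategies for AI and large language models (LLMs), acknowledging the challenges posed by their inherent nondeterminism.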
In hindsight, a simple integration test that sends a real (non-mocked) request to the LLM provider would probably have caught this.
One idea is to have a test suite that sends each prompt to the LLM provider and checks whether the response matches the expected schema. This has its own issues: LLMs are inherently nondeterministic, so these tests might be flaky, but I’m currently lacking better ideas.
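To make the idea concrete, here is a rough sketch of what such a schema check could look like. Everything named here is an assumption for illustration: `EXPECTED_FIELDS`, `matches_schema`, and `passes_within` are hypothetical, and `call_llm` stands in for whatever function wraps the real provider client. Retrying a few times is just one possible way to soften the flakiness, not a proven fix.

```python
import json
from typing import Callable

# Hypothetical expected shape: field name -> required Python type.
EXPECTED_FIELDS = {"title": str, "tags": list}


def matches_schema(raw: str, fields: dict = EXPECTED_FIELDS) -> bool:
    """Return True if raw is a JSON object with every expected field and type."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    # A missing field yields None, which fails the isinstance check.
    return all(isinstance(data.get(name), kind) for name, kind in fields.items())


def passes_within(call_llm: Callable[[str], str], prompt: str, attempts: int = 3) -> bool:
    """Call the provider up to `attempts` times; pass if any response matches."""
    for _ in range(attempts):
        if matches_schema(call_llm(prompt)):
            return True
    return False
```

In a real suite, `call_llm` would send a non-mocked request to the provider, and each prompt would get one test asserting `passes_within(call_llm, prompt)`. Allowing a few attempts trades some strictness for stability: a prompt that only occasionally produces valid output would still pass, so the attempt count probably needs to stay small.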
Curious to hear how others approach this.
