AI Engineering Now

#8: Who Validates the Validators? - A Mechanism for Updating Continuous Evaluation -

04 Nov 2024

Description

We discussed the paper "Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences", which describes EvalGen, a mechanism for continuously updating the evaluation criteria and automated evaluations of LLM applications.

The podcast transcription service "LISTEN" is here.

Shownotes:
https://arxiv.org/abs/2404.12272
https://www.sh-reya.com/blog/ai-engineering-flywheel/
https://www.chainforge.ai/
https://github.com/wandb/evalForge/tree/main
https://blog.langchain.dev/aligning-llm-as-a-judge-with-human-preferences/

Speakers:
seya (@sekikazu01)
kagaya (@ry0_kaga)
