Andy Halliday
And I think that's kind of the mindset that we're trying to ensure sticks.
I think we're still trying to get people away from the "oh, it doesn't work, there's an expert to help you" mindset.
No, this thing can talk back to you and, yes, literally help you understand.
But it's a hard kind of shift to think about it that way.
So a couple of comments.
One is that transcription of audio is important, not only for using it against media that's generated and doesn't have captioning, etc.
But right now, you know, getting a speech-to-text transcription out of something is pretty well established.
You get good output, and ElevenLabs has always been at the forefront in terms of accuracy and, let's call it, fluency.
The question is whether, in the long run, we even think about that anymore, except in the narrow context of where you want to place captioning or text in association with audio.
Because, you know, one of the uses we have for getting the transcript out is to allow large language models to process the transcript text, as opposed to the audio.
But multimodal LLMs are going to be able to work in straight audio, you know, with better and better facility.
And so eventually you may never need that intermediate step of getting a transcription.
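The intermediate step described above can be sketched as a simple two-stage pipeline. This is a minimal illustration, not a real API: `transcribe` and `ask_llm` are hypothetical stubs standing in for an actual speech-to-text service (such as ElevenLabs) and a language-model call.

```python
def transcribe(audio_bytes: bytes) -> str:
    """Hypothetical speech-to-text stage (stubbed for illustration)."""
    return "Speaker: transcription quality keeps improving."

def ask_llm(prompt: str) -> str:
    """Hypothetical language-model call (stubbed for illustration)."""
    return f"Summary of: {prompt}"

def summarize_audio(audio_bytes: bytes) -> str:
    # Intermediate step: get text out of the audio first...
    transcript = transcribe(audio_bytes)
    # ...then hand the text, not the audio itself, to the language model.
    return ask_llm(transcript)

print(summarize_audio(b"\x00\x01"))
```

A multimodal model that accepts audio directly would collapse `summarize_audio` into a single call, which is the shift the speaker is anticipating.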