Eve Bodnia
So then we're like, okay, let's actually try to scale it as much as we can.
And we performed a bunch of experiments, and we also have a pretty decent theoretical understanding of how it works, so we don't see any obstacles.
But, you know, engineering can be tricky, so sometimes things work and sometimes you need to debug.
So the biggest part for me personally was to...
Ah, how to say it?
So the EBM is not naturally autoregressive, because there are no tokens in it.
It's non-autoregressive, meaning it's considering all possible scenarios at the same time.
But when you try to attach it to transformers, transformers are very autoregressive.
So you have to take this wild thing and attach it to something which thinks very sort of linearly.
Yeah.
So you're facing a huge information loss in the middle.
And then the same thing happens when you try to prompt the EBM using an LLM.
There is also a giant reduction of information at that layer.
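To make the bottleneck concrete, here is a toy sketch, not the speakers' actual system: all names and scoring rules below are invented for illustration. An autoregressive proposer commits to tokens one at a time with only a local view, while an EBM assigns an energy to whole sequences at once. If the EBM can only rerank the proposer's candidates, the globally best sequence may never reach it, which is the information loss described above.

```python
# Toy illustration (hypothetical, not the actual architecture discussed):
# an autoregressive model proposes sequences token by token; an EBM scores
# whole sequences; coupling them by reranking loses information because
# the EBM only sees the few candidates the proposer emitted.
import itertools

VOCAB = ["a", "b"]

def ar_propose(length, beam=2):
    """Autoregressive proposer: extends prefixes token by token, keeping
    only `beam` prefixes per step (a local, linear view of the problem)."""
    prefixes = [[]]
    for _ in range(length):
        extended = [p + [t] for p in prefixes for t in VOCAB]
        # Local preference (arbitrary toy rule): avoid repeated tokens.
        extended.sort(key=lambda s: sum(s[i] == s[i - 1] for i in range(1, len(s))))
        prefixes = extended[:beam]
    return prefixes

def ebm_energy(seq):
    """Global score over the whole sequence: lowest energy for all-'b'
    sequences -- a preference the local proposer cannot see."""
    return sum(tok != "b" for tok in seq)

def rerank(length, beam=2):
    """Couple the two models: the EBM picks the lowest-energy candidate
    among the proposer's beam. The global optimum may not be in the beam."""
    return min(ar_propose(length, beam), key=ebm_energy)

# The EBM's true optimum over all length-3 sequences is ['b', 'b', 'b'],
# but it never survives the autoregressive bottleneck:
best_global = min((list(s) for s in itertools.product(VOCAB, repeat=3)),
                  key=ebm_energy)
print(best_global)  # ['b', 'b', 'b'], energy 0
print(rerank(3))    # ['b', 'a', 'b'], energy 1 -- information was lost
```

The point of the toy is only the mismatch of views: the sequence with globally minimal energy is filtered out before the EBM ever sees it, so the interface layer, not either model alone, determines what can be expressed.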
So we were trying to orchestrate this layer on its own, which took some time, and to see how it scales.
So I think that was the biggest difficulty we faced.
But now the architecture is there.
It's scalable.
It's already progressing the way we expected, and even a little bit beyond.