Mohammad Norouzi
Podcast Appearances
We test our models based on this prompt.
So if you have "meaning of life" as your prompt, then do you want an image diffusion model to decide what the meaning of life is?
Or do you want a language model to kind of think and go back and forth and come up with a description of a scene that explains the meaning of life?
And that's kind of the context of JSON prompting: it's the intermediate representation. We think language models can describe images in that format, and then image generation can happen.
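[Editor's illustration] A minimal sketch of the flow described here, assuming a two-step pipeline: a language model expands a short prompt into a structured scene description, and that description is serialized to JSON as the input an image model would receive. The function names (`expand_prompt`, `to_model_input`) and every field in the dictionary are hypothetical, chosen only for illustration. They are not the speaker's actual schema or API, and the language-model step is replaced by a hard-coded stub.

```python
import json


def expand_prompt(short_prompt: str) -> dict:
    """Hypothetical stand-in for the language-model step.

    A real system would call a language model here; this stub returns a
    fixed scene description so the example runs on its own.
    """
    return {
        "source_prompt": short_prompt,
        "scene": "an elderly gardener planting a sapling at sunrise",
        "objects": [
            {"name": "gardener", "position": "center"},
            {"name": "sapling", "position": "foreground"},
        ],
        "style": {"medium": "photograph", "lighting": "warm, low sun"},
    }


def to_model_input(structured: dict) -> str:
    """Serialize the structured description to the JSON text that would be
    shown to the user and passed to the image model."""
    return json.dumps(structured, indent=2, sort_keys=True)


structured_prompt = expand_prompt("meaning of life")
model_input = to_model_input(structured_prompt)
print(model_input)
```

Because the intermediate JSON is plain data, a user could inspect or edit a field (for example the lighting) before generation instead of re-rolling the whole prompt.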
We see a lot of editing happening in the field, and that's the new frontier.
So I don't think we should expect the interaction to be only through text or JSON, but it's a combination of JSON and image, if I were to make a guess.
And then I think everybody else does it too.
OpenAI does it.
Google does it.
But then they don't give you the actual input to the model.
But again, for professional use cases, you don't want to just roll the dice and then get some other completely different image interpretation of your prompt.
We show you the actual input to the model as well in the JSON format.
And we think that that will foster more innovation and creativity.
Maybe zooming out a bit, the world is changing, right?
I'm sure your work has changed a lot.
My work has changed a lot.
I'm actually writing PRs now, which is very exciting.