Mohammad Norouzi
Podcast Appearances
I don't know if you've seen, but the open source community is a little upset because of the safety image that shows up.
Reddit was really lashing out at our engineers.
And one of our people said, "Oh, we might fix this."
And they were like, "We might?"
But the fact is the community needs to also read the documentation and bear with us.
This model is only trained with JSON prompting.
And you have to provide JSON with that particular structure for you to get good quality output.
So I don't know if this is a feature or bug.
We did have some safety built into the model, but that is also detecting incorrect prompts.
So if you just give it a one-word prompt, then you get this "image is blocked by safety" image back.
But that's because your prompt is not a well-specified JSON.
Now, we don't want people to write in JSON.
We don't think that's a natural way of interacting with these models.
But I do strongly believe that we need to use all the AI innovation to build the best image generation and editing models.
And there had been a lot of progress in language models in the text space.
So the question is, if you want to go from some vague idea to an image, what's the exact process?
How much of the thinking happens in the language space and how much of the thinking happens in the actual kind of pixel generation space?
I know you're an artist, so you should probably tell us
When, for example, we always had this prompt, "meaning of life."