Nathaniel Whittemore

As we discussed in that recent show, Anthropic doesn't have a native image generator, so the way that they're creating those designs is a little bit different.

And it seems pretty likely to me that there are going to be certain types of UI implementations that simply will not be possible with Claude Design and Claude Code, but that will be with the integration of GPT Image 2.

What's more, people are really excited for when we get the next base model with this as well.

Simon Smith writes, "And of course, even if OpenAI weren't bringing the pieces together, there are plenty of entrepreneurial people out there who will do that for them."

Something Big is Happening author Matt Shumer dumped Image 2 into the general agent that he built, leading to it generating slide decks and apps that, in his words, "look like they were designed by pros."

Leon Lin has already posted a new skill to GitHub that takes advantage of GPT Image 2 to make the integration between Image 2 and Codex even smoother.
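For a concrete sense of what that kind of integration could look like, here is a minimal sketch in Python against the standard OpenAI Images API. The gpt-image-2 model name is taken from the episode rather than from OpenAI's published documentation, and the helper itself is illustrative, not Leon Lin's actual skill.

import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_asset(prompt: str, path: str) -> str:
    """Generate an image and save it where a coding agent can reference it."""
    result = client.images.generate(
        model="gpt-image-2",  # assumed from the episode; "gpt-image-1" is the documented model
        prompt=prompt,
    )
    # The GPT Image models return base64-encoded image data.
    with open(path, "wb") as f:
        f.write(base64.b64decode(result.data[0].b64_json))
    return path

Presumably a skill wraps a helper along these lines with instructions telling the agent when to call it, which is what makes the Codex integration feel smooth.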

Now, I will say that while the vast majority of people's experiences so far have been positive, that wasn't universally the case.

Boyantungus writes, "I tried making an infographic using GPT Image 2, lots and lots of visually unacceptable artifacts."

Someone did suggest that his settings might have been set to low, but obviously that's still going to be an issue in terms of the actual utility of the thing.
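For context, the knob being referred to here is most likely the quality parameter on OpenAI's Images API. A minimal comparison sketch, again assuming the gpt-image-2 model name from the episode:

import base64
from openai import OpenAI

client = OpenAI()

# Render the same prompt at both ends of the quality range to compare artifacts.
for quality in ("low", "high"):  # documented values: "low", "medium", "high", "auto"
    result = client.images.generate(
        model="gpt-image-2",  # assumed name, as noted above
        prompt="A clean, labeled infographic of the water cycle",
        quality=quality,
    )
    with open(f"infographic_{quality}.png", "wb") as f:
        f.write(base64.b64decode(result.data[0].b64_json))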

Speaking of which, journalist Sharon Goldman tested it by asking the model to create an anatomically correct labeled image of the human thorax, to be reviewed by her sister, who is a professor of anatomy at a med school.

It looked great, but her sister pointed out that there was an extra set of veins, labels pointing to the wrong parts, and some issues with where things were placed.

And while obviously this is still a major improvement over what we had before, there are use cases like this one where the tolerance for mistakes is not 5%, but 0%.

One of the things that I think will be really interesting to see is how many of the new use cases that get unlocked by this new model actually get deployed in practice.

For example, one of the things that it can do now is much better, richer editorial layouts.

And yet, is there a group of people who actually need to create editorial layouts who will be willing to trade the controls that they lose in terms of their existing processes for the speed or quality that this new approach represents?

I don't think the answer is going to be clear-cut there.

Another example is precision marketing assets.

We can already see that Image 2 does an awesome job with things like visual Instagram ads. But will the adopters be the people who create Instagram ads today, using their own dialed-in workflows with even more fine-grained controls, or will the unlock be more about democratizing the ability to create that type of image or asset for entirely other types of people?

I think overall, we're still figuring out what it really means and where the value lies in having reasoning over images.

I think we're still figuring out where the line of controllability needs to be to make these skills useful, not just novel.