
Nathaniel Whittemore

Speaker
17502 total appearances

Podcast Appearances

that you really could get not just pretty images, but very realistic-looking images, including things like not-so-great regular photography.

Pietro Serrano shared the output of a photo of a computer screen displaying a Spotify playlist at night, adding, "...it's an insane model and a true imagination engine with an incredible level of realism and small details."

Now, while the realism is impressive, a lot of people jump straight to the implications of the massively improved text and detail handling.

It unlocks things like entire comic panels.

And by the way, the ability to generate multiple images and to keep character consistency makes larger editorial generation along these lines much more possible as well.

The detail in text was shown off in all sorts of different ways.

Iman Moustak created the periodic table of the original 151 Pokemon.

Kris Kastenova did a Where's Waldo-style illustration, placing herself in a densely crowded New York City scene. Others, like Nick Duns, took photos of messy handwritten pages, asking ChatGPT to, quote, "get rid of the creases and make it a scan," with both of the generated outputs not only perfectly capturing all the information on the pages, but even preserving Nick's handwriting.

Other people are experimenting with all sorts of other use cases, taking an image of a house and turning it into a generated floor plan, improving the visual quality of graphs, making technical diagrams, brand kits, combined styles, and more.

One of the craziest tests showing off Image 2's world knowledge came from entrepreneur and content creator Riley Brown.

He asked the model to create an image of a specific book, including a barcode which would actually take you to that publication, and it actually worked.

He used a barcode scanner on his phone to test the image, and sure enough, it actually took him to that specific publication.

Testing to make sure it wasn't the ISBN number, he even covered that part up, just leaving the barcode, and it still worked.

Still, maybe the most common thing that I saw explored was UI and software designs.

And this gets into what I think is actually really important about not just this model, but the context into which this model is coming.

I think this is the first image model whose biggest impact is not going to be standalone viral moments like Ghibli, but has the potential to actually be integrated quite quickly into the agentic stack.

Prins on X writes, Images 2.0 is the first model I have ever tried that feels ready for real enterprise workflows.

It's a reasoning model, which means it will search the web, use tools, and think about your request before generating the image.

It is able to generate huge volumes of text without a single error.