I don't understand DALL-E ... can someone explain?
August 13, 2022 7:08 AM

I feel like this might be such a dumb question that I'm going to get mocked, but please bear with me. I just tried this "DALL-E" image thing ... is it searching the universe for pictures that fit my description, or is it literally making them up from scratch? I asked for a picture of a bowl of oranges against a blue backdrop -- are these somebody's real oranges, or is this completely new and made up only after I asked for it?
posted by mccxxiii to Computers & Internet (15 answers total) 3 users marked this as a favorite
 
Best answer: It is making them up based on the hundreds of millions of images that were used to train it. The images are created when you ask for them.
posted by wesleyac at 7:18 AM on August 13 [3 favorites]


Best answer: Rhaomi had a post on MetaTalk a while back asking for DALL-E requests - I think it's fairly apparent from the requests and the output that it's not an existing image, although its algorithm probably looks at existing images to create its final product.
posted by LionIndex at 7:19 AM on August 13


Best answer: It is making them from scratch, though what it produces is shaped by the real images it learned from. There are a lot of highly technical explainers out there, but this one is somewhat accessible?

The program runs your text through an encoder that references its priors, and then compiles an image based on those priors. It is generating a new image, but it builds that image from the previous instances it has seen of whatever it thinks you're referencing.
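If it helps to see that flow as code, here is a very rough toy sketch in Python. To be clear, this is not OpenAI's code and not how the real model is built: the function names (encode_text, prior, decode_image, generate), the tiny sizes, and the arithmetic inside each step are all made up for illustration. The only point is the shape of the pipeline (text goes in, gets turned into numbers, and an image is computed from those numbers plus some random noise), with no stored picture being looked up anywhere.

```python
import hashlib
import random
from typing import List

EMBED_DIM = 8    # toy embedding size (an assumption; real models are far larger)
IMAGE_SIZE = 4   # toy 4x4 grayscale "image"


def encode_text(prompt: str) -> List[float]:
    """Hypothetical text encoder: maps a prompt to a fixed-length vector.

    A real encoder is a trained neural network. Here we just hash each word
    so the same prompt always produces the same vector.
    """
    vec = [0.0] * EMBED_DIM
    for word in prompt.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        for i in range(EMBED_DIM):
            vec[i] += digest[i] / 255.0
    return vec


def prior(text_embedding: List[float]) -> List[float]:
    """Hypothetical prior: turns a text vector into an "image idea" vector.

    In a trained model this mapping is learned from image/caption pairs.
    Here it is a fixed, made-up transformation that keeps values in [0, 1).
    """
    return [(x * 0.5) % 1.0 for x in text_embedding]


def decode_image(image_embedding: List[float], seed: int) -> List[List[float]]:
    """Hypothetical decoder: mixes random noise with the image vector.

    Different seeds give different pixels for the same prompt.
    """
    rng = random.Random(seed)
    pixels = []
    for row in range(IMAGE_SIZE):
        pixel_row = []
        for col in range(IMAGE_SIZE):
            noise = rng.random()
            target = image_embedding[(row * IMAGE_SIZE + col) % EMBED_DIM]
            pixel_row.append(round(0.7 * target + 0.3 * noise, 3))
        pixels.append(pixel_row)
    return pixels


def generate(prompt: str, seed: int) -> List[List[float]]:
    """Full toy pipeline: text -> encoder -> prior -> decoder -> image."""
    return decode_image(prior(encode_text(prompt)), seed)


if __name__ == "__main__":
    prompt = "bowl of oranges against a blue backdrop"
    first = generate(prompt, seed=1)
    second = generate(prompt, seed=2)
    # Same prompt, different random seed: two different images.
    print(first != second)
```

Asking twice with the same prompt but a different seed gives two different grids of pixels, which is the toy version of why you get a fresh set of oranges every time you ask.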

While I've had some fun with the program, I've found the most fun is feeding it the vaguest possible information to see how it interprets those things. [famous thing] in the style of [specific art style] will get you a really good example of the program pulling a 'known' or commonly referenced image, and then applying a style to it.

It seems to fully grok 'muppets' so 'steely dan as muppets' turns out a delightful result, and also a good example of how those images probably don't exist anywhere, but the 'style' of muppets is being applied to famous photos of the band.
posted by furnace.heart at 7:22 AM on August 13 [4 favorites]


Response by poster: OMG. This is crazy. Also maybe bad for humanity in the long run, but I'm about to go make some Steely Dan Muppets so la di dah ...
posted by mccxxiii at 7:28 AM on August 13 [19 favorites]


A great, great book on AIs is You Look Like a Thing and I Love You by Janelle Shane.
posted by chesty_a_arthur at 8:11 AM on August 13 [3 favorites]


PLEASE return with proof of the Steely Dan Muppets. kkthx
posted by Glinn at 9:53 AM on August 13 [9 favorites]


In the interest of accuracy, it's a little more complicated than "making them up from scratch". It's eminently possible that what you end up seeing is in fact a very close match to someone's real oranges. These models learn on kajillions of input images, and their occasional tendency to memorize their training data and regurgitate it with few to no modifications is well known. Sometimes these models glitch out and just return a training image.

But yes, usually what you're going to see is a new, original image that blends the concepts that you're including in the prompt. It's weird and mind-blowing that it works as well as it does.
posted by potrzebie at 10:20 AM on August 13


Okay I was dying to try that as well and this one came out somewhat usable:

Photo of Steely Dan as muppets on stage in the 1970s photorealistic.
posted by JoeZydeco at 10:21 AM on August 13 [12 favorites]


Response by poster: OK sorry to thread-sit, but where are those millions of training photos coming from?
posted by mccxxiii at 11:50 AM on August 13


According to OpenAI's GitHub documentation, the training pairs (words + photos) came from "a combination of publicly available sources and sources that we licensed."

Beyond that there's not much detail yet.
posted by JoeZydeco at 12:18 PM on August 13


Seems like some of them are coming from street views
posted by aniola at 4:19 PM on August 13


I asked it to draw me a firehose and it was like ??? and then I asked it to do a fire hydrant and it was like "sure no problem!" which is why I think it has streetview data.
posted by aniola at 4:20 PM on August 13 [1 favorite]


Also it's really good at the Pacific Coast Highway.
posted by aniola at 4:21 PM on August 13


It excels at anything with Homer Simpson in it, as previous threads have shown.
posted by JoeZydeco at 8:52 AM on August 14


Janelle Shane (whose book was mentioned by chesty_a_arthur above) also has a blog called "AI Weirdness" that has a lot of fun with DALL-E.
posted by LadyOscar at 6:33 PM on August 14 [3 favorites]

