Hacker News Re-Imagined

How Imagen Works

  • 142 points
  • 12 days ago

  • @SleekEagle
  • Created a post


@funstuff007 12 days

Replying to @SleekEagle 🎙

What's the highest price paid for an AI-generated image NFT?


@varispeed 12 days

Replying to @SleekEagle 🎙

> is trained on hundreds of millions of images and their associated captions

So how do you get access to hundreds of millions of images and use them to create derivative works? Did they get consent from millions of authors?

Or is something like that only available to the rich with access to lawyers on tap?

I mean I can imagine if a nobody wanted to do something like this, they'd get bankrupted by having to deal with all the photographers / artists spotting a tiny sliver of their art in the image produced by the model.

Furthermore, would something like this work with music? For instance, train the model on all Spotify songs and then generate songs based on "Get me a Bach symphony played on sticks with someone rapping like Dr Dre with a lisp." Or does the music industry have enough money to bully anyone into not doing that?


@astrange 11 days

Replying to @SleekEagle 🎙

Is there a compare and contrast between Imagen and Parti anywhere? I realize the paper came out yesterday, but maybe other people remember what "autoregressive" means better than I do.


@aceon48 11 days

Replying to @SleekEagle 🎙

AI is now creative


@dubswithus 12 days

Replying to @SleekEagle 🎙

If Google has something similar or better it definitely makes it look like OpenAI is wasting its time. None of this relates to AGI.


@Workaccount2 12 days

Replying to @SleekEagle 🎙

I have shown Imagen (and DALL-E 2) to a number of people now (non-tech, just everyday friends, family, co-workers) and I have been pretty stunned by the response I get from most people:

"Meh, that's kinda cool? I guess?" or "What am I looking at?"..."Ok? So a computer made it? That seems neat"

I am still trying to get my jaw off the floor from 2 months ago. But the responses have been so muted and shoulder-shrugging that I think either I am missing something or they are missing something. Even when I really drill in, practically shaking them ("DO YOU NOT UNDERSTAND THAT THIS IS AN ORIGINAL IMAGE CONSTRUCTED ENTIRELY BY AN AI?!?!"), people just seem to see it as a party trick at best.


@sagarpatil 11 days

Replying to @SleekEagle 🎙

I wonder how developers can monetise this? What use cases does it have?


@skinner_ 12 days

Replying to @SleekEagle 🎙

> The central intuition in using T5 is that extremely large language models, by virtue of their sheer size alone, may still learn useful representations despite the fact that they are not explicitly trained with any text/image task in mind. [...] Therefore, the central question being addressed by this choice is whether or not a massive language model trained on a massive dataset independent of the task of image generation is a worthwhile trade-off for a non-specialized text encoder. The Imagen authors bet on the side of the large language model, and it is a bet that seems to pay off well.

The way out of this dilemma is to fine-tune T5 on the caption dataset instead of keeping it frozen. The paper notes that they don't do fine-tuning, but does not provide any ablation or other justification. I wonder if it would help or not.
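For anyone unsure what "frozen" versus "fine-tuned" means in practice, here is a toy sketch in plain Python. It is not Imagen or T5 code: the two-parameter "encoder"/"decoder" model, the `Param` class, and the `train` function are all made up for illustration. The only point is that a frozen parameter is skipped by the optimizer, so the rest of the model has to adapt around it, while a fine-tuned one moves along with everything else.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """A single scalar weight plus a flag saying whether it may be updated."""
    value: float
    trainable: bool = True


def sgd_step(params, grads, lr=0.1):
    """Apply one gradient-descent update, skipping frozen parameters."""
    for name, p in params.items():
        if p.trainable:
            p.value -= lr * grads[name]


def train(freeze_encoder, steps=50):
    """Fit y_hat = decoder * (encoder * x) to one (x, y) pair.

    Hypothetical stand-in: "encoder" plays the role of the pretrained
    text encoder, "decoder" the role of the model trained on top of it.
    """
    params = {
        "encoder": Param(1.0, trainable=not freeze_encoder),
        "decoder": Param(0.5),
    }
    x, y = 1.0, 2.0
    for _ in range(steps):
        enc = params["encoder"].value
        dec = params["decoder"].value
        err = dec * enc * x - y  # residual of the squared-error loss
        grads = {
            "encoder": 2 * err * dec * x,
            "decoder": 2 * err * enc * x,
        }
        sgd_step(params, grads)
    return params


frozen = train(freeze_encoder=True)   # encoder weight never changes
tuned = train(freeze_encoder=False)   # encoder weight is updated too
print("frozen encoder:", frozen["encoder"].value)
print("tuned encoder: ", tuned["encoder"].value)
```

Both runs fit the toy target. What differs is whether the "pretrained" weight is allowed to drift from its starting value, and that is exactly what an ablation on fine-tuning would have to measure.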


@DonHopkins 12 days

Replying to @SleekEagle 🎙

Wait, this isn't about the line of intelligent xeroxographic laser printers developed by Imagen Corporation in 1981, supporting the Impress printer language?

https://tug.org/TUGboat/tb02-2/tb03imagen.pdf

https://www.openprinting.org/driver/imagen


@coding123 12 days

Replying to @SleekEagle 🎙

Is this by a person that knows or is guessing?


@natch 12 days

Replying to @SleekEagle 🎙

> Imagen, released just last month, can generate high-quality, high-resolution images given only a description of a scene

“Released”? What? Papers are published. Websites are published. Tools are “released.”

Where has Imagen been released?


@alexccccc 12 days

Replying to @SleekEagle 🎙

Super interesting

