We’ve seen a lot of machine learning systems create strange new phrases and dreamlike images after being trained on large amounts of data. But a new website lets you do the generating, and the results are just as bizarre as you’d expect.
The web applet, built by researcher Cristóbal Valenzuela, is based on a new paper from another team of researchers. Their machine learning algorithm is called AttnGAN (Attentional Generative Adversarial Network). It’s meant to improve upon other text-to-image AI by refining images at the word level.
For now, the results are closer to surrealist art.
Machine learning, as you probably know by now, is the process researchers use to train algorithms on large datasets, allowing them to solve complex problems such as “what is this a picture of?” on their own. These algorithms can also do the opposite, creating new images out of words.
The new paper explains that older text-to-image programs formed images from the entire sentence at once, which tends to lose fine detail. The AttnGAN method instead creates a general image from the entire sentence, then refines that image using the sentence’s sub-parts, down to individual words.
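For the curious, here is a toy Python sketch of the general idea behind that refinement step: each region of a rough image "pays attention" to the words that match it best. This is not the researchers' code. The two-number vectors, the region names and the `word_context` helper are all invented for illustration, and the real system learns its numbers with neural networks.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def word_context(region, word_vectors):
    """Blend the word vectors, weighted by how well each matches one image region."""
    weights = softmax([dot(region, w) for w in word_vectors])
    dim = len(word_vectors[0])
    context = [sum(wt * w[i] for wt, w in zip(weights, word_vectors))
               for i in range(dim)]
    return weights, context

# Hypothetical vectors, made up for this example (not learned from data).
words = ["red", "bird"]
word_vectors = [[1.0, 0.0], [0.0, 1.0]]
regions = {"wing": [0.2, 2.0], "breast": [2.0, 0.3]}

for name, region in regions.items():
    weights, context = word_context(region, word_vectors)
    focus = words[weights.index(max(weights))]
    print(name, "attends most to", repr(focus))
```

In this sketch, the blended "context" for each region is what a later stage would use to sharpen that part of the picture, so the word that matters most to a region has the most say in how it gets redrawn.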
The researchers trained the network on the COCO (Common Objects in Context) dataset. It’s a good reference source for images of common objects, such as stop signs, animals and… Modest Mouse lyrics.
Valenzuela’s tool excelled at creating the stuff of fever dreams in response to Gizmodo staffers’ twisted requests. Our own Hudson Hongo got especially good at getting the images he wanted.
Unsurprisingly, Janelle Shane’s AI Weirdness blog is where we found out about AttnGAN, so we asked her what it says about the current state of AI.
“This demo is a really interesting way of showing how much a state of the art image recognition algorithm understands about image and text,” she told Gizmodo. “What does it understand about what ‘dog’ means? Or ‘human’?” But she noticed that structure is difficult for these algorithms. “If it sees a human arm pointing toward it vs to the side, it looks really different in a 2D image.”
Shane also pointed out that the algorithm drew birds really well when it only needed to draw birds, but things got worse as more became expected of it. The version of AttnGAN on Valenzuela’s site tries to draw whatever a user types in. She compared it to self-driving cars, which have many more tasks they need to do and obstacles they need to recognise.
Gizmodo reached out to the study’s first author, PhD student Tao Xu at Lehigh University, but had not heard back at time of writing.
But please, have fun with this one.
As a final thought, these would make really good Dixit cards.