DALL-E is an artificial intelligence (AI) program trained to generate detailed images from descriptive text. It has already shown promising results, but its failure modes suggest that production-ready applications of the algorithm may still be some way off.
AI image-generation models tend to be brittle, in part because the data sets used to train them go stale. DALL-E, however, produces plausible interpretations not only of concrete objects but also of abstract ideas. For example, given a caption describing a capybara in a field at sunrise, it showed surprising reasoning by filling in details of the scene beyond those specified in the text. It even showed good judgment with whimsical, imaginary prompts, such as rendering a snail shaped like a harp by skillfully fusing the two objects, fitting the harp into the hollow of the snail's shell.
DALL-E struggles with long strings of text, however: it becomes less accurate as more details are added. The model also inherits stereotypes from its training data, such as rendering "Chinese food" as dumplings. Still, once mature, a tool like this has a range of applications, from marketing and concept design to illustrating news stories from article summaries. AI algorithms like DALL-E might even come to out-draw humans, much as they have outflown us in simulated air combat.
Why it matters: The new models are the latest in a series of efforts to build machine-learning systems that show common sense while doing useful work in the real world - without demanding excessive computing power.
What's happening: OpenAI today announced two new systems that aim to do for image generation what its landmark GPT-3 model did last year for text generation.
DALL-E is a neural network "that can take any text and make an image out of it," said Ilya Sutskever, OpenAI co-founder and chief scientist. That includes concepts it never encountered in training, such as drawing the anthropomorphic daikon radish walking a dog shown above.
Flashback: DALL-E works much like GPT-3, a large transformer model that can generate original passages of text from a short prompt.
CLIP, another new neural network, "can take any visual concept and produce robust and reliable textual interpretations of it," says Sutskever, improving on existing computer-vision techniques that demand extensive task-specific training and expensive computing power.
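The matching idea behind a system like CLIP can be sketched in a few lines: embed an image and several candidate captions in a shared vector space, then pick the caption whose embedding is most similar to the image's. The toy vectors below are made-up stand-ins, not real model outputs; an actual model learns these embeddings from data.

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors: dot product over norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Pretend output of an image encoder (hypothetical values).
image_embedding = [0.9, 0.1, 0.2]

# Pretend outputs of a text encoder for candidate captions.
caption_embeddings = {
    "a photo of a dog":    [0.8, 0.2, 0.1],
    "a photo of a cat":    [0.1, 0.9, 0.3],
    "a photo of a radish": [0.2, 0.1, 0.9],
}

# Zero-shot labeling: choose the caption closest to the image.
best = max(caption_embeddings,
           key=lambda c: cosine(image_embedding, caption_embeddings[c]))
print(best)  # a photo of a dog
```

This is what makes the approach "zero-shot": new labels can be added just by writing new captions, with no retraining of the encoders.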
They say: "Last year we were able to make substantial progress on text with GPT-3, but the thing is that the world isn't built only out of text," Sutskever said. "This is a step toward the larger goal of building a neural network that can work with both images and text."
How it works: DALL-E - OpenAI's name is a portmanteau of the surrealist artist Salvador Dali and Pixar's lovable robot WALL-E - is a model that moves toward fulfilling the Star Trek dream of simply telling a computer, in plain language, what you want it to do.
For example: In the image above, the caption describes a green pentagonal frame. Changing any of its three elements - shape (pentagon), color (green), object (frame) - produces a different set of images.
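The combinatorics of that example can be sketched as follows: each independent slot in the caption multiplies the number of distinct prompts, and each prompt would yield its own set of generated images. The word lists are illustrative assumptions, not OpenAI's actual vocabulary.

```python
from itertools import product

# Hypothetical alternatives for each slot of the caption.
shapes = ["pentagonal", "square", "hexagonal"]
colors = ["green", "red", "blue"]
objects = ["frame", "clock", "sign"]

# Every combination of shape, color, and object is a distinct prompt.
prompts = [f"a {color} {shape} {obj}"
           for shape, color, obj in product(shapes, colors, objects)]

print(len(prompts))  # 3 * 3 * 3 = 27 distinct prompts
print(prompts[0])    # a green pentagonal frame
```

With three options per slot, the three slots already give 27 captions, which is why varying any single element reliably changes the generated output.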