Researchers at Microsoft are working on a new artificial intelligence system that’s capable of drawing images based on text descriptions. The new AI, simply called the drawing bot, is based on a technology called a Generative Adversarial Network (GAN), which consists of two different machine learning models.
The first, known as the generator, creates images from text descriptions, while the second, known as the discriminator, scores the authenticity of each generated image. The two models are trained against each other, which pushes the generator toward the best possible accuracy in the final drawing.
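To make the two-model setup concrete, here is a minimal sketch of the textbook GAN objective in plain Python. It is not Microsoft's code: the function names are hypothetical, and the scores are assumed to be raw discriminator outputs (logits) for one real and one generated image.

```python
import math

def sigmoid(x):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def discriminator_loss(score_real, score_fake):
    """Low when the real image is rated near 1 and the generated one near 0."""
    return -(math.log(sigmoid(score_real)) + math.log(1.0 - sigmoid(score_fake)))

def generator_loss(score_fake):
    """Low when the discriminator rates the generated image as real."""
    return -math.log(sigmoid(score_fake))

# Assumed example scores: a confident discriminator versus an undecided one.
confident = discriminator_loss(score_real=3.0, score_fake=-3.0)
undecided = discriminator_loss(score_real=0.0, score_fake=0.0)
```

The two losses pull in opposite directions: a generated image that scores highly lowers the generator's loss and raises the discriminator's, which is the tension that drives both models to improve.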
Microsoft’s drawing bot doesn’t use a basic GAN. Instead, the company’s researchers designed a new system called the Attentional GAN, or AttnGAN, which refines the drawing using the provided description. Microsoft says a regular GAN would not be able to draw pixel-perfect or sharp images from descriptions that involve a variety of colours, so the AttnGAN tackles the problem by picking out the key details in the description and matching them against the drawing.
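The idea of matching parts of a description against parts of a drawing can be illustrated with a generic attention step. The sketch below is an assumption-laden toy, not the AttnGAN implementation: the two-dimensional word vectors, the region feature, and the `attend` helper are all made up for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(region, word_vectors):
    """Weight each word by its similarity to one image region."""
    # Dot product between the region feature and every word vector.
    scores = [sum(r * w for r, w in zip(region, vec)) for vec in word_vectors]
    weights = softmax(scores)
    # The context vector is the weighted average of the word vectors.
    context = [
        sum(wt * vec[i] for wt, vec in zip(weights, word_vectors))
        for i in range(len(region))
    ]
    return weights, context

# Hypothetical embeddings for three words in a description.
words = ["yellow", "bird", "branch"]
word_vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
region = [2.0, 0.0]  # assumed feature for a region that should be yellow

weights, context = attend(region, word_vectors)
most_relevant = words[weights.index(max(weights))]
```

In this toy setup the region leans most heavily on the word whose vector it resembles, which is the general mechanism that lets an attention-based model decide which part of the text matters for which part of the image.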
Microsoft’s AttnGAN isn’t only about improving accuracy; it also shows something like basic human common sense. As with other AI systems, Microsoft used a large amount of training data to train the models behind the AttnGAN, allowing the system to pick up important details that are useful when drawing images. “Since many images of birds in the training data show birds sitting on tree branches, the AttnGAN usually draws birds sitting on branches unless the text specifies otherwise,” Microsoft said in a blog post.
Microsoft says the drawing bot could one day be used as a sketch assistant for painters, or even help filmmakers save time and money by drawing animation scenes based on the screenplay. For now, though, it’s still a work in progress.