
Microsoft AI is on a roll, it seems, and today it announced a more efficient version of its image generation model, called MAI-Image-2-Efficient—or Image-2e for short.
“MAI-Image-2, MAI-Voice-1, and MAI-Transcribe-1 together represent a comprehensive multimedia AI stack purpose-built for developers that spans image generation, natural speech synthesis, and enterprise-grade transcription across 25 languages,” Microsoft’s Naomi Moneypenny writes. “The response from the developer community has been incredible, and we’re not slowing down. Fast on the heels of that launch, we’re thrilled to introduce the next addition to the MAI image generation family: MAI-Image-2-Efficient. It’s now available in public preview in Microsoft Foundry and MAI Playground.”
MAI-Image-2-Efficient is built on the same architecture as the well-received MAI-Image-2 image generation model, but it’s been optimized for speed and efficiency. Microsoft says it’s up to 22 percent faster and up to four times more efficient than MAI-Image-2, and that it “outpaces leading text-to-image models by 40 percent on average.”
MAI-Image-2-Efficient is aimed at those who need high-quality image generation at speed and scale, and it’s well-suited for high-volume production workflows in e-commerce, media, and marketing; real-time and conversational experiences like chatbots; and rapid prototyping and creative iteration. The full MAI-Image-2 model is still recommended when images require precise, detailed text rendering, or when scenes demand the deepest photorealistic contrast and smoothness.
MAI-Image-2-Efficient starts at $5 USD per one million (1M) tokens for text input and $19.50 USD per 1M tokens for image output, Microsoft says.
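To get a rough sense of what those rates mean for a workload, here is a minimal Python sketch that turns token counts into an estimated bill. The two rates are the starting prices quoted above; the function name and the token counts in the usage lines are hypothetical, since Microsoft's announcement does not say how many output tokens a single image consumes, and actual billing may differ.

```python
# Starting prices quoted by Microsoft (USD per 1M tokens).
TEXT_INPUT_USD_PER_1M = 5.00
IMAGE_OUTPUT_USD_PER_1M = 19.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost for a given number of text input and image output tokens.

    Hypothetical helper for illustration only; it is not part of any Microsoft SDK.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * TEXT_INPUT_USD_PER_1M
             + output_tokens * IMAGE_OUTPUT_USD_PER_1M)
    return total / 1_000_000


if __name__ == "__main__":
    # Assumed workload: these token counts are made up for the example.
    cost = estimate_cost_usd(input_tokens=200_000, output_tokens=4_000_000)
    print(f"Estimated cost: ${cost:.2f}")
```

With these assumed numbers, the output side dominates the total, so the tokens consumed per generated image would be the figure to check in Microsoft's documentation before budgeting.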