In recent months, Meta has become increasingly interested in generative AI, releasing several models aimed at competing with MidJourney and DALL-E.
Recently, Meta launched an image-generation model dubbed CM3leon. According to the company, this new AI-powered generator can convert text to images and images to text. The news quickly drew the interest of users, especially those who have grown accustomed to OpenAI's DALL-E 2.
How does Meta plan to make a difference with the launch of CM3leon?
The launch of CM3leon comes amid a proliferation of generative AI systems on the current market, a field attracting the interest of both large companies and start-ups. And since older generative AI models have yet to reach their full potential (witness the current accuracy and reliability problems of GPT-4), Meta was able to develop an approach different from the one adopted by OpenAI.
What differentiates Meta's CM3leon from other image generators, including OpenAI's DALL-E 2, is that it is built on a transformer architecture, which relies on an attention mechanism. It is this approach that allows CM3leon to process images faster, which in turn reduces processing costs.
The first model capable of performing a dual task
The biggest advantage of Meta's CM3leon model is its ability to generate both text and images, something that is not possible with older generative AI models such as DALL-E 2 or MidJourney, two tools limited to image generation alone. Each image can be accompanied by a caption, and CM3leon can generate sequences of text. It thus positions itself as a generative AI tool capable of performing double duty.
Meta CM3leon vs DALL-E 2
With a configuration of 7 billion parameters, Meta's CM3leon far surpasses DALL-E 2, which has only 3.5 billion. As for its training, CM3leon learned from several million licensed images from Shutterstock. In other words, it benefited from a solid learning base compared to DALL-E and DALL-E 2.
But the advantages of Meta's CM3leon are not limited to its ability to generate both text and images. The length of the captions it produces can vary depending on each prompt. The examples provided by Meta show that CM3leon can describe an image in detail, an ability that surpasses even models specialized in image captioning.
Introducing CM3leon, a first-of-its-kind multimodal model that achieves state-of-the-art performance for text-to-image generation with 5x the compute efficiency of competitive models.
— Meta AI (@MetaAI) July 14, 2023
More details ➡️ https://t.co/VR12zkmLDs
In sum, the introduction of CM3leon by Meta represents a significant step forward in the field of AI-powered image generators. Thanks to its training data, its 7 billion parameters, and its dual ability to generate text and images, it could well challenge MidJourney and DALL-E 2.
