Meta presents Llama 4: the new generation of its multimodal AI, with versions specialized in specific tasks

Meta has expanded its family of large language models (LLMs) with the presentation of Llama 4. The company has developed three versions of this multimodal AI (Scout, Maverick and Behemoth), with which it intends to help developers and professionals build more personalized experiences.

One notable aspect of the Llama 4 models is that they are the first from Meta to use a mixture-of-experts (MoE) architecture. This means that, instead of having a single neural network that processes all the information, the AI has an advanced neural network composed of several "experts": subnetworks or models specialized in specific tasks.

Thanks to this, only the necessary "experts" are activated for each query, improving efficiency and reducing latency, i.e. response time. Other AIs based on MoE include DeepSeek V3, Qwen2.5-Max and Gemini 1.5 Pro.
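The routing idea described above can be sketched in a few lines. This is a minimal, illustrative toy (the dimensions, expert count and gating function are assumptions for the example, not details of Llama 4's actual architecture): a small gating network scores the experts for each input, and only the top-scoring ones run.

```python
import numpy as np

# Minimal mixture-of-experts routing sketch (illustrative only; the
# sizes below are made up and do not reflect Llama 4's architecture).
rng = np.random.default_rng(0)

N_EXPERTS = 4    # hypothetical number of expert subnetworks
TOP_K = 1        # experts activated per token
DIM = 8          # hypothetical hidden dimension

# Each "expert" is a tiny feed-forward layer (just a weight matrix here).
experts = [rng.standard_normal((DIM, DIM)) for _ in range(N_EXPERTS)]
router = rng.standard_normal((DIM, N_EXPERTS))  # the gating network

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router                   # score every expert for this input
    top = np.argsort(logits)[-TOP_K:]     # indices of the best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the chosen experts only
    # Only the selected experts compute; the rest stay idle,
    # which is what reduces per-query cost and latency.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(DIM)
out = moe_layer(token)
print(out.shape)
```

With `TOP_K = 1`, only one of the four expert matrices is multiplied per token; the total parameter count covers all experts, but the *active* parameters per query are a fraction of that, which is the distinction the article's per-model figures rely on.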

The Llama 4 Scout and Llama 4 Maverick models can now be downloaded from llama.com and Hugging Face. In addition, the company has stated that, in the coming days, they will also be available through its partners. Meanwhile, Llama 4 technology has been activated in Meta AI, both on its website and on WhatsApp, Messenger and Instagram.

However, Meta makes it very clear in its use policy that the rights to use and distribute Llama 4 "are not granted if you are a natural person domiciled, or a company with its principal place of business, in the European Union. This restriction does not apply to end users of a product or service that incorporates these multimodal models."

On April 29, as part of its developer event LlamaCon, the tech giant plans to share more information about Llama 4.

The Llama 4 models

The Llama 4 models have been designed with native multimodality: they can understand and generate text, images and even video as part of their core operation, not as a capability bolted on afterwards.

They also incorporate "early fusion", a technique that combines the different data types (text, images or video) from the first layers of the model, instead of processing them separately. "Early fusion is a major step forward, since it allows us to jointly pre-train the model with large amounts of unlabeled text, image and video data," Meta explains.
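The contrast with late fusion can be made concrete. In this toy sketch (dimensions and projections are assumptions for illustration, not Llama 4's), each modality is embedded into a shared vector space and the sequences are concatenated *before* the first model layer, so a single backbone attends over all of them jointly:

```python
import numpy as np

# Toy sketch of "early fusion": project each modality into the same
# embedding space and concatenate the sequences BEFORE the first layer.
# (Sizes are illustrative, not Llama 4's real dimensions.)
rng = np.random.default_rng(0)
DIM = 16  # shared hidden dimension

text_tokens = rng.standard_normal((5, 32))    # 5 text tokens, 32-d features
image_patches = rng.standard_normal((9, 64))  # 9 image patches, 64-d features

# Per-modality projections into the shared space.
W_text = rng.standard_normal((32, DIM))
W_image = rng.standard_normal((64, DIM))

fused = np.concatenate([text_tokens @ W_text,
                        image_patches @ W_image], axis=0)

# `fused` is now ONE sequence of 14 tokens fed to the first transformer
# layer, instead of running separate text and vision stacks and merging
# their outputs at the end (late fusion).
print(fused.shape)
```

Because the backbone sees mixed-modality sequences from layer one, it can be pre-trained directly on interleaved text, image and video data, which is the benefit Meta's quote refers to.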

Likewise, the company has improved the vision encoder in Llama 4. It is based on MetaCLIP, but has been trained in conjunction with another Llama model to adapt it better.

Llama 4 Scout

This is the smallest Llama 4 version, yet it is more powerful than all Llama models of previous generations. It has been designed to run on a single GPU and, according to Meta, it has outperformed models such as Gemma 3, Gemini 2.0 Flash-Lite and Mistral 3.1 in various tests.

Llama 4 Scout is a model with 17 billion active parameters and 16 "experts", and it offers a 10-million-token context window.

Recall that parameters are the internal values a model learns during training. They are not the data set itself, but the dials built into the system that define how it processes and transforms the information it analyzes.
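A tiny example makes the distinction between data and parameters tangible. Here a single linear unit with just two parameters (a weight and a bias) is fitted by gradient descent to the relationship y = 2x + 1; after training, the parameters hold what was learned, while the training data can be discarded:

```python
import numpy as np

# Toy illustration of "parameters": `w` and `b` are values LEARNED from
# the data, not the data itself. We fit y = 2x + 1 with gradient descent.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)   # training inputs
y = 2 * x + 1                      # training targets

w, b = 0.0, 0.0                    # the model's two parameters
lr = 0.1                           # learning rate
for _ in range(500):
    pred = w * x + b
    grad_w = 2 * np.mean((pred - y) * x)   # gradient of MSE w.r.t. w
    grad_b = 2 * np.mean(pred - y)         # gradient of MSE w.r.t. b
    w -= lr * grad_w
    b -= lr * grad_b

# After training, w ≈ 2 and b ≈ 1: the "knowledge" lives in the
# parameters, which is what an LLM's billions of parameters scale up.
print(round(w, 2), round(b, 2))
```

Scout's 17 billion *active* parameters are exactly this kind of learned value, just on an enormously larger scale, with only the selected experts' parameters used on each query.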

Llama 4 Maverick

For its part, Llama 4 Maverick is a model with 17 billion active parameters and 128 "experts". Meta states that it is "the best multimodal model in its class, surpassing GPT-4o and Gemini 2.0 Flash across a wide range of widely reported benchmarks, while achieving results comparable to the new DeepSeek V3 in reasoning and coding, with less than half the active parameters."

This model also stands out for its improved image and text understanding capabilities. Likewise, it offers a strong performance-to-cost ratio, with an experimental chat version scoring an ELO of 1,417 on LMArena.

Llama 4 Behemoth

This is the model that has served as a guide for the creation of Scout and Maverick, and Meta states that it will continue to play that role for upcoming versions. The company describes it as "one of the smartest LLMs in the world and the most powerful to date."

At the moment, Llama 4 Behemoth is still in development, but it already obtains results that surpass GPT-4.5, Claude Sonnet 3.7 and Gemini 2.0 Pro on STEM-focused benchmarks. It has 288 billion active parameters, 16 "experts" and nearly 2 trillion total parameters.

Photo: GPT-4o