MiniMax-01 is Now Open-Source: Scaling Lightning Attention for the AI Agent Era

Today, MiniMax has released and open-sourced the all-new MiniMax-01 series of models, which includes two models: the foundational language model MiniMax-Text-01 and the visual multi-modal model MiniMax-VL-01.

Report link: filecdn.minimax.chat/_Arxiv_MiniMax_01_Repo..

Innovative Lightning Attention Architecture with Top-Tier Model Performance

In the MiniMax-01 series, we have made a bold innovation: for the first time at large scale, we have implemented a novel Lightning Attention mechanism, offering a new alternative to the traditional Transformer architecture. The model has 456 billion parameters in total, of which 45.9 billion are activated per token. Its overall performance is on par with the world's leading models, while it efficiently handles the world's longest context window of up to 4 million tokens, 20 to 32 times that of other leading models.
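
To make the idea concrete, below is a minimal, single-head sketch of block-wise causal linear attention, the core computation pattern behind Lightning Attention. It is illustrative only: the actual implementation adds decay terms, normalization, multi-head projections, hybrid softmax-attention layers, and fused GPU kernels. The function name `lightning_attention` and the `block_size` parameter here are our own illustrative choices, not the released API.

```python
import torch

def lightning_attention(q, k, v, block_size=256):
    """Simplified single-head, block-wise causal linear attention.

    q, k, v: tensors of shape (seq_len, head_dim). Illustrative sketch
    only; omits decay, normalization, and multi-head handling.
    """
    n, d = q.shape
    out = torch.empty_like(v)
    # Running sum of k^T v over all fully processed blocks.
    kv = torch.zeros(d, d, dtype=q.dtype, device=q.device)
    for start in range(0, n, block_size):
        end = min(start + block_size, n)
        qb, kb, vb = q[start:end], k[start:end], v[start:end]
        # Inter-block part: attend to the entire past through the fixed-size
        # kv state, keeping cost linear in sequence length.
        inter = qb @ kv
        # Intra-block part: causal attention within the block, quadratic only
        # in block_size, never in the full sequence length.
        b = end - start
        mask = torch.tril(torch.ones(b, b, dtype=torch.bool, device=q.device))
        intra = ((qb @ kb.T) * mask) @ vb
        out[start:end] = inter + intra
        # Fold this block's keys and values into the running state.
        kv = kv + kb.T @ vb
    return out
```

Because each block sees the past only through the fixed-size `kv` state, compute and memory grow linearly with sequence length rather than quadratically, which is what makes context windows in the millions of tokens tractable.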