Telegram Group & Telegram Channel
Forwarded from Machinelearning
🌟MiniMax-M1: открытя reasoning‑LLM с контекстом 1M

MiniMax-M1 — первая в мире open-weight гибридная reasoning‑LLM c 1M контекстом (8× DeepSeek R1) и гибридной архитектурой MoE + lightning attention.
• 456 млрд параметров (45,9 млрд активируются на токен), сверхэффективная генерация — 25% FLOPs DeepSeek R1 на 100K токенов
• Обучение через RL с новым алгоритмом CISPO, решающим реальные задачи от математики до кодинга
• На обучение было потрачено $534K, две версии — 40K/80K “thinking budget”
• Обходит DeepSeek R1 и Qwen3-235B на бенчмарках по математике и кодингу,
• Топ результат на задачах для software engineering и reasoning



Бенчмарки:
AIME 2024: 86.0 (M1-80K) vs 85.7 (Qwen3) vs 79.8 (DeepSeek R1)

SWE-bench Verified: 56.0 vs 34.4 (Qwen3)

OpenAI-MRCR (128k): 73.4 vs 27.7 (Qwen3)

TAU-bench (airline): 62.0 vs 34.7 (Qwen3)

LongBench-v2: 61.5 vs 50.1 (Qwen3)


➡️ Попробовать можно здесь

Hugging Face: https://huggingface.co/collections/MiniMaxAI/minimax-m1-68502ad9634ec0eeac8cf094
GitHub: https://github.com/MiniMax-AI/MiniMax-M1
Tech Report: https://github.com/MiniMax-AI/MiniMax-M1/blob/main/MiniMax_M1_tech_report.pdf


@ai_machinelearning_big_data

#llm #reasoningmodels #minimaxm1
Please open Telegram to view this post
VIEW IN TELEGRAM



group-telegram.com/machinelearning_interview/1862
Create:
Last Update:

🌟MiniMax-M1: открытя reasoning‑LLM с контекстом 1M

MiniMax-M1 — первая в мире open-weight гибридная reasoning‑LLM c 1M контекстом (8× DeepSeek R1) и гибридной архитектурой MoE + lightning attention.
• 456 млрд параметров (45,9 млрд активируются на токен), сверхэффективная генерация — 25% FLOPs DeepSeek R1 на 100K токенов
• Обучение через RL с новым алгоритмом CISPO, решающим реальные задачи от математики до кодинга
• На обучение было потрачено $534K, две версии — 40K/80K “thinking budget”
• Обходит DeepSeek R1 и Qwen3-235B на бенчмарках по математике и кодингу,
• Топ результат на задачах для software engineering и reasoning



Бенчмарки:
AIME 2024: 86.0 (M1-80K) vs 85.7 (Qwen3) vs 79.8 (DeepSeek R1)

SWE-bench Verified: 56.0 vs 34.4 (Qwen3)

OpenAI-MRCR (128k): 73.4 vs 27.7 (Qwen3)

TAU-bench (airline): 62.0 vs 34.7 (Qwen3)

LongBench-v2: 61.5 vs 50.1 (Qwen3)


➡️ Попробовать можно здесь

Hugging Face: https://huggingface.co/collections/MiniMaxAI/minimax-m1-68502ad9634ec0eeac8cf094
GitHub: https://github.com/MiniMax-AI/MiniMax-M1
Tech Report: https://github.com/MiniMax-AI/MiniMax-M1/blob/main/MiniMax_M1_tech_report.pdf


@ai_machinelearning_big_data

#llm #reasoningmodels #minimaxm1

BY Machine learning Interview





Share with your friend now:
group-telegram.com/machinelearning_interview/1862

View MORE
Open in Telegram


Telegram | DID YOU KNOW?

Date: |

Apparently upbeat developments in Russia's discussions with Ukraine helped at least temporarily send investors back into risk assets. Russian President Vladimir Putin said during a meeting with his Belarusian counterpart Alexander Lukashenko that there were "certain positive developments" occurring in the talks with Ukraine, according to a transcript of their meeting. Putin added that discussions were happening "almost on a daily basis." NEWS Friday’s performance was part of a larger shift. For the week, the Dow, S&P 500 and Nasdaq fell 2%, 2.9%, and 3.5%, respectively. "The argument from Telegram is, 'You should trust us because we tell you that we're trustworthy,'" Maréchal said. "It's really in the eye of the beholder whether that's something you want to buy into." Additionally, investors are often instructed to deposit monies into personal bank accounts of individuals who claim to represent a legitimate entity, and/or into an unrelated corporate account. To lend credence and to lure unsuspecting victims, perpetrators usually claim that their entity and/or the investment schemes are approved by financial authorities.
from ye


Telegram Machine learning Interview
FROM American