Telegram Group & Telegram Channel
πŸ”₯ Π Π΅Π»ΠΈΠ· Qwen 3 ΠΎΡ‚ Alibaba

Π’ Ρ€Π΅Π»ΠΈΠ· вошли 2 MoE-ΠΌΠΎΠ΄Π΅Π»ΠΈ ΠΈ 6 Dense models (ΠΏΠ»ΠΎΡ‚Π½Ρ‹Π΅ ΠΌΠΎΠ΄Π΅Π»ΠΈ), Ρ€Π°Π·ΠΌΠ΅Ρ€ΠΎΠΌ ΠΎΡ‚ 0.6B Π΄ΠΎ 235B ΠΏΠ°Ρ€Π°ΠΌΠ΅Ρ‚Ρ€ΠΎΠ².

πŸ† Ѐлагманская модСль Qwen3-235B-A22B дСмонстрируСт ΠΊΠΎΠ½ΠΊΡƒΡ€Π΅Π½Ρ‚Π½Ρ‹Π΅ Ρ€Π΅Π·ΡƒΠ»ΡŒΡ‚Π°Ρ‚Ρ‹ Π² Π·Π°Π΄Π°Ρ‡Π°Ρ… Кодина, ΠΌΠ°Ρ‚Π΅ΠΌΠ°Ρ‚ΠΈΠΊΠΈ ΠΈ ΠΎΠ±Ρ‰ΠΈΡ… способностСй, ΡƒΠ²Π΅Ρ€Π΅Π½Π½ΠΎ сопСрничая с ΠΏΠ΅Ρ€Π΅Π΄ΠΎΠ²Ρ‹ΠΌΠΈ модСлями, Ρ‚Π°ΠΊΠΈΠΌΠΈ ΠΊΠ°ΠΊ DeepSeek-R1, o1, o3-mini, Grok-3 ΠΈ Gemini-2.5-Pro.
⚑ НСбольшая MoE-модСль Qwen3-30B-A3B прСвосходит QwQ-32B,  ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΡŽ Π² 10 Ρ€Π°Π· мСньшС ΠΏΠ°Ρ€Π°ΠΌΠ΅Ρ‚Ρ€ΠΎΠ².
πŸ”₯ ΠšΠΎΠΌΠΏΠ°ΠΊΡ‚Π½Π°Ρ модСль Qwen3-4B сопоставима ΠΏΠΎ ΠΏΡ€ΠΎΠΈΠ·Π²ΠΎΠ΄ΠΈΡ‚Π΅Π»ΡŒΠ½ΠΎΡΡ‚ΠΈ с Qwen2.5-72B-Instruct.
🧠 ΠŸΠΎΠ΄Π΄Π΅Ρ€ΠΆΠΈΠ²Π°Π΅Ρ‚ Π³ΠΈΠ±Ρ€ΠΈΠ΄Π½Ρ‹ΠΉ Ρ€Π΅ΠΆΠΈΠΌ ΠΌΡ‹ΡˆΠ»Π΅Π½ΠΈΡ

Π Π΅ΠΆΠΈΠΌ Ρ€Π°Π·ΠΌΡ‹ΡˆΠ»Π΅Π½ΠΈΡ активируСтся ΠΏΡ€ΠΈ ΠΎΠ±Ρ€Π°Π±ΠΎΡ‚ΠΊΠ΅ слоТных Π·Π°Π΄Π°Ρ‡, обСспСчивая ΠΏΠΎΡˆΠ°Π³ΠΎΠ²Ρ‹ΠΉ Π°Π½Π°Π»ΠΈΠ· запроса ΠΈ Ρ„ΠΎΡ€ΠΌΠΈΡ€ΠΎΠ²Π°Π½ΠΈΠ΅ комплСксных, Π³Π»ΡƒΠ±ΠΎΠΊΠΈΡ… ΠΎΡ‚Π²Π΅Ρ‚ΠΎΠ².

Π‘Π°Π·ΠΎΠ²Ρ‹ΠΉ Ρ€Π΅ΠΆΠΈΠΌ ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΠ΅Ρ‚ΡΡ для повсСднСвных вопросов, позволяя Π²Ρ‹Π΄Π°Π²Π°Ρ‚ΡŒ быстрыС ΠΈ Ρ‚ΠΎΡ‡Π½Ρ‹Π΅ ΠΎΡ‚Π²Π΅Ρ‚Ρ‹ с минимальной Π·Π°Π΄Π΅Ρ€ΠΆΠΊΠΎΠΉ.

ΠŸΡ€ΠΎΡ†Π΅ΡΡ обучСния ΠΌΠΎΠ΄Π΅Π»ΠΈ устроСн ΠΏΠΎΡ…ΠΎΠΆΠΈΠΌ ΠΎΠ±Ρ€Π°Π·ΠΎΠΌ Π½Π° Ρ‚ΠΎ, ΠΊΠ°ΠΊ это сдСлано Π² DeepSeek R1.

ΠŸΠΎΠ΄Π΄Π΅Ρ€ΠΆΠΈΠ²Π°Π΅Ρ‚ 119 языков, Π²ΠΊΠ»ΡŽΡ‡Π°Ρ русский.

Π›ΠΈΡ†Π΅Π½Π·ΠΈΡ€ΠΎΠ²Π°Π½ΠΈΠ΅: Apache 2.0 πŸ”₯

πŸ”œΠŸΠΎΠΏΡ€ΠΎΠ±ΠΎΠ²Π°Ρ‚ΡŒ: https://chat.qwen.ai/
πŸ”œBlog: https://qwenlm.github.io/blog/qwen3/
πŸ”œGitHub: https://github.com/QwenLM/Qwen3
πŸ”œHugging Face: https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f
πŸ”œ ModelScope: https://modelscope.cn/collections/Qwen3-9743180bdc6b48

@ai_machinelearning_big_data

#Qwen
Please open Telegram to view this post
VIEW IN TELEGRAM
Please open Telegram to view this post
VIEW IN TELEGRAM



group-telegram.com/ai_machinelearning_big_data/7469
Create:
Last Update:

πŸ”₯ Π Π΅Π»ΠΈΠ· Qwen 3 ΠΎΡ‚ Alibaba

Π’ Ρ€Π΅Π»ΠΈΠ· вошли 2 MoE-ΠΌΠΎΠ΄Π΅Π»ΠΈ ΠΈ 6 Dense models (ΠΏΠ»ΠΎΡ‚Π½Ρ‹Π΅ ΠΌΠΎΠ΄Π΅Π»ΠΈ), Ρ€Π°Π·ΠΌΠ΅Ρ€ΠΎΠΌ ΠΎΡ‚ 0.6B Π΄ΠΎ 235B ΠΏΠ°Ρ€Π°ΠΌΠ΅Ρ‚Ρ€ΠΎΠ².

πŸ† Ѐлагманская модСль Qwen3-235B-A22B дСмонстрируСт ΠΊΠΎΠ½ΠΊΡƒΡ€Π΅Π½Ρ‚Π½Ρ‹Π΅ Ρ€Π΅Π·ΡƒΠ»ΡŒΡ‚Π°Ρ‚Ρ‹ Π² Π·Π°Π΄Π°Ρ‡Π°Ρ… Кодина, ΠΌΠ°Ρ‚Π΅ΠΌΠ°Ρ‚ΠΈΠΊΠΈ ΠΈ ΠΎΠ±Ρ‰ΠΈΡ… способностСй, ΡƒΠ²Π΅Ρ€Π΅Π½Π½ΠΎ сопСрничая с ΠΏΠ΅Ρ€Π΅Π΄ΠΎΠ²Ρ‹ΠΌΠΈ модСлями, Ρ‚Π°ΠΊΠΈΠΌΠΈ ΠΊΠ°ΠΊ DeepSeek-R1, o1, o3-mini, Grok-3 ΠΈ Gemini-2.5-Pro.
⚑ НСбольшая MoE-модСль Qwen3-30B-A3B прСвосходит QwQ-32B,  ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΡŽ Π² 10 Ρ€Π°Π· мСньшС ΠΏΠ°Ρ€Π°ΠΌΠ΅Ρ‚Ρ€ΠΎΠ².
πŸ”₯ ΠšΠΎΠΌΠΏΠ°ΠΊΡ‚Π½Π°Ρ модСль Qwen3-4B сопоставима ΠΏΠΎ ΠΏΡ€ΠΎΠΈΠ·Π²ΠΎΠ΄ΠΈΡ‚Π΅Π»ΡŒΠ½ΠΎΡΡ‚ΠΈ с Qwen2.5-72B-Instruct.
🧠 ΠŸΠΎΠ΄Π΄Π΅Ρ€ΠΆΠΈΠ²Π°Π΅Ρ‚ Π³ΠΈΠ±Ρ€ΠΈΠ΄Π½Ρ‹ΠΉ Ρ€Π΅ΠΆΠΈΠΌ ΠΌΡ‹ΡˆΠ»Π΅Π½ΠΈΡ

Π Π΅ΠΆΠΈΠΌ Ρ€Π°Π·ΠΌΡ‹ΡˆΠ»Π΅Π½ΠΈΡ активируСтся ΠΏΡ€ΠΈ ΠΎΠ±Ρ€Π°Π±ΠΎΡ‚ΠΊΠ΅ слоТных Π·Π°Π΄Π°Ρ‡, обСспСчивая ΠΏΠΎΡˆΠ°Π³ΠΎΠ²Ρ‹ΠΉ Π°Π½Π°Π»ΠΈΠ· запроса ΠΈ Ρ„ΠΎΡ€ΠΌΠΈΡ€ΠΎΠ²Π°Π½ΠΈΠ΅ комплСксных, Π³Π»ΡƒΠ±ΠΎΠΊΠΈΡ… ΠΎΡ‚Π²Π΅Ρ‚ΠΎΠ².

Π‘Π°Π·ΠΎΠ²Ρ‹ΠΉ Ρ€Π΅ΠΆΠΈΠΌ ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΠ΅Ρ‚ΡΡ для повсСднСвных вопросов, позволяя Π²Ρ‹Π΄Π°Π²Π°Ρ‚ΡŒ быстрыС ΠΈ Ρ‚ΠΎΡ‡Π½Ρ‹Π΅ ΠΎΡ‚Π²Π΅Ρ‚Ρ‹ с минимальной Π·Π°Π΄Π΅Ρ€ΠΆΠΊΠΎΠΉ.

ΠŸΡ€ΠΎΡ†Π΅ΡΡ обучСния ΠΌΠΎΠ΄Π΅Π»ΠΈ устроСн ΠΏΠΎΡ…ΠΎΠΆΠΈΠΌ ΠΎΠ±Ρ€Π°Π·ΠΎΠΌ Π½Π° Ρ‚ΠΎ, ΠΊΠ°ΠΊ это сдСлано Π² DeepSeek R1.

ΠŸΠΎΠ΄Π΄Π΅Ρ€ΠΆΠΈΠ²Π°Π΅Ρ‚ 119 языков, Π²ΠΊΠ»ΡŽΡ‡Π°Ρ русский.

Π›ΠΈΡ†Π΅Π½Π·ΠΈΡ€ΠΎΠ²Π°Π½ΠΈΠ΅: Apache 2.0 πŸ”₯

πŸ”œΠŸΠΎΠΏΡ€ΠΎΠ±ΠΎΠ²Π°Ρ‚ΡŒ: https://chat.qwen.ai/
πŸ”œBlog: https://qwenlm.github.io/blog/qwen3/
πŸ”œGitHub: https://github.com/QwenLM/Qwen3
πŸ”œHugging Face: https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f
πŸ”œ ModelScope: https://modelscope.cn/collections/Qwen3-9743180bdc6b48

@ai_machinelearning_big_data

#Qwen

BY Machinelearning





Share with your friend now:
group-telegram.com/ai_machinelearning_big_data/7469

View MORE
Open in Telegram


Telegram | DID YOU KNOW?

Date: |

False news often spreads via public groups, or chats, with potentially fatal effects. The regulator took order for the search and seizure operation from Judge Purushottam B Jadhav, Sebi Special Judge / Additional Sessions Judge. That hurt tech stocks. For the past few weeks, the 10-year yield has traded between 1.72% and 2%, as traders moved into the bond for safety when Russia headlines were uglyβ€”and out of it when headlines improved. Now, the yield is touching its pandemic-era high. If the yield breaks above that level, that could signal that it’s on a sustainable path higher. Higher long-dated bond yields make future profits less valuableβ€”and many tech companies are valued on the basis of profits forecast for many years in the future. Anastasia Vlasova/Getty Images This ability to mix the public and the private, as well as the ability to use bots to engage with users has proved to be problematic. In early 2021, a database selling phone numbers pulled from Facebook was selling numbers for $20 per lookup. Similarly, security researchers found a network of deepfake bots on the platform that were generating images of people submitted by users to create non-consensual imagery, some of which involved children.
from tw


Telegram Machinelearning
FROM American