Back to the future, with full encoder-decoder architectures!
Google has released T5Gemma: https://developers.googleblog.com/en/t5gemma/
Explore T5Gemma – a new collection of encoder-decoder LLMs offering superior performance and efficiency – especially for tasks requiring deep input understanding, like summarization and translation, built on Gemma 2 models.
Today's popular news item :)
https://www.reuters.com/business/ai-slows-down-some-experienced-software-developers-study-finds-2025-07-10/
Before the study, the open-source developers believed using AI would speed them up, estimating it would decrease task completion time by 24%. Even after completing the tasks with AI, the developers believed that they had decreased task times by 20%. But the study found that using AI did the opposite: it increased task completion time by 19%.
Source: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
AI slows down some experienced software developers, study finds
Contrary to popular belief, using cutting-edge artificial intelligence tools slowed down experienced software developers when they were working in codebases familiar to them, rather than supercharging their work, a new study found.
An interesting architectural innovation: trilinear attention, where each Q is matched not with a single K but with two different ones. A valuable bonus is a better scaling-law exponent, which means you can train better models on the same amount of data.
https://www.group-telegram.com/gonzo_ML.com_podcasts/436
Fast and Simplex: 2-Simplicial Attention in Triton
Aurko Roy, Timothy Chou, Sai Surya Duvvuri, Sijia Chen, Jiecao Yu, Xiaodong Wang, Manzil Zaheer, Rohan Anil
Статья: https://arxiv.org/abs/2507.02754
Англ версия: https://arxiviq.substack.com/p/fast-and-simplex…
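Roughly, the trilinear idea can be sketched like this: instead of a bilinear score between one query and one key, each query scores a *pair* of keys, and the softmax runs jointly over all such pairs. This is a toy NumPy sketch of my reading of the setup (shapes, the joint softmax over key pairs, and the elementwise combination of the two value sets are illustrative assumptions, not the paper's actual Triton kernel):

```python
import numpy as np

def two_simplicial_attention(Q, K1, K2, V1, V2):
    """Toy 2-simplicial (trilinear) attention for a single head.

    Q: (n, d) queries; K1, V1: (m1, d); K2, V2: (m2, d).
    Each query attends to all (j, k) pairs drawn from the two key sets.
    """
    # trilinear logits: logits[i, j, k] = sum_d Q[i,d] * K1[j,d] * K2[k,d]
    logits = np.einsum("id,jd,kd->ijk", Q, K1, K2)

    # softmax jointly over all (j, k) pairs for each query
    n = logits.shape[0]
    flat = logits.reshape(n, -1)
    flat = flat - flat.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(flat)
    weights /= weights.sum(axis=-1, keepdims=True)
    weights = weights.reshape(logits.shape)

    # combine the two value sets with an elementwise product per (j, k) pair
    # (one common choice in 2-simplicial formulations; an assumption here)
    return np.einsum("ijk,jd,kd->id", weights, V1, V2)
```

Note the cost: the logit tensor is O(n * m1 * m2), which is exactly why the paper's contribution is an efficient kernel rather than the naive materialization above.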
No paper breakdowns on Shabbat, but here is some reading for you if you have not seen it yet.
Schmidhuber on the history of modern AI.
https://people.idsia.ch/~juergen/deep-learning-history.html
Timeline: artificial neural networks, deep learning, etc
Annotated history of modern AI and deep learning
An open-source model with 1T parameters! For those whose spare DGXs are sitting idle, apparently :)
https://github.com/MoonshotAI/Kimi-K2
Trained with the Muon optimizer (https://www.group-telegram.com/gonzo_ML.com/3591), by the way.
GitHub - MoonshotAI/Kimi-K2: Kimi K2 is the large language model series developed by Moonshot AI team
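Muon, in outline, keeps a momentum buffer per weight matrix and orthogonalizes the update with a Newton-Schulz iteration before taking the step. A rough NumPy sketch (the quintic coefficients follow the public reference implementation; the learning rate and momentum values are illustrative, not Kimi K2's actual settings):

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=5):
    """Approximately orthogonalize G (push its singular values toward 1)."""
    a, b, c = 3.4445, -4.7750, 2.0315  # quintic coefficients from the reference impl
    X = G / (np.linalg.norm(G) + 1e-7)  # normalize so the iteration converges
    transposed = X.shape[0] > X.shape[1]
    if transposed:
        X = X.T  # work with the wide orientation
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * (A @ A)) @ X
    return X.T if transposed else X

def muon_step(W, G, momentum, lr=0.02, beta=0.95):
    """One Muon update: momentum accumulation, then an orthogonalized step."""
    momentum = beta * momentum + G
    W = W - lr * newton_schulz_orthogonalize(momentum)
    return W, momentum
```

In practice Muon is used only for 2D hidden-layer weight matrices, with a standard optimizer such as AdamW handling embeddings, norms, and biases.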