tasty ai papers | december 2024

1️⃣ Byte Latent Transformer: Patches Scale Better Than Tokens

what: train llama on raw bytes without a fixed vocabulary.
- dynamically groups bytes into patches using a small local encoder
- main decoder processes these patches in an AR setting
- local decoder makes the next-byte prediction.
paper: https://arxiv.org/abs/2412.09871
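
A minimal sketch of the dynamic patching idea from the bullets above, assuming a patch boundary is placed wherever the next byte is hard to predict. The bigram count model here is a toy stand-in for a small learned byte-level model, and `bigram_entropies`, `patch_bytes` and the threshold value are illustrative names and numbers, not anything taken from the paper's code.

```python
import math
from collections import Counter, defaultdict


def bigram_entropies(data: bytes) -> list[float]:
    """Entropy (in bits) of the next-byte distribution at each position.

    Toy assumption: the distribution is estimated from bigram counts
    over `data` itself instead of a trained byte-level model.
    """
    follow = defaultdict(Counter)
    for prev, nxt in zip(data, data[1:]):
        follow[prev][nxt] += 1

    entropy_after = {}
    for prev, counts in follow.items():
        total = sum(counts.values())
        entropy_after[prev] = -sum(
            c / total * math.log2(c / total) for c in counts.values()
        )

    # Position i is scored by how uncertain byte i is given byte i-1.
    # The first byte has no context, so it always opens a patch.
    return [math.inf] + [entropy_after[p] for p in data[:-1]]


def patch_bytes(data: bytes, threshold: float) -> list[bytes]:
    """Split `data` into patches, opening a new patch at high-entropy bytes."""
    if not data:
        return []
    entropies = bigram_entropies(data)
    patches, start = [], 0
    for i in range(1, len(data)):
        if entropies[i] > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches


if __name__ == "__main__":
    for patch in patch_bytes(b"the cat sat on the mat", threshold=1.0):
        print(patch)
```

Predictable stretches end up in longer patches and surprising bytes open new ones, so the main decoder would spend its steps where the data is harder to predict.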

2️⃣ Large Concept Models: Language Modeling in a Sentence Representation Space

what: work with entire sentences as "concepts" through SONAR embeddings.
- quite similar to the first paper here, but it merges tokens into high-dim embeddings
- works with sentence-level embeddings directly.

paper: https://arxiv.org/abs/2412.08821
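
A rough sketch of what "language modeling over sentences" looks like as data: the text is split into sentences, each sentence becomes one fixed-size vector, and the model is asked to predict the next vector from the previous ones. `toy_embed` is a hypothetical hash-based stand-in for a real sentence encoder such as SONAR (no real SONAR API is used here), and `DIM`, `split_sentences` and `concept_pairs` are illustrative names.

```python
import hashlib
import re

DIM = 8  # toy embedding size; real sentence embeddings are much larger


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def toy_embed(sentence: str) -> list[float]:
    """Hypothetical encoder: deterministic hash of the sentence -> vector in [0, 1)."""
    digest = hashlib.sha256(sentence.encode("utf-8")).digest()
    return [b / 256 for b in digest[:DIM]]


def concept_pairs(text: str) -> list[tuple[list[list[float]], list[float]]]:
    """Build (context concepts, next concept) pairs for next-concept prediction."""
    concepts = [toy_embed(s) for s in split_sentences(text)]
    return [(concepts[:i], concepts[i]) for i in range(1, len(concepts))]


if __name__ == "__main__":
    text = "Bytes are grouped. Sentences are embedded! Is that all?"
    for context, target in concept_pairs(text):
        print(len(context), "context concept(s) ->", [round(v, 2) for v in target])
```

The sequence the model sees has one element per sentence rather than one per token, which is the hierarchical step this paper shares with the first one.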

3️⃣ GenCast predicts weather and the risks of extreme conditions with state-of-the-art accuracy

what: A diffusion model for probabilistic weather forecasting that generates 15-day predictions in 12-hour steps
how:
- It conditions on the two previous timesteps to predict the next weather state
- Instead of directly sampling the weather state, it generates residuals (differences) relative to the previous state.
- Artemy wrote a review in Russian on the AI для Всех channel, give it a read.

paper: https://www.nature.com/articles/s41586-024-08252-9
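
The rollout described above can be sketched as a loop: 15 days at 12-hour steps is 30 steps, each step looks at the two previous states, and the sampled output is a residual added to the latest state. `toy_residual` is a made-up sampler (trend persistence plus Gaussian noise) standing in for the diffusion model, and the state is a short list of numbers instead of a global weather grid.

```python
import random

STEPS = 15 * 24 // 12  # 15 days at 12-hour steps = 30 steps


def rollout(x_prev, x_curr, sample_residual, steps=STEPS):
    """Autoregressive forecast: next state = current state + sampled residual."""
    states = []
    for _ in range(steps):
        residual = sample_residual(x_prev, x_curr)
        x_next = [c + r for c, r in zip(x_curr, residual)]
        states.append(x_next)
        x_prev, x_curr = x_curr, x_next  # slide the two-state window forward
    return states


def toy_residual(x_prev, x_curr, rng):
    """Hypothetical sampler: half of the last trend plus noise, not a real diffusion model."""
    return [0.5 * (c - p) + rng.gauss(0.0, 0.1) for p, c in zip(x_prev, x_curr)]


def ensemble(x_prev, x_curr, members=4):
    """Probabilistic forecast as several rollouts with different random seeds."""
    runs = []
    for seed in range(members):
        rng = random.Random(seed)
        runs.append(rollout(x_prev, x_curr, lambda p, c: toy_residual(p, c, rng)))
    return runs


if __name__ == "__main__":
    for member in ensemble([10.0, 1000.0], [10.5, 998.0]):
        print([round(v, 2) for v in member[-1]])
```

Because every member samples its own residuals, the spread across members at a given lead time gives a rough picture of forecast uncertainty.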

my thoughts:
Looks like we're finally getting closer to how humans actually process language, not just crunching tokens like robots. Whether it's patching bytes or bundling tokens into sentence embeddings, this hierarchical approach seems to be the way forward.
GenCast is just a super interesting application of modern AI to real problems in natural science.
group-telegram.com/neural_cell/225
BY the last neural cell

