tasty ai papers | december 2024

1️⃣ Byte Latent Transformer: Patches Scale Better Than Tokens

what: train Llama on raw bytes without a fixed vocabulary.
- dynamically groups bytes into patches using a small local encoder
- the main decoder processes these patches in an autoregressive (AR) setting
- a local decoder makes the next-byte prediction.
paper: https://arxiv.org/abs/2412.09871
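
A minimal sketch of the dynamic patching idea, assuming a new patch starts wherever a small model is uncertain about the next byte. The function name `patch_bytes`, the per-byte `scores`, and the threshold are made up for illustration; this is not the paper's code.

```python
def patch_bytes(data: bytes, scores: list[float], threshold: float) -> list[bytes]:
    """Split `data` into variable-length patches.

    `scores[i]` is an assumed uncertainty score for byte i (for example,
    the next-byte entropy from a small byte-level model). A new patch
    starts at every position whose score exceeds `threshold`.
    """
    if len(data) != len(scores):
        raise ValueError("need exactly one score per byte")
    patches = []
    start = 0
    for i in range(1, len(data)):
        if scores[i] > threshold:
            patches.append(data[start:i])  # close the current patch
            start = i                      # byte i opens the next one
    if data:
        patches.append(data[start:])       # flush the final patch
    return patches


# Hand-written scores: the model is "surprised" at the space and at "w".
text = b"hello world"
scores = [0.9, 0.1, 0.1, 0.1, 0.1, 0.8, 0.7, 0.1, 0.1, 0.1, 0.1]
print(patch_bytes(text, scores, threshold=0.5))  # [b'hello', b' ', b'world']
```

Predictable stretches end up in long patches and surprising bytes get short ones, so the main decoder runs fewer steps on easy text.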

2️⃣ Large Concept Models: Language Modeling in a Sentence Representation Space

what: work with entire sentences as "concepts" through SONAR embeddings.
- quite similar to the first paper here, but it merges tokens into high-dimensional embeddings
- works with sentence-level embeddings directly.

paper: https://arxiv.org/abs/2412.08821
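
A rough sketch of what "next-concept prediction" data could look like, assuming the model sees the embeddings of previous sentences and has to predict the embedding of the next one. `toy_embed` is a hash-based stand-in, not SONAR, and all names here are hypothetical.

```python
import hashlib
import re

EMBED_DIM = 8  # toy size; real sentence embeddings are much larger


def split_sentences(text: str) -> list[str]:
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def toy_embed(sentence: str) -> list[float]:
    # Stand-in for a real sentence encoder: a deterministic pseudo-embedding
    # built from a hash. It carries no meaning, it only has the right shape.
    digest = hashlib.sha256(sentence.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:EMBED_DIM]]


def next_concept_pairs(text: str) -> list[tuple[list[list[float]], list[float]]]:
    # Each training pair is (embeddings of sentences 0..i-1, embedding of sentence i).
    embs = [toy_embed(s) for s in split_sentences(text)]
    return [(embs[:i], embs[i]) for i in range(1, len(embs))]


pairs = next_concept_pairs("A cat sat. It purred! Then it slept.")
for context, target in pairs:
    print(len(context), "context sentence(s) ->", len(target), "dim target")
```

The sequence length here is the number of sentences, not the number of tokens, which is the point of working at the concept level.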

3️⃣ GenCast predicts weather and the risks of extreme conditions with state-of-the-art accuracy

what: a diffusion model for probabilistic weather forecasting that generates 15-day predictions in 12-hour steps
how:
- It conditions on the two previous timesteps to predict the next weather state
- Instead of sampling the weather state directly, it generates residuals (differences) relative to the previous state.
- Artemy wrote a review in Russian on the AI для Всех channel, give it a read.

paper: https://www.nature.com/articles/s41586-024-08252-9
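
A toy sketch of the rollout described above, assuming each step takes the two latest states, samples a residual, and adds it to the latest state. `toy_sample_residual` is a made-up placeholder for the diffusion sampler, and the state is just a short list of numbers.

```python
import random

STEP_HOURS = 12
HORIZON_DAYS = 15
NUM_STEPS = HORIZON_DAYS * 24 // STEP_HOURS  # 15 days at 12-hour steps = 30 steps


def toy_sample_residual(prev, curr, rng):
    # Placeholder for the diffusion sampler (hypothetical): continue half of
    # the recent trend and add Gaussian noise so samples differ between runs.
    return [0.5 * (c - p) + rng.gauss(0.0, 0.1) for p, c in zip(prev, curr)]


def rollout(prev, curr, rng, num_steps=NUM_STEPS):
    # Autoregressive rollout: next state = current state + sampled residual.
    trajectory = []
    for _ in range(num_steps):
        residual = toy_sample_residual(prev, curr, rng)
        nxt = [c + r for c, r in zip(curr, residual)]
        trajectory.append(nxt)
        prev, curr = curr, nxt  # slide the two-state window forward
    return trajectory


def ensemble(prev, curr, members, seed=0):
    # A probabilistic forecast is a set of independent rollouts.
    return [rollout(prev, curr, random.Random(seed + m)) for m in range(members)]


runs = ensemble(prev=[0.0, 10.0], curr=[1.0, 10.5], members=3)
print(len(runs), "members,", len(runs[0]), "steps each")
```

Because every rollout draws its own noise, the spread across members gives a rough picture of forecast uncertainty.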

my thoughts:
Looks like we're finally getting closer to how humans actually process language, not just crunching tokens like robots. Whether it's patching bytes or bundling tokens into sentence embeddings, this hierarchical approach seems to be the way forward.
GenCast is just a super interesting application of modern AI to real problems in natural science.

BY the last neural cell

