group-telegram.com/neural_cell/225
tasty ai papers | december 2024
what: train llama on raw bytes without a fixed vocabulary.
- a small local encoder dynamically groups bytes into patches
- the main decoder processes these patches autoregressively
- a small local decoder predicts the next byte.
paper: https://arxiv.org/abs/2412.09871
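the pipeline above can be sketched roughly like this. note: the real paper uses a small transformer's next-byte entropy to decide patch boundaries; the bigram-frequency "surprise" score, the threshold, and the max patch length below are all my own illustrative stand-ins.

```python
import math
from collections import Counter

def bigram_surprise(data: bytes):
    """Rough per-byte surprise: rarer byte-to-byte transitions score higher.
    Stand-in for the small byte-level entropy model in the paper."""
    pairs = Counter(zip(data, data[1:]))
    total = sum(pairs.values()) or 1
    scores = [0.0]  # first byte has no context
    for a, b in zip(data, data[1:]):
        p = pairs[(a, b)] / total
        scores.append(-math.log2(p))
    return scores

def patch_bytes(data: bytes, threshold: float = 4.0, max_len: int = 16):
    """Cut a new patch when surprise spikes or the patch grows too long.
    The main decoder would then attend over these patches, not raw bytes."""
    scores = bigram_surprise(data)
    patches, start = [], 0
    for i in range(1, len(data)):
        if scores[i] > threshold or i - start >= max_len:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

patches = patch_bytes(b"hello hello hello world!")
```

the point of the dynamic cut: predictable runs get merged into long patches, surprising regions get short ones, so compute is spent where the data is hard.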
what: work with entire sentences as "concepts" through SONAR embeddings.
- quite similar in spirit to the first paper, but instead of byte patches it merges tokens into high-dimensional sentence embeddings
- the model works directly on these sentence-level embeddings.
paper: https://arxiv.org/abs/2412.08821
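a toy sketch of that concept loop: sentences become fixed-size vectors, the model operates on the vector sequence, and a decoder maps a predicted vector back to text. the hash-based encoder, the mean-of-context "predictor", and the nearest-neighbour decoder are all placeholders for SONAR and the actual concept model, not the paper's method.

```python
import hashlib
import math

DIM = 8  # real SONAR embeddings are much larger; tiny here for clarity

def encode(sentence: str):
    """Deterministic stand-in for a sentence encoder like SONAR."""
    h = hashlib.sha256(sentence.encode()).digest()
    vec = [b / 255.0 for b in h[:DIM]]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def predict_next(context_vecs):
    """Placeholder 'concept model': just averages the context embeddings."""
    n = len(context_vecs)
    return [sum(v[i] for v in context_vecs) / n for i in range(DIM)]

def decode(vec, candidates):
    """Nearest-neighbour decode back into one of the known sentences."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(candidates, key=lambda s: dist(encode(s), vec))

doc = ["Cats sleep a lot.", "They dream too.", "Science agrees."]
ctx = [encode(s) for s in doc[:2]]
guess = decode(predict_next(ctx), doc)
```

the structural idea is the same as in the byte paper, one level up: autoregression runs over a short sequence of dense units instead of a long sequence of tokens.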
what: a diffusion model for probabilistic weather forecasting that generates 15-day predictions at 12-hour steps
how:
- It aggregates two previous timesteps to predict the next weather state
- Instead of directly sampling weather state, it generates residuals (differences) relative to the previous state.
- Artemy wrote a review in Russian in the «AI для Всех» channel; worth a read.
paper: https://www.nature.com/articles/s41586-024-08252-9
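the two tricks above (two-timestep context, residual rather than absolute states) can be shown in a minimal rollout sketch. the `denoise` function here is a made-up linear rule standing in for GenCast's learned diffusion denoiser; everything below is illustrative, not the paper's architecture.

```python
import random

def denoise(noisy_residual, prev, prev2):
    """Toy stand-in for the learned denoiser: pulls the noisy sample
    toward the trend implied by the two previous timesteps."""
    trend = [a - b for a, b in zip(prev, prev2)]
    return [0.5 * n + 0.5 * t for n, t in zip(noisy_residual, trend)]

def sample_residual(prev, prev2, steps=4, rng=random.Random(0)):
    """Diffusion-style sampling: start from noise, iteratively denoise.
    The output is a residual (delta), not the next state itself."""
    x = [rng.gauss(0, 1) for _ in prev]
    for _ in range(steps):
        x = denoise(x, prev, prev2)
    return x

def forecast(prev2, prev, horizon=3):
    """Roll out 'horizon' 12-hour steps autoregressively:
    next_state = prev_state + sampled_residual."""
    states = [prev2, prev]
    for _ in range(horizon):
        delta = sample_residual(states[-1], states[-2])
        states.append([s + d for s, d in zip(states[-1], delta)])
    return states[2:]

path = forecast([0.9, 1.8], [1.0, 2.0])
```

predicting residuals keeps the sampler's target small and roughly zero-centred, which is an easier distribution to model than full atmospheric states; sampling many rollouts gives the probabilistic ensemble.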
my thoughts:
Looks like we're finally getting closer to how humans actually process language, not just crunching tokens like robots. Whether it's patching bytes or bundling tokens into sentence embeddings, this hierarchical approach seems to be the way forward.
GenCast is just a super interesting application of modern AI to real problems in natural science.