
New LLM Pre-training and Post-training Paradigms

There are hundreds of LLM papers published each month proposing new techniques and approaches. However, one of the best ways to see what actually works well in practice is to look at the pre-training and post-training pipelines of the most recent state-of-the-art models. Luckily, four major new LLMs have been released in recent months, accompanied by relatively detailed technical reports. In this article, I focus on the pre-training and post-training pipelines of the following models: Alibaba's Qwen 2, Apple Intelligence Foundation Language Models, Google's Gemma 2, and Meta AI's Llama 3.1.
Published on August 16, 2024 23:03