Sebastian Raschka's Blog
April 18, 2025
The State of Reinforcement Learning for LLM Reasoning
A lot has happened this month, especially with the releases of new flagship models like GPT-4.5 and Llama 4. But you might have noticed that reactions to these releases were relatively muted. Why? One reason could be that GPT-4.5 and Llama 4 remain conventional models, which means they were trained without explicit reinforcement learning for reasoning. However, OpenAI's recent release of the o3 reasoning model demonstrates there is still considerable room for improvement when investing compute strategically, specifically via reinforcement learning methods tailored for reasoning tasks. While reasoning alone isn't a silver bullet, it reliably improves model accuracy and problem-solving capabilities on challenging tasks (so far). And I expect reasoning-focused post-training to become standard practice in future LLM pipelines. So, in this article, let's explore the latest developments in reasoning via reinforcement learning.
Published on April 18, 2025 17:00
March 28, 2025
First Look at Reasoning From Scratch: Chapter 1
As you know, I've been writing a lot lately about the latest research on reasoning in LLMs. Before my next research-focused blog post, I wanted to offer something special to my paid subscribers as a thank-you for your ongoing support. So, I've started writing a new book on how reasoning works in LLMs, and here I'm sharing Chapter 1 with you. This ~15-page chapter is an introduction to reasoning in the context of LLMs and provides an overview of methods like inference-time scaling and reinforcement learning. Thanks for your support! I hope you enjoy the chapter, and stay tuned for my next blog post on reasoning research!
Published on March 28, 2025 23:03
March 7, 2025
Inference-Time Compute Scaling Methods to Improve Reasoning Models
This article explores recent research advancements in reasoning-optimized LLMs, with a particular focus on the inference-time compute scaling methods that have emerged since the release of DeepSeek R1.
Published on March 07, 2025 22:03
February 4, 2025
Understanding Reasoning LLMs
In this article, I will describe the four main approaches to building reasoning models, or how we can enhance LLMs with reasoning capabilities. I hope this provides valuable insights and helps you navigate the rapidly evolving literature and hype surrounding this topic.
Published on February 04, 2025 22:03
January 22, 2025
Noteworthy LLM Research Papers of 2024
This article covers 12 influential AI research papers of 2024, ranging from mixture-of-experts models to new LLM scaling laws for precision.
Published on January 22, 2025 22:03
January 16, 2025
Implementing A Byte Pair Encoding (BPE) Tokenizer From Scratch
This is a standalone notebook implementing the popular byte pair encoding (BPE) tokenization algorithm, which is used in models from GPT-2 to GPT-4, Llama 3, etc., from scratch for educational purposes.
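For readers who want the gist before opening the notebook, here is a minimal sketch of the core BPE training loop (not the notebook's code): repeatedly count adjacent symbol pairs and merge the most frequent pair into a new symbol. Real implementations operate on bytes and track a merge table for encoding new text; this toy version starts from characters.

```python
from collections import Counter

def get_pair_counts(tokens):
    # Count every adjacent pair of symbols in the sequence
    return Counter(zip(tokens, tokens[1:]))

def merge_pair(tokens, pair, new_symbol):
    # Replace each occurrence of `pair` with the merged symbol
    out, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_symbol)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    tokens = list(text)  # start from individual characters (bytes in real BPE)
    merges = []
    for _ in range(num_merges):
        counts = get_pair_counts(tokens)
        if not counts:
            break
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        tokens = merge_pair(tokens, pair, pair[0] + pair[1])
        merges.append(pair)
    return tokens, merges
```

For example, `train_bpe("aaabdaaabac", 2)` first merges `('a', 'a')` into `"aa"`, then `('aa', 'a')` into `"aaa"`.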
Published on January 16, 2025 22:03
December 28, 2024
LLM Research Papers: The 2024 List
I want to share my running bookmark list of many fascinating (mostly LLM-related) papers I stumbled upon in 2024. It's just a list, but maybe it will come in handy for those who are interested in finding some gems to read for the holidays.
Published on December 28, 2024 22:03
November 2, 2024
Understanding Multimodal LLMs
There has been a lot of new research on the multimodal LLM front, including the latest Llama 3.2 vision models, which employ diverse architectural strategies to integrate various data types like text and images. For instance, the decoder-only method uses a single stack of decoder blocks to process all modalities sequentially. On the other hand, cross-attention methods (for example, used in Llama 3.2) involve separate encoders for different modalities with a cross-attention layer that allows these encoders to interact. This article explains how these different types of multimodal LLMs function. Additionally, I will review and summarize roughly a dozen other recent multimodal papers and models published in recent weeks to compare their approaches.
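To make the cross-attention idea concrete, here is a minimal NumPy sketch (illustrative only, not any model's actual implementation): queries come from the text stream while keys and values come from the image encoder, so text tokens can attend to image features. The dimensions and random projection weights are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text_hidden, image_hidden, d_k=16, seed=0):
    # Queries from the text stream; keys/values from the image encoder,
    # letting text tokens "look at" image features (Llama 3.2-style designs).
    rng = np.random.default_rng(seed)
    d_text, d_img = text_hidden.shape[-1], image_hidden.shape[-1]
    W_q = rng.normal(size=(d_text, d_k)) / np.sqrt(d_text)
    W_k = rng.normal(size=(d_img, d_k)) / np.sqrt(d_img)
    W_v = rng.normal(size=(d_img, d_k)) / np.sqrt(d_img)
    Q = text_hidden @ W_q
    K = image_hidden @ W_k
    V = image_hidden @ W_v
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # (n_text, n_img) attention weights
    return attn @ V                          # image-informed text representations
```

In the decoder-only alternative, no such layer exists: image features are projected into token embeddings and simply concatenated into the decoder's input sequence.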
Published on November 02, 2024 23:03
September 20, 2024
Building A GPT-Style LLM Classifier From Scratch
This article shows you how to transform pretrained large language models (LLMs) into strong text classifiers. But why focus on classification? First, finetuning a pretrained model for classification offers a gentle yet effective introduction to model finetuning. Second, many real-world and business challenges revolve around text classification: spam detection, sentiment analysis, customer feedback categorization, topic labeling, and more.
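The general recipe (sketched here with stand-in NumPy code, not the article's actual implementation) is to replace the LLM's vocabulary-projection output layer with a small trainable classification head applied to the final token's hidden state. All dimensions and the random-feature "backbone" below are hypothetical placeholders for a real pretrained model.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical dimensions, for illustration only
emb_dim, num_classes = 768, 2

def backbone(token_ids):
    # Stand-in for a frozen pretrained LLM: maps token ids to a
    # sequence of hidden states (random features here)
    return rng.normal(size=(len(token_ids), emb_dim))

# New trainable head replacing the vocab-projection layer
W_head = rng.normal(size=(emb_dim, num_classes)) * 0.02
b_head = np.zeros(num_classes)

def classify(token_ids):
    hidden = backbone(token_ids)
    last = hidden[-1]                  # use the final token's representation
    logits = last @ W_head + b_head    # class scores, e.g. spam vs. not spam
    return int(np.argmax(logits))
```

During finetuning, only `W_head`/`b_head` (and optionally the last few transformer blocks) would be updated, which is what makes this a gentle entry point.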
Published on September 20, 2024 23:03