State of RL for reasoning LLMs

March 15, 2026 · 26 min read

Evolution of reinforcement learning for reasoning LLMs