• Home
  • World Model

    June 2025

  • Robotic Control via Embodied Chain-of-Thought Reasoning

    April 2025

  • Swarm of Attention Variants

    April 2025

  • RoPE

    April 2025

  • Re-Mix: Optimizing Data Mixture for Large Scale Imitation Learning

    April 2025

  • (Q?)KV Cache

    April 2025

  • Tiny Stories

    April 2025

  • VolGAN

    March 2025

  • Generative Adversarial Networks

    March 2025

  • Understanding the Implied Volatility Surface

    March 2025

  • GRPO

    February 2025

  • The Inner Mechanism of Byte Pair Encoding

    January 2025

  • Improving Language Understanding by Generative Pre-Training

    January 2025

  • Transformers

    January 2025

  • Attention

    January 2025

  • Neural Scaling Laws

    January 2025

  • Mathematics of BPTT

    January 2025

  • Residuals

    November 2024

  • Backprop through Convolutions

    October 2024