-
Shazeer Typing
-
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders
-
Standard SAEs Might Be Incoherent: A Choosing Problem & A "Concise" Solution
-
MDL-SAEs: Interpretability as Compression
-
Mamba Explained
-
The Impact of Mixtral
-
Descriptive Matrix Operations with Einops
-
Dictionary Learning with Sparse AutoEncoders
-
An Analogy for Understanding Mixture of Expert Models
-
From Sparse To Soft Mixtures of Experts
-
DeepSpeed's Bag of Tricks for Speed & Scale