Hello! 👋

I’m Kola, an ML Researcher based in London, UK. This is my technical blog about Machine Learning.

My current main research interests are:

  1. Mechanistic Interpretability - broadly defined as the study of reverse-engineering neural networks, going from their learned weights to human-interpretable algorithms.
    • I’m particularly interested in universal representation learning (and its philosophical implications), compositional representations and principled approaches to feature disentanglement.
  2. Theories of Agency
    • I’m thinking in particular about Active Inference and Bayesian Mechanics as candidate unified theories of agency, which we can use to model agency at different levels of abstraction.
    • I’m especially interested in how we can use these theories to understand multi-agent systems.
    • I’m also interested in evaluations for dangerous capabilities in agentic systems, particularly in cybersecurity, one of the first applications where our theory may apply.
  3. Adaptive Neural Computation
    • I’m especially interested in methods that let networks spend more compute on difficult tokens, such as early-exiting mechanisms, Mixture of Experts (MoE) and related approaches.
    • I maintain an annotated collection of research papers in Adaptive Computation for the community.
  4. Modularity in Neural Networks
    • I’m interested in how we can understand the inherent modularity in neural networks in the wild.
    • I’m also interested in lessons about multi-task learning and generalisation from designed modular architectures.

Previous research interests have included the linguistic properties of Mathematics, ML applied to Musicology and Logic.

Find me on Substack for other writing or on GitHub for code. You can find my publications and pre-prints on Google Scholar.

Feel free to reach out by email, leave anonymous feedback, or schedule a chat with me about the topics above.