
Hi, I'm Ben Hoover

I'm an AI Researcher studying memory

Understanding AI foundation models from the perspective of large Associative Memories.

I am a Machine Learning PhD student at Georgia Tech, advised by Polo Chau, and an AI Research Engineer at IBM Research. My research focuses on building more interpretable and parameter-efficient AI by rethinking the way we train and build deep models, taking inspiration from Associative Memories and Hopfield Networks. I like to visualize what happens inside AI models.

News

July 2025
📒 Diffusion Explainer is highlighted in Quanta Magazine's "How Physics Made Modern AI Possible"
July 2025
✈️ I will be at ICML in Vancouver presenting our Associative Memory Tutorial and ConceptAttention (oral 🏆). Feel free to reach out!
June 2025
📒 I got to chime in on Quanta's article on creativity, memory, and diffusion models. Always happy to talk about these cool ideas!
June 2025
πŸŽ–οΈ IBM's coverage of Dmitry Krotov's scientific ambitions made the front page of hacker news, go Dima!
June 2025
✈️ I will be at CVPR in Nashville (June 12-16) presenting ConceptAttention and 3D Gaussian Splat Vulnerabilities at the VisCon Workshop and NeuralBCC Workshop, respectively. Feel free to reach out!
June 2025
📖 Quanta quoted me in an article on AI creativity regarding my work on Memory and Diffusion 🤗.
See more...

Research Highlights


Tutorial on Associative Memory

Energy-based Associative Memory has transformed the field of AI, yet it remains poorly understood. We present a bird's-eye view of Associative Memory, beginning with the invention of the Hopfield Network and concluding with modern, dense-storage versions that have strong connections to Transformers and kernel methods.

ConceptAttention

We discover how to extract highly salient features from the learned representations of Diffusion Transformers and use them to segment images by semantic concept. Our technique, ConceptAttention, outperforms all prior methods on single- and multi-class zero-shot segmentation.

DenseAMs meet Random Features

DenseAMs can store an exponential number of memories, but each memory adds new parameters to the energy function. We propose a novel "Distributed Representation for DenseAMs" (DrDAM) that lets us add new memories without increasing the total number of parameters.
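To make the idea concrete, here is a minimal JAX sketch of compressing all stored patterns into one fixed-size vector by approximating the DenseAM log-sum-exp energy with positive random features. The feature map and the names (phi, T, energy_drdam) are illustrative assumptions of mine, not the paper's implementation.

```python
# Minimal sketch (not the paper's code) of the random-feature idea behind DrDAM:
# approximate the DenseAM log-sum-exp energy with positive random features so that
# all stored patterns are compressed into one fixed-size vector T.
import jax
import jax.numpy as jnp

def phi(x, W, beta):
    """Positive random features with E[phi(x) @ phi(y)] ~= exp(beta * x @ y)."""
    z = jnp.sqrt(beta) * x
    return jnp.exp(W @ z - 0.5 * jnp.dot(z, z)) / jnp.sqrt(W.shape[0])

d, n_feat, n_mem, beta = 16, 8192, 64, 0.25
W = jax.random.normal(jax.random.PRNGKey(0), (n_feat, d))           # fixed projection
memories = jax.random.normal(jax.random.PRNGKey(1), (n_mem, d)) / jnp.sqrt(d)

# Distributed memory: one vector of size n_feat, no matter how many patterns we add.
T = jax.vmap(lambda m: phi(m, W, beta))(memories).sum(axis=0)

def energy_exact(q):      # parameters grow with every stored memory
    return -(1.0 / beta) * jax.nn.logsumexp(beta * memories @ q)

def energy_drdam(q):      # parameter count is fixed by n_feat
    return -(1.0 / beta) * jnp.log(T @ phi(q, W, beta))

q = memories[0] + 0.05 * jax.random.normal(jax.random.PRNGKey(2), (d,))
print(energy_exact(q), energy_drdam(q))   # should roughly agree for large n_feat
```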

Transformer Explainer

Transformers are the most powerful AI innovation of the last decade. Learn how they work by interacting with every mechanism from the comfort of your web browser. Taught in Georgia Tech CSE6242 Data and Visual Analytics (typically 250-300 students per semester).

Memory in Plain Sight

We are the first to show that diffusion models perform memory retrieval in their denoising dynamics.

Energy Transformer

We derive an Associative Memory inspired by the famous Transformer architecture, where the forward pass through the model is memory retrieval by energy descent.
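As a toy illustration of what "the forward pass is energy descent" means, the sketch below updates token states by gradient descent on a simple attention-flavored energy. The energy here is an illustrative stand-in, not the actual Energy Transformer energy, and the names (energy, forward, Wk, Wq) are my own.

```python
# Toy illustration of "forward pass = energy descent" (not the actual Energy
# Transformer energy): token states x are updated by gradient descent on a scalar energy.
import jax
import jax.numpy as jnp

def energy(x, Wk, Wq, beta=1.0):
    """Attention-flavored log-sum-exp energy over all token pairs."""
    k, q = x @ Wk, x @ Wq
    return -(1.0 / beta) * jnp.sum(jax.nn.logsumexp(beta * q @ k.T, axis=-1))

def forward(x, Wk, Wq, steps=20, lr=1e-2):
    grad_E = jax.grad(energy)                 # autograd gives the update direction
    for _ in range(steps):
        x = x - lr * grad_E(x, Wk, Wq)        # inference = descending the energy
    return x

d, n_tok = 32, 8
x = jax.random.normal(jax.random.PRNGKey(0), (n_tok, d))
Wk = jax.random.normal(jax.random.PRNGKey(1), (d, d)) / d
Wq = jax.random.normal(jax.random.PRNGKey(2), (d, d)) / d
x_out = forward(x, Wk, Wq)
print(energy(x, Wk, Wq), energy(x_out, Wk, Wq))   # energy should decrease
```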

Diffusion Explainer

Diffusion models are complicated. We break down Stable Diffusion and explain each component of the model visually. Taught in Georgia Tech CSE6242 Data and Visual Analytics (typically 250-300 students per semester).

HAMUX

We invent a software abstraction around "synapses" and "neurons" for assembling the energy functions of complicated Associative Memories; memory retrieval is then performed through autograd.
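Here is a from-scratch sketch of that idea (not the HAMUX API): assemble a total energy from a "neuron" term and a "synapse" term, then retrieve a memory by descending it with autograd. All function names below are illustrative.

```python
# From-scratch sketch of the idea behind HAMUX (not its actual API): build a total
# energy from "neuron" and "synapse" pieces, then retrieve memories with autograd.
import jax
import jax.numpy as jnp

def neuron_energy(x):
    # a simple quadratic "neuron" term that keeps the state from blowing up
    return 0.5 * jnp.sum(x ** 2)

def synapse_energy(x, memories, beta=8.0):
    # DenseAM-style coupling between the state and the stored patterns
    return -(1.0 / beta) * jax.nn.logsumexp(beta * memories @ x)

def total_energy(x, memories):
    return neuron_energy(x) + synapse_energy(x, memories)

def retrieve(x, memories, steps=100, lr=0.1):
    grad_E = jax.grad(total_energy)
    for _ in range(steps):
        x = x - lr * grad_E(x, memories)      # descend the assembled energy
    return x

memories = jnp.eye(8)                         # 8 one-hot patterns as toy memories
noisy = memories[3] + 0.3 * jax.random.normal(jax.random.PRNGKey(0), (8,))
print(jnp.round(retrieve(noisy, memories), 2))   # should land near memories[3]
```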

RXNMapper

We discover that Transformers trained on chemical reactions learn, on their own, how atoms physically rearrange.