Mean-Field Theory

Spin-Model Transformers

A non-equilibrium statistical mechanics perspective on transformers

Deep Implicit Attention: A Mean-Field Theory Perspective on Attention Mechanisms

Can we model attention as the collective response of a statistical-mechanical system?