Transformers Are Secretly Collectives of Spin Systems
A statistical mechanics perspective on transformers
Transformers from Spin Models: Approximate Free Energy Minimization
How far can we push the idea of transformers as physical systems?
Deep Implicit Attention: A Mean-Field Theory Perspective on Attention Mechanisms
Can we model attention as the collective response of a statistical-mechanical system?