Partition Function

Transformers Are Secretly Collectives of Spin Systems

A statistical mechanics perspective on transformers

Transformers from Spin Models: Approximate Free Energy Minimization

How far can we push the idea of transformers as physical systems?