Deep Implicit Attention: A Mean-Field Theory Perspective on Attention Mechanisms
Can we model attention as the collective response of a statistical-mechanical system?