Date and Time
November 10, 2025 (Mon), 14:00 - 15:00 (JST)
Speaker
  • Masato Taki (Associate Professor, Graduate School of Artificial Intelligence and Science, Rikkyo University)
Language
English
Host
Lingxiao Wang

Large language models such as ChatGPT are based on deep learning architectures known as Transformers. Owing to their remarkable performance and broad applicability, Transformers have become indispensable in modern AI development. However, it remains an open question why Transformers perform so well and what their distinctive structure essentially means. One possible clue lies in the mathematical correspondence between Hopfield Networks and Transformers.
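
For readers less familiar with this correspondence, the key identity reviewed in Ref. 2 below can be sketched as follows. The notation is schematic and omits the separate key/value projections used in practice:

```latex
% Requires amsmath. Sketch of the modern-Hopfield / attention correspondence (Ref. 2).
% Stored patterns X = (x_1, ..., x_N), query state \xi, inverse temperature \beta:
\[
  \xi^{\mathrm{new}} \;=\; X \,\operatorname{softmax}\!\bigl(\beta\, X^{\top} \xi\bigr).
\]
% Identifying the stored patterns with keys and values, the state with a query,
% and setting \beta = 1/\sqrt{d_k}, this update reproduces Transformer attention:
\[
  \operatorname{Attention}(Q, K, V) \;=\; \operatorname{softmax}\!\Bigl(\tfrac{Q K^{\top}}{\sqrt{d_k}}\Bigr) V .
\]
```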

In this talk, I will first introduce the major developments over the past decade that have significantly increased the storage capacity of Hopfield Networks. I will then review the theoretical correspondence between Hopfield Networks and Transformers. Building on this background, I will present our recent findings: by extending this correspondence to include the hidden-state dynamics of Hopfield Networks, we discovered a new class of Transformers that can recursively propagate attention-score information across layers. Furthermore, we found, both theoretically and experimentally, that this new Transformer architecture resolves the “rank collapse” problem often observed in conventional multi-layer attention. As a result, when applied to language generation and image recognition tasks, it achieves performance surpassing that of existing Transformer-based models.
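
The rank-collapse phenomenon itself can be illustrated independently of the proposed architecture. Below is a minimal NumPy sketch (not code from Ref. 1) of stacked softmax self-attention without skip connections or feed-forward blocks; the `rank1_residual` metric and the random projections are illustrative choices. The residual should shrink rapidly toward zero as layers are stacked, meaning all token representations collapse onto an essentially one-dimensional subspace.

```python
import numpy as np

def softmax(scores):
    # Row-wise softmax with max-subtraction for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

def attention_layer(x, rng, d_k=32):
    # One self-attention layer with random projections, no skip connection
    # and no MLP -- the setting in which rank collapse is usually analyzed.
    d = x.shape[-1]
    w_q = rng.normal(scale=d ** -0.5, size=(d, d_k))
    w_k = rng.normal(scale=d ** -0.5, size=(d, d_k))
    w_v = rng.normal(scale=d ** -0.5, size=(d, d))
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    return softmax(q @ k.T / np.sqrt(d_k)) @ v

def rank1_residual(x):
    # Relative distance of x from its best rank-1 approximation,
    # measured via singular values: 0 means fully collapsed.
    s = np.linalg.svd(x, compute_uv=False)
    return np.sqrt((s[1:] ** 2).sum()) / np.sqrt((s ** 2).sum())

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 64))  # 16 tokens, 64-dimensional embeddings
for layer in range(8):
    x = attention_layer(x, rng)
    print(f"layer {layer + 1}: rank-1 residual = {rank1_residual(x):.3e}")
```

In standard Transformers, skip connections and MLP blocks are what keep this residual from vanishing; the claim of the talk is that the proposed attention-score propagation addresses the same problem within the attention mechanism itself.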

References

  1. Tsubasa Masumura, Masato Taki, On the Role of Hidden States of Modern Hopfield Network in Transformer, NeurIPS (2025)
  2. Hubert Ramsauer et al., Hopfield Networks is All You Need, ICLR (2021), arXiv:2008.02217
  3. Dmitry Krotov, John Hopfield, Large Associative Memory Problem in Neurobiology and Machine Learning, ICLR (2021), arXiv:2008.06996

This is a closed event for researchers and is not open to the general public. If you are not a member or an affiliated party and would like to attend, please contact us via the inquiry form. Please note that, depending on the wishes of the speaker or the host, your request to attend may not be granted.

Inquire about this event