Statistical Physics of In-Context Learning in Transformer
- Date
- September 16 (Tue) 15:00 - 16:30, 2025 (JST)
- Speaker
- Haiping Huang (Professor, School of Physics, Sun Yat-sen University, China)
- Venue
- via Zoom
- Language
- English
- Host
- Lingxiao Wang
Pre-trained large models can learn from examples: they infer patterns and generalize from a small number of examples without retraining. How does this ability emerge? This talk proposes a mapping of the large-model pre-training process onto a physical model. It finds that training corresponds to spin condensation, that the unique energy ground state determines the ability to generalize from examples, and that the diversity of the training data is a key element in algorithm design. The study also suggests that the reasoning process of large models may be fundamentally different from human thinking.
References
- Yuhao Li, Ruoran Bai, Haiping Huang, Spin glass model of in-context learning, Phys. Rev. E 112, L013301 (2025), doi: 10.1103/5l5m-4nk5, arXiv: 2408.02288
- Haiping Huang, Statistical Mechanics of Neural Networks (2022), doi: 10.1007/978-981-16-7570-6
This is a closed event for scientists. Non-scientists are not allowed to attend. If you are not a member or related person and would like to attend, please contact us using the inquiry form. Please note that the event organizer or speaker must authorize your request to attend.