A distributional simplicity bias in the learning dynamics of transformers / Rende, R.; Gerace, F.; Laio, A.; Goldt, S. - 37:(2024). (Paper presented at The Thirty-eighth Annual Conference on Neural Information Processing Systems, held in Vancouver, Canada, 16 December 2024.)

A distributional simplicity bias in the learning dynamics of transformers

Rende, R.; Gerace, F.; Laio, A.; Goldt, S.
2024-01-01

Abstract

The remarkable capability of over-parameterised neural networks to generalise effectively has been explained by invoking a "simplicity bias": neural networks prevent overfitting by initially learning simple classifiers before progressing to more complex, non-linear functions. While simplicity biases have been described theoretically and experimentally in feed-forward networks for supervised learning, the extent to which they also explain the remarkable success of transformers trained with self-supervised techniques remains unclear. In our study, we demonstrate that transformers, trained on natural language data, also display a simplicity bias. Specifically, they sequentially learn many-body interactions among input tokens, reaching a saturation point in the prediction error for low-degree interactions while continuing to learn high-degree interactions. To conduct this analysis, we develop a procedure to generate "clones" of a given natural language data set, which rigorously capture the interactions between tokens up to a specified order. This approach opens up the possibility of studying how interactions of different orders in the data affect learning, in natural language processing and beyond.
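The central tool described in the abstract is a "clone" data set that reproduces token interactions only up to a fixed order. A minimal sketch of this idea — an illustrative analogue, not the authors' actual construction, with all names hypothetical: sampling from a bigram model fit to a corpus produces a synthetic sequence whose pairwise (order-2) token statistics follow the original, while any higher-order structure is randomized away.

```python
import random
from collections import defaultdict

def fit_bigram(tokens):
    """Count bigram transitions, capturing pairwise (order-2) token statistics."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(tokens, tokens[1:]):
        counts[a][b] += 1
    return counts

def sample_clone(counts, start, length, seed=0):
    """Sample a synthetic sequence ("clone") from the fitted bigram model."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        nxt = counts.get(out[-1])
        if not nxt:  # no observed successor: stop early
            break
        words, weights = zip(*nxt.items())
        out.append(rng.choices(words, weights=weights)[0])
    return out

corpus = "the cat sat on the mat the cat ran".split()
model = fit_bigram(corpus)
clone = sample_clone(model, "the", 8)
print(clone)
```

A clone built this way is constrained only at order 2; comparing a model's prediction error on such clones of increasing order against the original corpus is one conceivable way to probe which interaction orders have been learned.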
2024
Advances in Neural Information Processing Systems 37
37
https://papers.nips.cc/paper_files/paper/2024/hash/ae6c81a39079ddeb88b034b6ef18c7fe-Abstract-Conference.html
https://openreview.net/forum?id=GgV6UczIWM
Rende, R.; Gerace, F.; Laio, A.; Goldt, S.
Files in this record:

15494_A_distributional_simplic.pdf
Access: open access
Type: Post-print
License: Not specified
Size: 972.01 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11767/143213