The impact of memory on learning sequence-to-sequence tasks / Seif, Alireza; Loos, Sarah A M; Tucci, Gennaro; Roldán, Édgar; Goldt, Sebastian. - In: MACHINE LEARNING: SCIENCE AND TECHNOLOGY. - ISSN 2632-2153. - 5:1(2024), pp. 1-16. [10.1088/2632-2153/ad2feb]

The impact of memory on learning sequence-to-sequence tasks

Seif, Alireza; Loos, Sarah A M; Tucci, Gennaro; Roldán, Édgar; Goldt, Sebastian

2024-01-01

Abstract

The recent success of neural networks in natural language processing has drawn renewed attention to learning sequence-to-sequence (seq2seq) tasks. While there exists a rich literature that studies classification and regression tasks using solvable models of neural networks, seq2seq tasks have not yet been studied from this perspective. Here, we propose a simple model for a seq2seq task that has the advantage of providing explicit control over the degree of memory, or non-Markovianity, in the sequences: the stochastic switching-Ornstein-Uhlenbeck (SSOU) model. We introduce a measure of non-Markovianity to quantify the amount of memory in the sequences. For a minimal auto-regressive (AR) learning model trained on this task, we identify two learning regimes corresponding to distinct phases in the stationary state of the SSOU process. These phases emerge from the interplay between two different time scales that govern the sequence statistics. Moreover, we observe that while increasing the integration window of the AR model always improves performance, albeit with diminishing returns, increasing the non-Markovianity of the input sequences can improve or degrade its performance. Finally, we perform experiments with recurrent and convolutional neural networks that show that our observations carry over to more complicated neural network architectures.

Year	2024
Journal	MACHINE LEARNING: SCIENCE AND TECHNOLOGY
Volume	5
Issue	1
First page	1
Last page	16
Article number	015053
DOI	https://dx.doi.org/10.1088/2632-2153/ad2feb
Full text via DOI	https://doi.org/10.1088/2632-2153/ad2feb
All authors	Seif, Alireza; Loos, Sarah A M; Tucci, Gennaro; Roldán, Édgar; Goldt, Sebastian
Appears in types	1.1 Journal article

Files in this item:

File	Size	Format
Seif_2024_Mach._Learn.__Sci._Technol._5_015053.pdf (open access; type: publisher's version (PDF); license: Creative Commons)	1.13 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11767/143212
