SISSA Digital Library, Institutional Research Information System
For information, contact sdl@sissa.it

Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation / Lee, S.; Mannelli, S. S.; Clopath, C.; Goldt, S.; Saxe, A. - 162 (2022), pp. 12455-12477. (Paper presented at the International Conference on Machine Learning, held in Baltimore, Maryland, USA, 17-23 July 2022).

Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation

Lee S.; Mannelli S. S.; Clopath C.; Goldt S.; Saxe A.

2022-01-01

Abstract

Continual learning (learning new tasks in sequence while maintaining performance on old tasks) remains particularly challenging for artificial neural networks. Surprisingly, the amount of forgetting does not increase with the dissimilarity between the learned tasks, but appears to be worst in an intermediate similarity regime. In this paper we theoretically analyse both a synthetic teacher-student framework and a real data setup to provide an explanation of this phenomenon, which we name the Maslow's hammer hypothesis. Our analysis reveals a trade-off between node activation and node re-use that results in the worst forgetting in the intermediate regime. Using this understanding, we reinterpret popular algorithmic interventions for catastrophic interference in terms of this trade-off, and identify the regimes in which they are most effective.

Year: 2022
Volume title: Proceedings of Machine Learning Research
Series: Proceedings of Machine Learning Research
Volume number: 162
First page: 12455
Last page: 12477
URL: https://arxiv.org/abs/2205.09029
Publisher: ML Research Press
All authors: Lee, S.; Mannelli, S. S.; Clopath, C.; Goldt, S.; Saxe, A.
Appears in categories: 4.1 Contribution in Conference proceedings

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11767/135773
