Intrinsic dimension of data representations in deep neural networks / Ansuini, A.; Laio, A.; Macke, J. H.; Zoccolan, D. - 32:(2019). (Paper presented at the 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019, held in Vancouver in 2019).

Intrinsic dimension of data representations in deep neural networks

Ansuini A.;Laio A.;Macke J. H.;Zoccolan D.

2019-01-01

Abstract

Deep neural networks progressively transform their inputs across multiple processing layers. What are the geometrical properties of the representations learned by these networks? Here we study the intrinsic dimensionality (ID) of data representations, i.e., the minimal number of parameters needed to describe a representation. We find that, in a trained network, the ID is orders of magnitude smaller than the number of units in each layer. Across layers, the ID first increases and then progressively decreases in the final layers. Remarkably, the ID of the last hidden layer predicts classification accuracy on the test set. These results are found neither with linear dimensionality estimates (e.g., principal component analysis) nor in representations that have been artificially linearized. Nor are they found in untrained networks or in networks trained on randomized labels. This suggests that neural networks that can generalize are those that transform the data into low-dimensional, but not necessarily flat, manifolds.
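
The contrast drawn in the abstract between an intrinsic-dimension estimate and a linear (PCA-based) estimate can be illustrated with a minimal sketch. This record does not state which estimator the authors used, so everything below is an assumption made for illustration: the nonlinear estimate is the maximum-likelihood form of a two-nearest-neighbour (TwoNN-style) estimator, the linear estimate is the number of principal components needed to reach 90% of the variance, and the data are a synthetic two-parameter curved surface embedded in 20 dimensions rather than network activations. The function names `twonn_id` and `pca_dim` are hypothetical.

```python
import numpy as np


def twonn_id(points):
    """TwoNN-style maximum-likelihood estimate of intrinsic dimension.

    Assumption for illustration: the ratio mu = r2 / r1 of second- to
    first-nearest-neighbour distances follows a Pareto law with exponent d,
    whose maximum-likelihood estimate is d = N / sum(log(mu)).
    """
    sq_norms = np.sum(points ** 2, axis=1)
    # Squared Euclidean distances via the Gram matrix; clip round-off negatives.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    np.maximum(sq_dists, 0.0, out=sq_dists)
    np.fill_diagonal(sq_dists, np.inf)  # a point is not its own neighbour
    # After partitioning, columns 0 and 1 hold the two smallest values per row.
    nearest = np.partition(sq_dists, 1, axis=1)[:, :2]
    r1 = np.sqrt(nearest[:, 0])
    r2 = np.sqrt(nearest[:, 1])
    return len(points) / np.sum(np.log(r2 / r1))


def pca_dim(points, variance_fraction=0.9):
    """Linear estimate: components needed to reach the given variance fraction."""
    centered = points - points.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    explained = np.cumsum(singular_values ** 2) / np.sum(singular_values ** 2)
    return int(np.searchsorted(explained, variance_fraction) + 1)


# Synthetic data (hypothetical): two latent angles mapped to 20 coordinates,
# giving a curved two-parameter surface in a 20-dimensional space.
rng = np.random.default_rng(0)
n_points, n_freqs = 1500, 5
u, v = rng.uniform(0.0, 2.0 * np.pi, size=(2, n_points))
freqs = np.arange(1, n_freqs + 1)
features = [f(k * t) for t in (u, v) for k in freqs for f in (np.sin, np.cos)]
data = np.stack(features, axis=1)  # shape (n_points, 4 * n_freqs)

nonlinear_id = twonn_id(data)
linear_dim = pca_dim(data)
print(f"TwoNN-style ID estimate: {nonlinear_id:.2f}")
print(f"PCA components for 90% variance: {linear_dim}")
```

On data of this kind the neighbour-based estimate should stay close to the number of latent parameters, while the PCA count is inflated by curvature. That is the sense in which a linear estimate can miss a low-dimensional but non-flat manifold.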

Year	2019
Volume title	Advances in Neural Information Processing Systems
Series	Advances in Neural Information Processing Systems
Volume number	32
Publisher	Neural Information Processing Systems Foundation
All authors	Ansuini, A.; Laio, A.; Macke, J. H.; Zoccolan, D.
Appears in types:	4.1 Contribution in Conference proceedings

Files in this item:

File	Size	Format
Ansuini et al 2019 Neurips.pdf (open access; description: main article; type: publisher's version (PDF); license: not specified)	1.46 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11767/116573
