A topological description of loss surfaces based on Betti Numbers / Bucarelli, Maria Sofia; D'Inverno, Giuseppe Alessio; Bianchini, Monica; Scarselli, Franco; Silvestri, Fabrizio. - In: NEURAL NETWORKS. - ISSN 0893-6080. - 178:(2024). [10.1016/j.neunet.2024.106465]

A topological description of loss surfaces based on Betti Numbers

D'Inverno, Giuseppe Alessio
2024-01-01

Abstract

In the context of deep learning models, attention has recently been paid to studying the surface of the loss function in order to better understand training with methods based on gradient descent. This search for an appropriate description, both analytical and topological, has led to numerous efforts to identify spurious minima and characterize gradient dynamics. Our work aims to contribute to this field by providing a topological measure for evaluating loss complexity in the case of multilayer neural networks. We compare deep and shallow architectures with common sigmoidal activation functions by deriving upper and lower bounds on the complexity of their respective loss functions, revealing how that complexity is influenced by the number of hidden units, the number of training samples, and the activation function used. Additionally, we find that certain variations in the loss function or in the model architecture, such as adding an ℓ2 regularization term or introducing skip connections in a feedforward network, do not affect the loss topology in specific cases.
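As a minimal illustration of the kind of quantity the paper bounds (not the authors' own method), the sketch below estimates the zeroth Betti number, i.e. the number of connected components, of a sublevel set of a toy two-parameter loss evaluated on a grid. The loss function, grid, and threshold are all illustrative assumptions; higher-order Betti numbers would require a persistent-homology library such as GUDHI.

import numpy as np
from scipy import ndimage

# Toy two-parameter "loss" with several basins (an illustrative
# stand-in, not one of the network losses analyzed in the paper).
def loss(w1, w2):
    return np.sin(3 * w1) * np.cos(3 * w2) + 0.1 * (w1 ** 2 + w2 ** 2)

# Evaluate the loss on a grid over a patch of the parameter plane.
w = np.linspace(-3.0, 3.0, 400)
W1, W2 = np.meshgrid(w, w)
L = loss(W1, W2)

# The sublevel set {w : loss(w) <= c} as a boolean mask; the number of
# connected components of the mask is the zeroth Betti number beta_0
# of the discretized sublevel set.
c = -0.5
_, beta0 = ndimage.label(L <= c)
print(f"beta_0 of the sublevel set at c = {c}: {beta0}")

Sweeping c through the range of the loss and tracking how beta_0 changes gives a coarse picture of how basins appear and merge, which is the flavor of complexity the paper's upper and lower bounds control.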
Year: 2024
Volume: 178
Article number: 106465
DOI: 10.1016/j.neunet.2024.106465
Publisher version: https://www.sciencedirect.com/science/article/pii/S0893608024003897
Preprint: https://arxiv.org/abs/2401.03824
Authors: Bucarelli, Maria Sofia; D'Inverno, Giuseppe Alessio; Bianchini, Monica; Scarselli, Franco; Silvestri, Fabrizio
Files in this item:
File: 1-s2.0-S0893608024003897-main.pdf
Access: open access
Description: publisher's PDF
Type: Publisher's Version (PDF)
License: Creative Commons
Size: 622.57 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11767/143310