Performance-driven refactoring of Potts associative memory network model
2015-12-18
Abstract
Neural network simulations have always been a complex computational challenge because they require large amounts of computational and memory resources. Due to the nature of the problem, a high-performance computing approach becomes vital, because the dynamics often involves updating a large network for a large number of time steps. Moreover, the parameter space can be fairly large. An advanced optimization of the single time step is therefore necessary, as well as a strategy to explore the parameter space automatically. This work first examines the original, purely serial code, identifying its bottlenecks and inefficient design choices. After that, several optimization strategies are presented and discussed, exploiting vectorization, efficient memory access, and cache usage. The strategies are presented together with an extensive set of benchmarks and a detailed discussion of the issues encountered. The final part of the work is the design of a high-throughput approach to the parameter sweep, which is necessary to explore the behaviour of the network. This is implemented by means of a task manager that runs simulations from a batch of predefined runs automatically and collects their results. A detailed performance analysis of the task manager is reported. The results show a consistent speed-up for the single-run case and a massive productivity improvement thanks to the task manager. Moreover, the code base has been reorganized to favor extensibility and code reuse, allowing several of the present strategies to be applied to other problems as well.
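The abstract's parameter-sweep component, a task manager that launches a batch of predefined runs and collects their results, can be illustrated with a minimal sketch of that pattern. The sketch below is not the thesis code: the executable name `potts_sim`, the parameter names, and the CSV output layout are all hypothetical assumptions made for illustration.

```python
import csv
import itertools
import subprocess
from pathlib import Path

# Hypothetical parameter grid; names and values are illustrative only.
PARAM_GRID = {
    "num_units": [600, 900],
    "sparsity": [0.1, 0.25],
}

SIMULATOR = "./potts_sim"      # hypothetical simulation executable
RESULTS = Path("results.csv")  # hypothetical results file


def run_batch():
    """Run one simulation per parameter combination and collect results."""
    keys = list(PARAM_GRID)
    with RESULTS.open("w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(keys + ["exit_code", "stdout"])
        # Enumerate every combination of parameter values (the "batch of
        # predefined runs") and launch the simulator once per combination.
        for values in itertools.product(*(PARAM_GRID[k] for k in keys)):
            args = [SIMULATOR] + [f"--{k}={v}" for k, v in zip(keys, values)]
            proc = subprocess.run(args, capture_output=True, text=True)
            # Collect the outcome of each run into a single results table.
            writer.writerow(list(values) + [proc.returncode, proc.stdout.strip()])


if __name__ == "__main__":
    run_batch()
```

In this sketch the runs are executed sequentially; a real high-throughput setup would dispatch them to multiple cores or nodes, but the collect-into-one-table pattern stays the same.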
| File | Size | Format |
|---|---|---|
| 1963_35158_Romor_tesi.pdf (open access; Type: Thesis; License: not specified) | 717.37 kB | Adobe PDF |