Shape Processing Strategies Underlying Visual Object Recognition

Large variations in object appearance, such as changes in pose or size, have little effect on object recognition in humans and some non-human species, yet the computational mechanisms underlying such invariant shape processing are not fully understood. Most studies aimed at understanding higher-level vision and shape/object processing use monkeys to investigate the neural underpinnings of these phenomena, in spite of the limited range of experimental approaches available in this species. Rodents, on the other hand, are the animal model of choice in many neuroscience sub-disciplines, because of the growing number of powerful experimental tools that have become available in recent years. However, the evidence for invariant object recognition and, more generally, higher-level visual processing in rodents is still limited and under debate. In this thesis, to show how the complexity of visual shape processing in rats compares with that of humans, we uncovered the perceptual strategies underlying an invariant visual object recognition task in these two species. To characterize the visual recognition strategies, we applied an image masking method, known as the Bubbles method, that revealed the diagnostic features used by the observers to discriminate two objects across a range of sizes, positions, in-depth rotations and in-plane rotations. An ideal observer analysis was also carried out to compare the diagnostic features obtained for humans and rats with those obtained for a simulated observer that has full access to the pictorial discriminatory information contained in the different views of the two target objects. We also investigated to what extent, for both rats and humans, the diagnostic object features were preserved across views, thus touching on a long-standing debate between view-dependent and view-invariant models of object recognition.
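To make the image masking idea concrete, the following is a minimal sketch of how a Bubbles-style mask could be generated and applied. It assumes Gaussian apertures at uniformly random positions, which is a common way to build such masks; the function names, the number of bubbles and the aperture width `sigma` are illustrative assumptions and are not taken from the thesis.

```python
import math
import random


def bubbles_mask(height, width, n_bubbles, sigma, rng):
    """Return a height x width opacity mask in [0, 1] built from Gaussian apertures.

    All parameters are hypothetical; the thesis abstract does not specify them.
    """
    # Draw one random center per bubble.
    centers = [(rng.uniform(0, height - 1), rng.uniform(0, width - 1))
               for _ in range(n_bubbles)]
    mask = []
    for r in range(height):
        row = []
        for c in range(width):
            # Sum the Gaussian windows; clip so overlapping bubbles saturate at 1.
            total = sum(
                math.exp(-((r - cr) ** 2 + (c - cc) ** 2) / (2 * sigma ** 2))
                for cr, cc in centers
            )
            row.append(min(1.0, total))
        mask.append(row)
    return mask


def apply_mask(image, mask):
    """Show the image only through the bubbles; fully masked pixels go to 0."""
    return [[p * m for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]
```

On each trial a fresh mask would be drawn, so that over many trials every image region is sometimes visible and sometimes hidden; relating visibility to the observer's accuracy is what exposes the diagnostic regions.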
Based on the diagnostic features obtained using the standard analysis of the Bubbles method, we found that rat recognition relied on combinations of multiple features that were mostly preserved across the transformations the objects underwent, and largely overlapped with the features that a simulated ideal observer deemed optimal to accomplish the discrimination task. These results indicate that rats are able to process and efficiently use shape information in a way that is largely tolerant to variation in object appearance. The diagnostic features found for humans partially overlapped with those found for rats for one object. However, human and rat recognition strategies differed substantially in the case of the other object. Moreover, the human recognition strategy was not correlated with that of the ideal observer; rather, for many object views, it was significantly anti-correlated. This can be attributed to the fact that human observers relied on the boundaries and coterminations of the objects as diagnostic features, while rats and the ideal observer mainly relied on the bulk of the objects’ structural parts. Finally, the human strategy remained largely stable across transformations, with the exception of extreme in-depth rotations: in that case, due to drastic changes in the appearance of the objects’ main diagnostic features, the observers changed their perceptual strategy to recognize the object. The standard analysis for finding diagnostic shape features using the Bubbles method is based on the assumption of an underlying linear observer model, i.e., an observer that performs a weighted sum of the evidence provided, independently, by each pixel in an image to detect the presence of a given object in that image.
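The linear observer assumption described above can be illustrated with a small simulation. The sketch below is not the thesis's analysis: it assumes a toy one-dimensional "image" of six pixels, binary visibility masks, Gaussian decision noise, and a simple saliency estimate (how much more often a pixel was visible on correct trials than on incorrect ones). All names, weights and parameters are hypothetical.

```python
import random


def simulate_linear_observer(weights, n_trials, threshold, noise_sd, rng):
    """Simulate trials of an observer that thresholds a noisy weighted sum."""
    trials = []
    for _ in range(n_trials):
        # Each pixel is independently visible (1) or masked (0).
        mask = [1 if rng.random() < 0.5 else 0 for _ in weights]
        # Linear model: evidence is the weighted sum of visible pixels plus noise.
        evidence = sum(w * m for w, m in zip(weights, mask)) + rng.gauss(0.0, noise_sd)
        trials.append((mask, evidence > threshold))
    return trials


def saliency_map(trials, n_pixels):
    """Per pixel: P(visible | correct) - P(visible | incorrect)."""
    correct = [m for m, ok in trials if ok]
    wrong = [m for m, ok in trials if not ok]

    def mean_visibility(masks, i):
        # Guard against an empty group to avoid division by zero.
        return sum(m[i] for m in masks) / max(1, len(masks))

    return [mean_visibility(correct, i) - mean_visibility(wrong, i)
            for i in range(n_pixels)]


# Hypothetical observer: pixels 0 and 1 carry evidence, the rest carry none.
weights = [0.6, 0.4, 0.0, 0.0, 0.0, 0.0]
trials = simulate_linear_observer(weights, 4000, 0.5, 0.2, random.Random(0))
saliency = saliency_map(trials, len(weights))
```

Under these assumptions the estimated saliency is ordered like the observer's weights, which is why this kind of analysis recovers diagnostic regions for a linear observer. It says nothing, however, about whether two regions must be visible together.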
One problem with this assumption is that it does not reveal how multiple, distinct object features may interact to drive the recognition behavior of an observer, i.e., whether such features need to be visible simultaneously for the observer to correctly identify the object. As a computational contribution of this thesis, we present a novel, information-theoretic analysis to uncover non-linear interactions between diagnostic features extracted with the Bubbles method. After formulating the problem in a mathematical framework, we carried out two simulations of Bubbles experiments, in which the simulated observers performed either an AND-like or an OR-like interaction between two predefined features of an object in order to recognize it. These simulations showed that our information-theoretic analysis successfully retrieved the simulated non-linear interaction between pairs of diagnostic object features. To summarize, our experimental results provide the most compelling evidence to date that rats process visual objects through rather sophisticated, shape-based, transformation-tolerant mechanisms. As such, given the powerful array of experimental approaches that are available in rats, this model system will likely become a valuable tool in the study of the neuronal mechanisms underlying object vision. We also concluded that human observers generalize their recognition to novel views of visual objects largely by relying on the same patterns of diagnostic features that they used to discriminate previously learned views of the same objects. Our results are consistent with both view-based and view-invariant theories of object recognition, because we found the same diagnostic features to be preserved in most object views, but we also observed the emergence of new features for particularly challenging views.
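The AND-like and OR-like simulated observers mentioned above can be sketched as follows. The abstract does not state which information-theoretic quantity the thesis uses, so this example assumes one common choice: the information that the pair of features jointly carries about the response, minus the sum of the information carried by each feature alone (called `synergy` here). Feature visibility is assumed to be independent with probability 0.5, and every name is illustrative.

```python
import math
import random
from collections import Counter


def mutual_information(pairs):
    """Empirical mutual information in bits between the two items of each pair."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in joint.items())


def simulate(rule, n_trials, rng):
    """Each trial: two features are independently visible; rule gives the response."""
    trials = []
    for _ in range(n_trials):
        f1 = rng.random() < 0.5
        f2 = rng.random() < 0.5
        trials.append((f1, f2, rule(f1, f2)))
    return trials


def synergy(trials):
    """Joint information about the response minus the two single-feature terms."""
    joint = mutual_information([((f1, f2), r) for f1, f2, r in trials])
    first = mutual_information([(f1, r) for f1, _, r in trials])
    second = mutual_information([(f2, r) for _, f2, r in trials])
    return joint - first - second


def p_correct_single_feature(trials):
    """Fraction of correct responses when exactly one feature is visible."""
    single = [r for f1, f2, r in trials if f1 != f2]
    return sum(single) / len(single)


and_trials = simulate(lambda a, b: a and b, 4000, random.Random(1))
or_trials = simulate(lambda a, b: a or b, 4000, random.Random(2))
```

With this particular measure, both simulated observers show positive synergy, so it flags the presence of a non-linear interaction but does not by itself separate AND from OR; in this sketch the two are told apart by accuracy on trials where only one feature is visible, which is zero for the AND observer and one for the OR observer.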
Notably, our novel information-theoretic analysis of feature interactions has the potential to test these theories even further by measuring whether the relationship between diagnostic features is preserved across transformations. Finally, by uncovering the human invariant recognition strategy under a variety of viewing conditions, our work provides new constraints for visual cortex models and neuromorphic machine vision systems, in terms of human-like shape processing mechanisms.


2013-01-31



Date of defense: 31 January 2013
Author (unrecognized): Alemi-Neissi, Alireza
Appears in types: 8.1 PhD thesis

Files in this record:

File	Access	Type	License	Size	Format
1963_6435_Thesis_library.pdf	Open Access from 01/02/2016	Thesis	Not specified	9.87 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11767/4173


SISSA Digital Library, Institutional Research Information System
For information, contact sdl@sissa.it