Current research in the lab covers three main areas:
- Basic research into the neural mechanisms underlying object recognition, attention, and high-level plasticity, employing a combination of systems-level modeling, fMRI, EEG, and behavioral techniques;
- Biomedical applications, where the goal is to uncover the neural bases of behavioral differences (in particular involving object recognition) in mental disorders;
- Augmented Cognition, focusing on enhancing human cognitive abilities through hybrid brain-computer systems that leverage our mechanistic understanding of the neural bases of cognitive processes to fully exploit the brain's information processing potential.
The unifying element among the different research projects is that they are all based on the same mechanistic and quantitative systems-level model of object recognition in cortex, allowing us to make optimal use of potential synergies. The model not only guides the experiments but in turn evolves in response to the experimental results. More information on the model, including source code and relevant publications, can be found on the model homepage.
Georgetown University recently featured the research done in the lab.
Object recognition is a fundamental cognitive task that is performed effortlessly countless times every day, such as when gauging a conversation partner's facial expression, looking for a friend's face in a crowd, or reading the words on this web page. All these tasks depend on the visual system's ability to recognize specific objects despite significant variations in their appearance due to changes in lighting, position, viewpoint, or the simultaneous presence of other objects. Importantly, the visual system is not hard-wired but can be trained for specific tasks, e.g., detecting missile launchers in satellite images or tumors in X-ray films. Despite the apparent ease with which we see, visual recognition is widely acknowledged to be a very difficult computational problem.

From a biological systems perspective, visual recognition involves several levels of understanding, from the computational level, through the levels of cellular and biophysical mechanisms and neuronal circuits, up to the level of behavior. Computational approaches to visual recognition are becoming increasingly important for integrating data from different experiments and levels of description (such as electrophysiology, brain imaging, and behavior) into one coherent, quantitative framework that can then be used to provide rigorous hypotheses for further experiments.
We are now applying our model of object recognition in cortex to study how visual experience and training on specific tasks shape the brain's representation of the external world and its object recognition capabilities, and how the visual system can successfully recognize objects, even in the presence of interfering stimuli. In particular, the model is being used to provide detailed hypotheses on how training on specific object recognition tasks (ranging from the discrimination of novel stimuli to categorization and object recognition in visual clutter) can modify processing at different levels of the visual system, and how these changes are related to improvements in behavioral performance. This leads to a set of hypotheses that are tested with human volunteers in a series of behavioral, brain imaging (fMRI), and electroencephalography (EEG) experiments, using the same stimuli and tasks as in the simulations. Importantly, simulations and experiments are tightly integrated so that experimental results from simpler tasks can be used to refine the model, which can then be used to provide more specific hypotheses for more complex tasks. The results of this research will be relevant for the design of machine vision systems in artificial intelligence that better mimic how humans see (for recent results, see this paper in IEEE PAMI), for the development of human-machine interfaces that optimally leverage the brain's ability to process visual information, and for applications involving human training on object recognition tasks ranging from baggage screening to satellite image analysis.
In a new line of research, we are exploring the neural mechanisms underlying human auditory object recognition. In particular, given the similarity in computational requirements as well as supporting experimental data, we are exploring the hypothesis that the insights gained in understanding the neural bases of visual object recognition can be leveraged for the study of auditory object recognition.
The integrative model-based approach is of interest not only for basic research but also for biomedical applications, in particular the study of mental disorders, where the mechanistic link between behavioral differences and differences at the neural level is often poorly understood. Correspondingly, a second area of emphasis in the lab is the investigation of the neural bases of object recognition differences in mental disorders such as autism, as a window into systems-level neural processing differences in affected individuals. For example, many studies have shown that individuals with Autism Spectrum Disorder (ASD) differ in their face perception abilities from control subjects. While our understanding of the network of brain areas underlying human face perception has advanced considerably, conventional models are still largely qualitative and treat processing within specific brain areas as "black boxes". To understand the neural causes of the varied effects on object recognition capabilities associated with many neurological disorders, and to identify targets for therapeutic intervention, a quantitative and mechanistic model of the neural computations underlying object recognition is required that rigorously links visual stimuli to neural activation and behavior. Funded by an R01 from NIMH, we are currently developing this novel approach (in collaboration with researchers at Children's National Medical Center in Washington, DC, at Massachusetts General Hospital, and at the University of British Columbia), building a computational model of face processing to compare processing in control subjects and in individuals with autism at the level of neurocomputational mechanisms. We have recently shown (Jiang et al., Neuron, 2006) the promise of this approach by using the model to quantitatively link neuronal tuning, fMRI, and face discrimination behavior in control subjects.
This was an especially interesting result, as the neural mechanisms underlying the representation and recognition of faces in the human brain have been the subject of much controversy. While neurophysiology experiments had provided evidence for a "simple-to-complex" processing model based on a hierarchy of increasingly complex image features, behavioral and fMRI data had been interpreted to suggest that this model is insufficient and that face discrimination requires additional, face-specific neural mechanisms. Our study showed that a "simple-to-complex" model can in fact quantitatively account for both human face discrimination performance and functional imaging data. This challenges the accepted wisdom regarding special facial representation strategies in the brain and suggests that the same neural mechanisms might underlie the recognition of different object classes, with important implications for the possible remediation of object recognition deficits.
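To give a flavor of the "simple-to-complex" architecture described above, here is a minimal, purely illustrative sketch (not the lab's actual model; the function names, tuning rule, and toy numbers are our own assumptions): alternating layers of template-matching "S" units and MAX-pooling "C" units, where pooling over positions yields position-invariant feature detectors.

```python
import math

def s_layer(patches, templates, sigma=1.0):
    # Template matching: each S unit responds with Gaussian tuning to the
    # similarity between a local input patch and its preferred template.
    return [[math.exp(-sum((x - t) ** 2 for x, t in zip(patch, tmpl))
                      / (2 * sigma ** 2))
             for tmpl in templates]
            for patch in patches]

def c_layer(s_responses):
    # MAX pooling: each C unit takes the maximum response of the S units
    # tuned to the same template across all positions, yielding a
    # position-invariant feature detector.
    n_templates = len(s_responses[0])
    return [max(row[k] for row in s_responses) for k in range(n_templates)]

# Toy input: two patch positions, one of which exactly matches the template.
template = [1.0, 0.5]
print(c_layer(s_layer([[0.0, 0.0], [1.0, 0.5]], [template])))  # [1.0]
print(c_layer(s_layer([[1.0, 0.5], [0.0, 0.0]], [template])))  # [1.0]
```

Because the C unit takes the maximum over positions, its response is identical no matter where in the input the matching patch appears, which is a toy version of the position invariance the cortical hierarchy is proposed to achieve.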
We have been involved in a number of DARPA- and DOD-sponsored efforts to translate our model-based understanding of human object recognition into hybrid brain-machine object detection systems. Specifically, we are developing presentation and task paradigms as well as single-trial neural data analysis techniques to optimize neural signals (measured via EEG) and image throughput for rapid neurally based object detection. We employ a combination of prior information regarding useful features, based on our model of object recognition, together with automatic feature selection and information retrieval techniques based on linear algebra. For instance, we can currently obtain classifier detection ROC areas of 0.92 for a single target embedded in streams presented at rates of 12 images/second (see Figure). In addition, we can achieve ROC areas of 0.98 when testing our classifier on the subset of trials that the subject answered correctly. Visual paradigm development is directed at designing paradigms that optimally exploit the brain's fast temporal and parallel processing capabilities, which current visual presentation paradigms leave untapped. A new exploratory focus of our work is to apply similar techniques to accelerate the learning of visual tasks. We are further making use of the temporal resolution of the EEG data to extend our model of object recognition to the temporal domain and to study the neural mechanisms that translate sensory input into a behavioral decision.
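The ROC areas quoted above summarize single-trial detection performance: the area under the ROC curve equals the probability that a randomly chosen target trial receives a higher classifier score than a randomly chosen non-target trial. As an illustration only (the scores below are made up, not lab data, and this is not the lab's analysis pipeline), that area can be computed directly from classifier scores via the Mann-Whitney rank statistic:

```python
def roc_auc(target_scores, nontarget_scores):
    # Area under the ROC curve via the rank-sum (Mann-Whitney) statistic:
    # the fraction of (target, non-target) trial pairs in which the target
    # trial received the higher classifier score (ties count as half).
    wins = 0.0
    for t in target_scores:
        for n in nontarget_scores:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(target_scores) * len(nontarget_scores))

# Hypothetical single-trial classifier scores from a rapid image stream:
targets = [0.9, 0.8, 0.7, 0.4]          # trials containing the target
nontargets = [0.6, 0.5, 0.3, 0.2, 0.1]  # distractor trials
print(roc_auc(targets, nontargets))     # prints 0.9
```

An ROC area of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of target and non-target trials, which is what makes it a convenient single-number summary for comparing classifiers across presentation rates.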