OFTNAI

The physical variability of speech, combined with its perceptual constancy, makes speech recognition a challenging task. The human auditory brain, however, performs speech recognition effortlessly. How does the auditory brain learn to robustly recognise auditory objects, such as naturally spoken words, in a transform-invariant manner despite the huge variability within the raw auditory wave inputs? Which areas within the extensive auditory brain hierarchy are important for this task? What is the simplest neural code sufficient to represent the learnt auditory objects within the output layers of the auditory brain hierarchy? Does the brain use rate or temporal encoding to represent auditory objects? These are some of the questions that we are trying to address.

Neurophysiological studies have provided insights into the architecture and response properties of different areas within the auditory brain hierarchy. However, the precise computational mechanisms used to learn stimulus-specific, transform-invariant representations of auditory objects, such as phonemes or words, are currently unknown. To understand these computational mechanisms, we have developed an unsupervised spiking neural network model grounded in the known neurophysiology of the auditory brain. This model can be used to generate neurophysiologically testable hypotheses about the mechanisms used by the brain to perform auditory object recognition. We are working closely with the Oxford Auditory Neuroscience Group to test the hypotheses generated by the model. Furthermore, the model can serve as a prototype for a novel approach to automatic speech recognition (ASR) which, owing to its grounding in the neurophysiology of the auditory brain, should be able to cope with robustness problems that current ASR systems struggle to solve, such as speaker variability and speech recognition in noise.

Audition Publications

Title: Harmonic Training and the formation of pitch representation in a neural network model of the auditory brain
Date: 2016
Authors: Ahmad, N., Higgins, I., Walker, K.M.M. and Stringer, S.M.
Journal: Frontiers in Computational Neuroscience
Volume/Pages: Vol 10, Article 24