Computer modellers within the university research centre are exploring various aspects of visual processing in the brain, including motion detection, face recognition in natural scenes, and the recognition of objects from novel views. Over successive stages, the primate visual system develops neurons that respond with view, size and position invariance to objects or faces. Our models explain how such neurons may develop their firing properties, and hence allow the visual system to recognise objects in natural environments.
This research has direct bearing on understanding disorders of visual perception such as amblyopia, in which one eye suffers reduced vision due to interference during early visual development. Amblyopia is the leading cause of vision loss in persons under 40 years of age. Other disorders include prosopagnosia, in which subjects have difficulty recognising faces, and spatial neglect, in which patients ignore part of their visual field.
Recently, our computer simulations have revealed a powerful new algorithm, Continuous Transformation Learning, that may account for how the brain learns to recognise objects and faces from different viewpoints. This discovery represents a major breakthrough in understanding the operation of the visual system, and should help to guide the treatment of visual disorders arising from developmental problems.
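The core idea behind Continuous Transformation learning is that when a stimulus transforms smoothly, successive views overlap, so the output neuron that has just learned one view also responds to the next, and simple Hebbian learning binds the whole transform onto that neuron. The sketch below illustrates this with a minimal competitive network; all sizes, learning rates, and the Gaussian-bump input are illustrative assumptions, not details of the published model.

```python
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_OUT = 100, 10          # input ring size and competitive output layer (assumed)
SIGMA = 5.0                    # width of the input activity bump (assumed)
positions = np.arange(30, 71)  # one continuous trajectory of the transforming stimulus

def bump(centre):
    """Unit-norm Gaussian activity bump on the input ring."""
    d = np.arange(N_IN) - centre
    x = np.exp(-d**2 / (2 * SIGMA**2))
    return x / np.linalg.norm(x)

# Random feed-forward weights, one unit-norm row per output neuron
W = rng.uniform(size=(N_OUT, N_IN))
W /= np.linalg.norm(W, axis=1, keepdims=True)

LR = 0.5
for epoch in range(2):
    for c in positions:               # stimulus transforms in small steps
        x = bump(c)
        winner = np.argmax(W @ x)     # competition: strongest-responding neuron
        W[winner] += LR * x           # Hebbian update of the winner only
        # (weight growth left unbounded for simplicity; real models bound
        #  or normalise the weights)

# After training, the same neuron responds across the whole transform
winners = [int(np.argmax(W @ bump(c))) for c in positions]
```

Because each small step overlaps the previous one, the neuron that wins the first view keeps winning as the stimulus transforms, and its weights come to span the entire trajectory, so `winners` contains a single neuron index.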
In addition to potential medical benefits, possible engineering applications of this research range from visual control and quality inspection in manufacturing to automated CCTV monitoring. The new Continuous Transformation Learning algorithm may help robots to operate more flexibly in real-world environments by enabling them to recognise objects from different viewpoints.
Case study: Recognising objects from novel views
One of the major challenges in computer vision is recognising objects from viewpoints that have not been encountered during training. Our neural network model of the ventral visual system accomplishes this by first learning, during early visual development, how elemental features in the environment transform across different viewpoints (Stringer, S.M. and Rolls, E.T. (2002). Neural Computation, 14: 2585-2596).
(Left) Architecture of a 4-layer hierarchical neural network model of the ventral visual processing stream. Convergence through the network is designed to provide fourth-layer neurons with information across the entire input retina. (Right) Convergence through successive layers of the visual system.
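The effect of this convergent architecture can be seen with a small calculation: if each neuron receives from a local window of the layer below, the effective receptive field grows layer by layer until top-layer neurons cover the whole retina. The sketch below uses a 1-D layer of 81 neurons and a connection radius of 10 purely for illustration; the published model uses 2-D layers and different sizes.

```python
import numpy as np

N, R = 81, 10            # neurons per layer (1-D for clarity) and connection radius (assumed)
idx = np.arange(N)
# Each neuron in a layer receives from a local window of the layer below
mask = np.abs(idx[:, None] - idx[None, :]) <= R

reach = np.eye(N, dtype=int)     # which retinal inputs each neuron can "see"
rf_sizes = []
for layer in range(4):           # four convergent layers
    reach = (mask.astype(int) @ reach > 0).astype(int)
    rf_sizes.append(int(reach[N // 2].sum()))   # receptive field of the central neuron

# The central neuron's receptive field grows by 2*R inputs per layer:
# rf_sizes == [21, 41, 61, 81], i.e. the fourth layer spans the entire retina
```

With these numbers, four layers of radius-10 convergence are exactly enough for a central fourth-layer neuron to receive information from all 81 retinal inputs.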
Six visual stimuli, each composed of three surface features occurring in three relative positions. Each row shows one of the stimuli rotated through the five views in which it is presented to the network: from left to right, -60 degrees, -30 degrees, 0 degrees (central view), 30 degrees, and 60 degrees. To simulate early visual development, layers 1 and 2 are trained on pairs of surface features across all five views. Then layers 3 and 4 are trained on the complete stimuli at only four of the five views.
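The logic of this two-stage training can be sketched in a few lines: once the early layers provide a view-invariant representation of the features, a simple Hebbian readout trained on only four views classifies the fifth, novel view correctly, because its input representation is identical across views. The sketch below assumes the outcome of stage 1 (a pooling map over views) rather than learning it, and simplifies each stimulus to an assignment of three features to three positions; all names and sizes are illustrative.

```python
import numpy as np
from itertools import permutations

F = P = 3   # three surface features, three relative positions
V = 5       # five rotational views

def raw_input(perm, v):
    """One-hot code over (feature, position, view) for stimulus `perm` at view v."""
    x = np.zeros(F * P * V)
    for pos, feat in enumerate(perm):
        x[(feat * P + pos) * V + v] = 1.0
    return x

# Assumed result of stage 1: a map that pools each (feature, position) pair over views
pool = np.zeros((F * P, F * P * V))
for fp in range(F * P):
    pool[fp, fp * V:(fp + 1) * V] = 1.0

stimuli = list(permutations(range(F)))        # six stimuli
train_views, novel_view = range(V - 1), V - 1 # train on four views, hold one out

# Stage 2: Hebbian association of the invariant representation with a label unit
W = np.zeros((len(stimuli), F * P))
for s, perm in enumerate(stimuli):
    for v in train_views:
        W[s] += pool @ raw_input(perm, v)

# Test at the novel view: count stimuli classified correctly
correct = sum(
    int(np.argmax(W @ (pool @ raw_input(perm, novel_view)))) == s
    for s, perm in enumerate(stimuli)
)
```

All six stimuli are classified correctly at the held-out view, even though the readout never saw it: the pooled representation fed to the readout is the same at every view, so generalisation to the novel view comes for free once the feature layer is invariant.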
Results from the neural network simulation after training. The figure shows the response profiles of a top-layer neuron to the six stimuli across all five views. It can be seen that this cell has learned to respond invariantly to one of the stimuli across all views. The network has learned to discriminate between the six objects from all views, including the novel view not encountered during training.
Trappenberg, T.P., Rolls, E.T. and Stringer, S.M. (2002). Effective size of receptive fields of inferior temporal cortex neurons in natural scenes, Advances in Neural Information Processing Systems, 14: 293-300.
Higgins, I.V. and Stringer, S.M. (2011). The role of independent motion in object segmentation in the ventral visual stream: Learning to recognise the separate parts of the body, Vision Research, 51: 553-562.
Tromans, J.M., Harris, M. and Stringer, S.M. (2011). A computational model of the development of separate representations of facial identity and expression in the primate visual system, PLoS ONE, 6(10): e25616.
Evans, B.D. and Stringer, S.M. (2012). Transformation-invariant visual representations in self-organizing spiking neural networks, Frontiers in Computational Neuroscience, 6: Article 46, 1-19.
Galeazzi, J.M., Mender, B.M.W., Paredes, M., Tromans, J.M., Evans, B.D., Minini, L. and Stringer, S.M. (2013). A self-organizing model of the visual development of hand-centred representations, PLoS ONE, 8(6): e66272.