Together with the supporting theories of Biederman

Human beings are able to perceive the things around them through visual perception. Visual information passes from the retina along the visual pathways of the brain, and people differ in how aware they are of this process. People who are not picture thinkers may not automatically notice how what they perceive shifts as their world changes. When objects are viewed without understanding, the mind will reach for something that it already recognises in order to process what it is viewing. Whatever in our past experience most closely relates to the unfamiliar makes up what we see when we look at things that we do not comprehend (Gibson, 1950; Davey, Albery, Chandler, Field, Jones, Messer, Moore, & Sterling, 2005).

This essay examines whether Marr's (1982) theory, together with the supporting theories of Biederman (1987) and Riddoch and Humphreys (2001), provides both a valid and a complete account of perception. The findings are discussed in light of flaws in the approach of J. J. Gibson (1950, 1966, 1979) and of other related research.

An ecological understanding of perception derived from Gibson's (1950) early work is that of "perception-in-action": the notion that perception is a requisite property of animate action, that without perception action would be unguided, and that without action perception would serve no purpose. Animate actions require both perception and motion, and perception and movement can be described as "two sides of the same coin, the coin is action" (Davey et al 2005).

David Marr (1982) proposed a theory that was specified in sufficient detail to be simulated by a computer. According to Marr, the brain computes depth in three stages. The first is a 2-D primal sketch based on basic sensory information, which is two-dimensional in nature and contains information about lines, corners and regions of similarity (similar areas that therefore probably belong to the same object). The brain then creates a 2.5-D representation, which contains some depth information, such as texture gradients and binocular cues, as well as the orientation of the object. The final stage is a 3-D model, which represents the three-dimensional nature of the objects in the scene (Davey et al 2005).

In addition, the theory held that colour information was processed by a distinct module and need not be involved in obtaining descriptions of the shape of objects and the layout of the environment. Indeed, the modular nature of perception was a fundamental part of Marr's theory.

Marr used evidence from psychology and neurophysiology to build his model. The raw primal sketch stage extracts "features" from the input. There is much physiological and experimental evidence for feature detection in perception, for example the work of Hubel and Wiesel and visual search experiments. The full primal sketch stage uses Gestalt principles such as continuity and proximity to bind features together and produce a specification of the whole form (Davey et al 2005 and Braisby et al 2005).

The final three-dimensional representation is used to search memory for a match. If a match is found the object is recognised; if no match is found the object is not recognised. Marr's model seeks to produce a "view independent" representation of the object. What kind of evidence supports Marr? Support for his work comes from other theories that use his basic ideas and that are themselves supported by evidence from experimental psychology and neuropsychology (Braisby et al 2005).

Marr and Nishihara (1978) also carried out a study of object recognition, seeking to make clear how the shape of a three-dimensional object can be recovered from a two-dimensional representation with added depth information, the 2.5-D sketch. They argued that establishing a central axis is a vital stage in the recognition process. This means that it should be extremely hard to recognise an object if it is also difficult to establish the location of its central axis. A good deal of evidence has supported this claim. In Lawson and Humphreys (1996), participants had to recognise objects (line drawings) that had been rotated. Further support comes from Warrington and Taylor (1978), who found that patients with damage to a particular part of the right hemisphere could recognise objects when they were presented in a characteristic view but not when they were shown in an unusual view. Nevertheless, Marr and Nishihara's theory rests on relatively little empirical evidence about the way humans recognise objects. Its contribution lies in its account of how the recognition of basic objects and classes of objects can proceed on the basis of the image presented, and so of how human object recognition might take place (Braisby et al 2005).

Marr's theory influenced Irving Biederman's (1987) recognition-by-components theory. The basis of this theory is the idea that objects consist of combinations of geons (short for "geometric ions"). Geons are three-dimensional building blocks, such as bricks, cylinders, wedges, cones and their curved-axis counterparts, of which there are 36 in total. In the same sense that letters of the alphabet combine to make words, or features combine to make letters, so geons combine to form objects. The early stages of processing (stages 1 and 2) are like those of Marr. Geons are then extracted from the 2.5-D sketch and bound together into a 3-D structural description (stage 3 in the model). This 3-D structural description is compared with a catalogue of structural descriptions stored in memory (stage 4), and when a match is found recognition is achieved (stage 5). It could be argued that the theory is very similar to feature theory, but translated to three dimensions. The relationship between features and geons is also plausible: geons are composed of features such as edges, angles, curvature and so on. The main problem with recognition-by-components theory is that there are countless objects in the natural world that cannot easily be specified in terms of geons. A plant, the ocean and a beach are examples. Nor does it tackle the question of how we recognise particular examples of an object (the chair in someone's living room) or a face (my friend Sam). It does, nevertheless, go some considerable way towards explaining how we perceive the three-dimensional world (Braisby et al 2005).
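The matching stages described above (comparing a structural description with a stored catalogue and recognising the object when a match is found) can be illustrated with a small Python sketch. This is only a toy illustration and not Biederman's own implementation: the catalogue entries, the relation names, the use of Jaccard overlap as the measure of match and the threshold of 0.5 are all assumptions made for the example.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# A structural description is modelled here as a set of
# (geon, relation, geon) triples. This representation is an assumption.
Relation = Tuple[str, str, str]
Description = FrozenSet[Relation]

# Hypothetical catalogue of stored descriptions (illustrative entries only).
CATALOGUE: Dict[str, Description] = {
    "mug": frozenset({("cylinder", "side-attached", "curved-cylinder")}),
    "bucket": frozenset({("cylinder", "top-attached", "curved-cylinder")}),
    "table": frozenset({("brick", "on-top-of", "cylinder")}),
}


def jaccard(a: Description, b: Description) -> float:
    """Overlap between two descriptions: shared triples / all triples."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def recognise(
    description: Description,
    catalogue: Dict[str, Description] = CATALOGUE,
    threshold: float = 0.5,  # assumed cut-off for "a match is found"
) -> Optional[str]:
    """Return the best-matching stored object, or None if nothing matches."""
    best_name: Optional[str] = None
    best_score = 0.0
    for name, stored in catalogue.items():
        score = jaccard(description, stored)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


if __name__ == "__main__":
    seen = frozenset({("cylinder", "side-attached", "curved-cylinder")})
    print(recognise(seen))
    print(recognise(frozenset({("cone", "on-top-of", "wedge")})))
```

In this sketch the mug and the bucket contain the same two geons and differ only in how the geons are related, which is why the description stores relations as well as parts. A description that overlaps with nothing in the catalogue returns None, corresponding to an object that is not recognised.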

Furthermore, Riddoch and Humphreys' (2001) model of object recognition is based on Marr's stages and has been successfully tested against neuropsychological evidence. The aim of the model was to locate neuropsychological impairments of visual perception within a cognitive model of perception and recognition. The model has been constructed to accommodate neuropsychological data describing the deficits found in different kinds of visual object agnosia. This means that it has been created to fit the data collected from patients with neuropsychological impairments resulting from head injuries, strokes and surgery. For this reason, the structure of the model reflects the fractionation of visual object agnosia into more specific forms such as shape agnosia, integrative agnosia and associative agnosia. One neuropsychological case that Riddoch and Humphreys used to support their theory was a patient (HJA) who had lost the ability to integrate the different components or features of an object, indicating the existence of a stage at which this is done, i.e. the full primal sketch in Marr's terms. A second source of support is the patients described by Warrington and Taylor who cannot recognise objects from unusual views, which indicates that the ability to rotate a three-dimensional representation is a key feature of object recognition (this also supports Marr). Further evidence comes from Milner et al (1991), who describe a patient (DF) who could not make simple judgements of continuation and proximity. Overall, the theory is well supported by neuropsychological evidence.

Gibson's (1950, 1966 and 1979) theory of perception is the complete opposite of those of Marr, Biederman, and Riddoch and Humphreys, because it does not involve any stages of analysis or memory representation. His theory is ecological and direct. Perception is ecological because it uses all the information in the scene being perceived. Objects are not perceived in isolation but in an environmental context which contains information about depth, movement, contrast and function. He argued that perception is direct; in other words, information is picked up from the environment. This means that perception influences our actions without any need for complex cognitive processes. The reason for this is that the information available from environmental stimuli is much richer than had previously been believed (Braisby et al 2005).

In his later work Gibson (1979) took this idea of information being picked up one step further and suggested that the end point of the perception process was not a visual description of the surrounding world, but rather that objects directly afforded their use. The affordance of an object is the function it offers itself for: whether it can be grasped, sat on or climbed up.

However, Gibson makes two claims regarding affordances that are rather harder to accept and have proved to be far more controversial. First, he states that affordances act as a bridge between perception and action and do not involve the intervention of any cognitive processes. He is saying that, from the structure of the optic array, the observer can interact with surfaces and objects in the environment directly through affordances. Second, there is no role for memory in perception, as observers do not have to consult their prior experience in order to be able to interact with the world around them (Sterling, in preparation).

Even though Gibson's theory does not enlighten us as to the nature of the cognitive processes involved in perception, it has been extremely influential, and researchers in perception still need to bear in mind his criticism of the laboratory approach, which makes use of artificial stimuli.

One criticism of Gibson's approach is that it does not explain in sufficient detail how information is picked up from the environment. To address this problem, a theory was needed that attempted to explain exactly how the brain is able to take the information sensed by the eyes and turn it into an accurate internal representation of the surrounding world (Sterling, in preparation).

There are some similarities between Marr and Gibson: like Gibson, Marr suggested that the information from the senses is sufficient to allow perception to occur. Unlike Gibson, however, Marr adopted an information-processing approach in which the processes responsible for analysing the retinal image were central. Marr's theory is as a result strongly bottom-up, in that it sees the retinal image as the starting point of perception and explores how this image might be analysed in order to produce a description of the environment. This meant that, unlike Gibson, who saw action as the end point of perception, Marr concentrated on the perceptual processes involved in object recognition (Braisby et al 2005).

In addition, Braisby et al (2005) describe evidence from cognitive neuroscience suggesting that perception for action and perception for recognition may be compatible, dual functions of perception. There is also evidence suggesting that different visual pathways serve different functions in object recognition.

There are two or more visual pathways, or distinct streams of information, flowing back from the retina (via the optic nerve) into the brain. These interconnected streams flow from the retina to the primary visual cortex and onwards. One stream, leading to the inferotemporal cortex, is known as the ventral stream, and the other, leading to the parietal cortex, is known as the dorsal stream (Shapley 1995).

The dorsal stream projects to areas of the brain that appear to be specialised for the analysis of information about the position and movement of objects. Schneider (1967, 1969) conducted work with hamsters which suggested that there were two distinct parts of the visual system, one concerned with pattern discrimination and the other with orientation in space. He suggested that one system is concerned with the question "what is it?" and the other with the question "where is it?". This led to later work (Ungerleider and Mishkin, 1982) in which the ventral pathway was labelled a 'what' system and the dorsal pathway a 'where' system. Although there is evidence that the dorsal-ventral distinction holds (Courtney et al 1996), a lot of work has questioned whether describing the two streams as 'what' and 'where' is suitable, suggesting that the picture is not quite so straightforward. Milner and Goodale (1995) report a number of studies with a patient, DF, who had suffered severe carbon monoxide poisoning that appeared to prevent her from using her ventral system for analysing sensory input. Their findings suggested that the ventral system is knowledge-based and uses stored representations to recognise objects, whilst the dorsal system appears to have only very short-term storage available (Bridgeman et al 1997; Creem and Proffitt, 1998).

In conclusion, Gibson's approach to perception concentrates more on perception for the purpose of action, whilst Marr's theory was mostly concerned with object recognition. Biederman's model is based on Marr's work, particularly that done by Marr and Nishihara, in trying to explain how we construct a three-dimensional representation of an object. Finally, Riddoch and Humphreys' model is also derived directly from Marr. The stages they identify are very similar to Marr's, modified somewhat in line with what the neuropsychological evidence shows (Braisby et al 2005).

Furthermore, taken together the theories provide both a valid and a complete account of perception as part of our everyday life. The constructivist approach is also concerned more with perception for recognition than with perception for action; it focuses partly on how we might use existing knowledge to work out what an object might be (Braisby et al 2005). Despite the fact that these approaches have their differences, it is unquestionably the case that we need both to recognise objects and to perform actions in order to interact with the environment. Current evidence indicates that a ventral stream leads to the inferotemporal cortex and a dorsal stream to the parietal cortex. This evidence suggests that the ventral stream may be involved in perception for recognition and the dorsal stream in perception for action. In other words, the dorsal stream would be better at dealing with the type of perception addressed by Gibson, and the ventral stream with the type of perception addressed by Marr and the other approaches to perception.