“The question of why we see things the way we do in large measure still eludes us” (p. 341). This article explores the elusive relationship between vision and cognition. Is what we see solely a function of the stimulation we receive on our retina, or is it “cognitively penetrable” and influenced by what we “expect” to see? Furthermore, does what we see change our beliefs and representations of the world? Pylyshyn’s hypothesis is that early vision is the stage of vision in which computational and top-down processing produces a 3D image, and that this stage is impenetrable to cognitive influence. Cognition is instead confined to two points: 1) pre-perception allocation of attention and 2) post-perception pattern-recognition decisions.
Questioning the Continuity Thesis
Pylyshyn puts forward four reasons for questioning whether vision is continuous with cognition. First, he argues that perception is resistant to rational influence. To support this claim, he cites optical illusions: even when you “know” two lines are the same length, you still perceive them as unequal. Second, he explains that the principles of perception differ from the principles of inference, which follow rational rules of reasoning. Principles of perception are responsive only to visually presented information; they do not reflect simplicity. For example, even if parts of an image are blocked, we can picture what is behind the blocked areas quite accurately. On this view, these principles are “insensitive to knowledge, expectations, and even to the effects of learning” (p. 345). Third, Pylyshyn cites clinical evidence from neuroscience that suggests at least a partial dissociation of cognitive and visual functions. Finally, Pylyshyn presents methodological arguments that acknowledge the observed effects of expectations, beliefs, and contextual cues, but assign these influences to stages of processing that lie outside of what is called “early vision” (p. 345).
Arguments For/Against Continuity
Pylyshyn surveys three fields that have also examined the continuity thesis. First, he considers progress in artificial intelligence, in which the goal is to design a system that can “see.” The major progress to come out of this field has been the development of a knowledge-based (or model-based) systems approach. These systems use stored general knowledge about objects to help determine whether a given object appears in the field of view. Though this approach supports the continuity thesis, Pylyshyn argues that, in addition to these systems, there needs to be development of systems that contain constraints on interpretation, which are congruent with the impenetrability thesis.
Next, he introduces discoveries in neuroscience indicating that attention can “sensitize or gate the visual field” (p. 347). This research partially favors the continuity thesis, as there is evidence for top-down effects that modulate attention. However, there have been no discoveries of cells that influence the interpretation or emotional part of vision. Pylyshyn argues that what these findings illustrate is a pattern/motion response, not a content (cognitive) response.
Finally, Pylyshyn offers examples from clinical neurology in which pathology, specifically visual agnosia, strongly indicates a dissociation of vision and cognition. When one of the systems is impaired, the other continues to function. This evidence contradicts the continuity thesis. Pylyshyn concurs with this evidence, although he admits that these observations could certainly be correlational, not causal.
Determining Visual Stages: Methodological Issues
Pylyshyn acknowledges that all of the evidence above has been highly debated because of the complexity of the visual process. He highlights the importance of clarifying the stages of vision in order to interpret these and other findings, and he reviews the methodological problems associated with distinguishing those stages. Signal detection theory explores the ways in which humans make decisions about and respond to stimuli, such as a faint tone. Two stages have been defined: a “perceptual” stage that recognizes a stimulus (cognitively impenetrable), and a “decision” stage that formulates a response (cognitively penetrable). The former is represented by the sensitivity parameter (d′), which represents the statistical relationship between the presence of the tone and a person experiencing the tone. The latter is represented by the response bias or criterion measure (β), which represents the statistical relationship between the presence of the tone and the formulation of a response. Although this interpretation locates cognitive effects in a post-perceptual stage, Pylyshyn argues that the decomposition is generally too coarse to establish accurately where cognitive influences play a role.
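The distinction between sensitivity and response bias can be made concrete with a small calculation. The sketch below is not taken from Pylyshyn's article; it assumes the standard equal-variance Gaussian model of a yes/no detection task, and the function names (rates_for, sdt_measures) and the example values are hypothetical. It shows that shifting only the decision criterion changes how often an observer says "yes" while leaving d′ untouched.

```python
from math import exp
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def rates_for(d_prime, criterion):
    """Hit and false-alarm rates of an idealized observer.

    Assumes the equal-variance Gaussian model: noise ~ N(-d'/2, 1),
    signal ~ N(+d'/2, 1), and the observer says "yes" whenever the
    internal response exceeds `criterion` (0 means unbiased).
    """
    hit_rate = _STD_NORMAL.cdf(d_prime / 2 - criterion)
    false_alarm_rate = _STD_NORMAL.cdf(-d_prime / 2 - criterion)
    return hit_rate, false_alarm_rate


def sdt_measures(hit_rate, false_alarm_rate):
    """Recover (d', criterion c, beta) from hit and false-alarm rates."""
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    d_prime = z_hit - z_fa              # sensitivity
    criterion = -0.5 * (z_hit + z_fa)   # response bias (location)
    beta = exp(d_prime * criterion)     # response bias (likelihood ratio)
    return d_prime, criterion, beta


# Two hypothetical observers with identical sensitivity. The second one
# expects a tone and therefore adopts a more liberal criterion.
for label, c in (("neutral", 0.0), ("expects tone", -0.5)):
    h, f = rates_for(1.5, c)
    d, crit, beta = sdt_measures(h, f)
    print(f"{label}: hits={h:.3f} false alarms={f:.3f} "
          f"d'={d:.3f} c={crit:.3f} beta={beta:.3f}")
```

On this model, an expectation that raises both the hit rate and the false-alarm rate is attributed entirely to the criterion. Pylyshyn's objection is that a shift that does show up in d′ still cannot be pinned to early vision, because the "perceptual" stage as measured here lumps together everything before response selection.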
It is important to note that the detection of a signal (sensitivity) is usually determined through a response (criterion measure), so this method also fails to isolate where cognition enters. One way to compensate for this is to measure event-related potentials, which allow stimulus evaluation to be measured uncontaminated by the response decision-making process. Pylyshyn finds fault with this method as well, because the stimulus evaluation it measures encompasses everything except the response selection process, including memory retrieval for recognition, decisions, and inferences. He argues that “we need to make further distinctions within the stimulus evaluation stage so as to separate functions such as categorization and identification, which require accessing memory and making judgments, from functions that do not” (p. 351).
He concludes this section by further questioning the relationship between the stages of perception and sensitivity, which remains rather inconclusive. What is needed is a mechanism that can account for specificity in our sensitivity, that is, some sort of filtering that shapes the hypothesis-generation stage of visual perception.
Constraints and Attention
Pylyshyn discusses several examples in which vision “appears on the surface to be remarkably like cases of inference” (p. 354). In these cases the visual system appears to “choose” one interpretation over other possible ones, and the choice appears remarkably “rational” (p. 354). However, Pylyshyn insists that the examples do not actually constitute cognitive penetration, for two reasons. First, he argues against cognitive penetration of natural constraints, which “are typically stated as if they were assumptions about the physical world” (p. 354). Natural constraints fail to demonstrate inference because the visual system evolved to work as it does, and the principles of the visual system are internal to the system. They are neither sensitive to beliefs and knowledge, nor accessible to the cognitive network (p. 355). Next, Pylyshyn argues that cases of so-called perceptual “intelligence” and “problem-solving” also fail, and for the same reasons as natural constraints (p. 355). Specifically, the visual system often fails a simple test of rationality: its interpretations can conflict with certain basic facts about the world known to every observer.
How Knowledge Affects Perception
Pylyshyn discusses apparent counterexamples to the discontinuity thesis. While he acknowledges that these cases show an effect on response time, he asserts that the improved response is the result of knowing where to look or what to look for, which limits cognition to the pre-perceptual and post-perceptual stages. Hints for finding meaningful images do not actually help, and they do not affect the content that is seen (a change in content is what cognitive penetration requires). “Expert perceivers” (p. 358) appear to have knowledge that lets them perceive certain patterns better and faster, but this seems to result solely from learned classification of visual patterns, which enhances recognition and identification. He argues that this is part of the post-perceptual process. Additionally, although findings show that what people see is altered through experience, he argues that cognition plays a role only before early vision, by indexing spatially relevant locations. This development of focal attention is an important mechanism by which vision adapts to the transient external world, and it represents the main interface between vision and cognition. Further, he allows that cognition can influence post-perception decision making based on knowledge and experience (although with practice, this can become automated and cognitively impenetrable).
Output of Visual Systems
Pylyshyn makes the case for the visual system being a single system with two outputs, neither of which is knowledge-dependent.
He defines “early vision” from a functional perspective, as the “attentionally modulated activity of the eyes” (p. 361). In this way, he acknowledges that early vision happens after attentional gating and depends on inputs from other modalities, including non-retinal spatial information. Pylyshyn presents research showing that the output of the visual system comes in categories, such as shape-classes. He outlines that we form a 3D representation of surfaces (independent of knowledge), encode the layout of a scene (again, without knowledge or reasoning), and perceive a set of surfaces in depth. However, he stresses that “computing what the stimulus before you looks like…does not itself depend upon knowledge” (p. 361). Identifying a shape is not the same as recognizing what it actually is. For that, we need to draw upon other knowledge in memory and perform top-down processing, which can form a bridge between seeing the shape and knowing what it represents.
A second type of visual output is that which affects motor actions. This output, too, Pylyshyn argues, operates separately from cognition, and he uses examples from clinical neurology to illustrate the “fractionation of the output of vision” (p. 362) that allows patients to act as if they can see even when they don’t “think” that they can.
Questions
- If vision is not continuous with cognition, does that suggest that perception cannot constitute basic beliefs, thus undermining Foundations Theory?
- Is Pylyshyn’s hypothesis consistent with Traditional Epistemology or Natural Epistemology?
- Given the information put forth in this article, as scientists, do you think that we can accurately “observe” any visual data?
- Do you believe that there are two separate forms of visual output? If so, where do you think the motor-function output is derived from?