Introduction

Pylyshyn’s article “Is vision continuous with cognition? The case for cognitive impenetrability of visual perception” (1999) received a large number of commentary responses. Some commentators agreed with Pylyshyn’s claim that one aspect of vision (“early vision,” as Pylyshyn dubbed it) is independent of cognitive penetration, but many raised objections. He observed that, of the 40-odd commentaries, there seemed to be three general response themes. 1) There really is no point in distinguishing vision from cognition, either wholly or partly, for a host of reasons ranging from unimportance (with the argument that consciousness or conceptualism are more important topics) to the notion that it will always be impossible to decide or prove whether Pylyshyn is correct; 2) Our current understanding of neurophysiology does not fully support Pylyshyn; 3) Pylyshyn’s idea is mostly valid, but its reasoning could be tightened by drawing on other points. Pylyshyn then addresses his commentators in the following six sections.

Distinctions and decidability

Pylyshyn defends his premise that vision is a simple structure: it is binary, and it is an input-output system (402, 404). While Uttal and Dresp argue that vision cannot possibly be as simple as input and output due to the complexity of the brain (which also makes it immeasurable), Pylyshyn replies that even complex structures can be broken down and that, although we may not currently have the tools to measure the visual pathway’s binary nature (402), that does not mean the empirical science will never be developed or that the premise will remain undecidable. As for vision as a binary, Pylyshyn asserts that a modal system would require the representations and knowledge developed by the visual system to make a modal model, yielding cognitive penetration (404). That visual “representations require representation-interpretations processes” (403) is a circular argument that Pylyshyn disassembles with his idea that representations do not yet occur in early vision because representations are cognitively penetrable. Likewise, he asserts that visual apprehension is not knowledge-dependent (as Pani and Schyns suggest) because of the neural pathway it takes; early vision itself does not interact with cognitive pathways. Rhodes and Kalish’s method of distinguishing early vision from post-visual decision, Signal Detection Theory (SDT), is not effective enough to support or deny Pylyshyn’s argument (404). Additionally, the color testing proposed by Schirillo does not control for the cognitively penetrable effects of memory, which nullifies his claim that color memory penetrates early vision (404-405).

 

Top-down effects versus cognitive penetrability

Once again, Pylyshyn fields heavy criticism that he underestimates the complexity of the brain. He replies to Tsotsos, Grunewald, and Bullier’s critiques that early vision is not exempt from top-down effects by asserting that in a complex system, of course there are “interlevel influences” (405), but these cognitive biases are formed before or after early vision and therefore are not pure early vision and cannot affect the raw perception of stimuli. Grunewald continues to argue that delayed stimuli cognitively penetrate early vision (406), although this requires a form of memory which Pylyshyn previously argued is separate from early vision (404-405). Tsotsos brings forth the concept of other sensory modalities as penetrators of early vision, causing a cross-modality integration that founds cognition; Pylyshyn is quick to dismiss this as not occurring at the early vision level and, once again, denies the modality aspect (406). Attention, a subject to which he devotes a section later, is a cognitive intervention that directs focus, but it does not change perception within that visual field and is not a top-down process (405). Similarly, he denies that attention shifts and anticipation are cognitive penetrators; they only direct the focal point and act as pre-visual effects. Neuroimaging evidence cited by Rhodes and Kalish seems to imply that activation of the V1 region during vision is indicative of cognitive penetration, but Pylyshyn says, “There remains a big gap between showing that there is activity in a certain area during both vision and mental imagery and showing that the activity represents the influence of cognition on perceptual content” (406).

Where is the boundary of early vision?

Several commentators (e.g. Cavanagh, Bermudez, Gellatly) have suggested that the definition of early vision should be made clearer by examining the role of conscious and unconscious cognition. Noe and Thompson in particular believe Pylyshyn is implying that all of vision must be unconscious if it is cognitively impenetrable. However, Pylyshyn believes that what separates early vision from the other stages of vision does not involve a criterion of unconsciousness. He does not commit to any opinion about the relationship between consciousness and vision because there is nothing conclusive about the nature of consciousness. He believes that some neural computations by the brain are available to the subject in the form of consciousness and some are not.

 

Aloimonos and Fermuller suggested that Pylyshyn did not expand upon early vision enough to define its boundary, such as by including “depth, orientation, and aggregation information concerning visible surfaces” (408). Pylyshyn agrees with these suggestions, among others, and states that most visual properties detected by a sensitive visual system, like luminance and texture, are not available to cognition.

Other commentators like Hollingworth and Henderson, Rosenfeld, and Sanocki provide supporting remarks that the boundary of early vision might actually be occurring at a higher level than Pylyshyn originally proposed. Hollingworth and Henderson propose that object recognition involves the pre-semantic matching of the information from early vision and memory which can occur in early vision. Rosenfeld and Sanocki suggest that early vision may have a built-in memory database already embedded that allows for ease of rapid recognition. Pylyshyn agrees with their idea that the boundary of early vision may be expanded but asserts that he sees no need for the involvement of memory. Instead, he proposes that early vision assigns objects into categories based on similar visual properties, but such categorization is not and does not require cognitive memory because there is no judgment being made about the object itself.

 

Papathomas, in his commentary, provides a visual example in which observers can have alternate perceptions of 3D shape representations, but only after being given additional suggestions. He takes this to show that cognition does influence early vision, because one of early vision’s outputs is 3D shape representation. Pylyshyn finds his example interesting but sees it more as cognitive influences reaching the post-perceptual phase of vision. For ambiguous shapes and figures, early vision might provide the possible visual interpretations, but selecting one interpretation over another is beyond the scope of early vision.

Peterson argues that there are many types of cognition, and there is evidence to suggest that some types influence vision, particularly subsets of intention and knowledge. In one experiment, she used stereograms to show that intention affects early vision and not post-perceptual decisions or eye movement. Pylyshyn interprets her findings to suggest that focal attention in the form of intention is occurring outside of early vision. Early vision, he reiterates, is generating alternative interpretations, but it is focal attention that mediates the perception.

 

The nature and role of attention

The idea of attention seemed to be more of a semantic debate with Pylyshyn’s commentators. Sowden, Cohen and Kubovy, Schyns, and Rhodes and Kalish struck up conversation on the definition and usage of attention. To summarize Pylyshyn’s replies: attention is object-based, does not require representations to define itself, and must be defined simply in order to avoid a cognitive penetration problem.

Pani brings up the question of how spatial phenomena relate to attention. Pylyshyn admits that we do not have a complete grasp of focal attention, but given that patterns of perception are predictable, attention can follow suit (410). Modalities of focal vision and attention are not cognitive penetration, as Yen and Chen suggest, since they are not used in depth until later processing (410), and Egeth’s suggestion of pictures as localization cues is simply that: cues for attention, not cognitive penetration (411).

 

Comments on the role of natural constraints

Dannemiller and Epstein seem to think that Pylyshyn uses natural constraints to explain away the inverse problem of vision, and they object that the natural constraints he describes cannot explain complex perceptual phenomena. Pylyshyn admits that individual constraints can be overcome but maintains that natural constraints are just one solution to the inverse vision problem.

Yu believes that “it is likely that by embodying reliable relationships in the world, perceptual processes can gain greater efficiency with minimal cost,” supporting Pylyshyn’s proposition of natural constraints (401).

Hollingworth and Henderson argue that context effects should be considered as having the same status as natural constraints because contextual environments can also have an effect even if the resulting representation is inaccurate. Pylyshyn disagrees and says that a natural constraint holds true, with some exceptions, because it is part of the vision system architecture. A context effect, on the other hand, is not built into that structure in the way natural constraints are.

 

Subsystems of vision, other modules, and other modalities

As Vallortigara pointed out, early vision must consist of independent sub-processes that interact with each other to create a holistic interpretation of visual input. Others, such as Bowers and Bruce et al., offer examples of how early vision can be cognitively impenetrable and also interact with other impenetrable systems, such as language and facial recognition.

Gentaz and Rossetti take a different stance by relating Pylyshyn’s theory of discontinuity between vision and cognition with haptic perception. They assert that the cognitive impenetrability of haptic perception suggests a possibility that the sensory systems in general are all impenetrable. Pylyshyn takes issue with their argument for several reasons. 1) He does not see how his thesis is related to their observation because it seems to him that the manual moving of one’s hands across surfaces and objects is an example of cognitive effect. 2) He did not claim all sensory systems are impenetrable. 3) Haptics does not appear to be a single perceptual modality.

 

Conclusions

Pylyshyn does not deny that cognition influences vision at specific stages for perception to arise; he claims only that a portion of the vision system known as early vision is independent of aspects of cognition. He ends by saying that what he proposed is a working hypothesis, by no means a final analysis of the subject matter, and that only further research will be able to elucidate the relationship between vision and cognition.

 

Questions

  1. If early vision can form basic beliefs, is it incorrigibly or prima facie justified? How do visual disorders factor into justification?
  2. Pylyshyn shies away from the topic of color in early vision, and instead takes on color memory. Does it seem plausible for early vision to contain not only geometric capacities but color capacities as well? What would this mean for cognitive penetration?
  3. The relationship between early vision and memory is a point of contention in the commentary and Pylyshyn’s response. Taking into consideration what he says are the outputs of early vision, is memory isolated from early vision?
  4. Does Pylyshyn’s discontinuity theory between vision and cognition give grounds to naturalize epistemology? If so, in what way (e.g. replace, transform, separate)?

10 thoughts on “Pylyshyn: Commentaries and Pylyshyn’s Response – Annie Ly and Chelsea Montello”

  1. Similar to some of my peers’ comments, I would like to further discuss the role of consciousness and unconsciousness in attention, perception, and cognition, primarily in early vision. Gellatly states on page 378 that “what you expose your sense organs to determines your conscious experience”, but if, for example, “neurons in the inferior temporal cortex respond to auditory stimuli only if they are paired with visual stimuli and not otherwise” (380), what does this tell us about the role of our conscious and unconscious perception and its relationship to “selective attention” and early vision? How selective is one’s attention allowed to be, given its linkage to perception and cognition? How does that “selectivity” link back to consciousness/unconsciousness?

    I found Bowers’ commentary particularly interesting, for it refers to vision and visual representations outside of geometrical shape categorizations. Specifically, “perceptual categories for words are not structured on the basis of geometrical similarity and must hence lie outside early vision” (368). With this in mind, I’m also curious about where the boundaries, or the simple categorization, of early vision lie. Is the bi-directional relationship between orthography and phonology that Bowers provides a “simpler” example of early vision, or do you believe it actually is “outside early vision”? Identification, even at a basic level, has been at the foundation of the arguments about what constitutes early vision (basic identification of a shape in early vision; identification of different visual pattern maps). This raises questions about the role of attention and cognition given the role of perception and early vision categorization. With this in mind, I would agree with my peers in questioning not only the impenetrability of early vision, as Pylyshyn argues, but also the definition of early vision.

    Pylyshyn also “links cognition and cognitive-style processes solely to consciousness and to reportable knowing” (371). However, as Bullier states, evidence from neuroscience indicates that early visual responses are strongly modulated by the intention of the individual to make a movement to a target or by the categorization of visual stimuli (370). Once again, however, we can question the criteria for distinguishing these processes, such as the role of consciousness and unconsciousness. Although I think Pylyshyn appropriately covers most of the peer commentators’ thoughts, I would agree that he dismissed many borderline concepts that need clarification and further study.

  2. I’m going to further elaborate on Devon and Porter’s ideas that there may be a lack of understanding of the sensational and perceptual pathways that leads to different operational definitions of early vision. Dresp claims: “[evidence for early vision] is impossible to provide because we can only infer what might happen at these earlier levels of processing on the basis of evidence collected at the post-perceptual stage” (375). Schyns points out that there may be another function of early vision that is indeed cognitively penetrable, called scale perception (394). In this function, spatial filtering is shown to be altered by cognitive influence, as illustrated by the examples in Schyns’s Figure 1. Pylyshyn responds at first by saying that this all depends on how one defines early vision (403) and later that he categorizes focal attention differently from Schyns (411). Rather than consider the possibility that early vision is more complex than originally proposed, which would make it susceptible to cognitive penetration, he simply asserts his definition to solidify his point. With so many proposed ideas of an early vision without a complete understanding of when it occurs in signal detection, can Pylyshyn fairly contend that “early vision” is impenetrable? Does there need to be a fully understood concept of perception before defining where cognition plays a role, or can there be several definitions working toward understanding if cognition plays a role in vision?

    Gentaz and Rossetti bring up a point that was raised in our class about other sensations possibly being penetrable (378). Pylyshyn fairly contests this idea by considering it irrelevant to his argument about vision (413), but this brings up some new ideas: Can other senses such as the haptic sense lead to basic beliefs if vision can’t?

  3. Of the numerous and varied peer responses to Pylyshyn’s original target article “Is Vision Continuous with Cognition?”, many left me with hanging questions, but I’ll address just one of them: Cavanagh’s “The cognitive impenetrability of cognition” on pages 370-371. Cavanagh states that by Pylyshyn’s definition, perception is irrational because it can draw a conclusion despite that person having contrary knowledge (370-371), as is the case with many visual illusions. He uses this argument to draw a sharp distinction between the visual system (including perception) and cognition. Cavanagh’s distinction is this: the visual system is separate from cognition due to the higher speed with which it can process information (371). Visual systems, he states, can sense, integrate knowledge, and form perceptions very quickly, whereas cognitive processes perform the same operations but do so much more slowly (371). If visual systems can integrate knowledge to form perceptions, how is this different than cognition “penetrating” vision? Is this simply a top-down versus bottom-up distinction? If so, it seems that vision is an operation that does involve cognition, but whether cognition is the start or end point of any perception arising from vision is a binary distinction rather than a debatable theory.

    To this, Cavanagh replies that “clearly, perception and cognition have access to different knowledge bases” because things we are consciously aware of (that is, able to articulate) don’t directly influence perception (371). He is arguing that the top-down and bottom-up processes are not drawing from the same “knowledge bases”, that they are not simply the same mechanism running in opposite directions. Pylyshyn’s definition of cognition as given by Cavanagh is interesting to contrast. Pylyshyn’s cognition is “when a system’s output can be altered in a way that bears some logical relation to what that person knows” (371). I interpret the “system output” to mean vision, and the alteration being cognition. This is a definition of top-down processing of visual stimuli.

    I agree with Crassini, Broerse, Day, Best, and Sparrow (372-373) that the arguments of Cavanagh and Pylyshyn create a distinction where there does not need to be one (372). The brain is one mechanism, and though there are a number of pathways contained within its systems, all knowledge ultimately draws on the same brain. Additionally, Crassini et al. identify a crucial problem that would arise if one set out to test Pylyshyn’s hypothesis empirically: one must first delineate cognitively penetrable and noncognitively penetrable processes in order to test them separately, and so would run into a circular problem (372). Due to these difficulties, is such an experiment “ultimately futile” as Crassini et al. believe it to be?

  4. I, like Porter, was also troubled at the way Pylyshyn so cavalierly dismissed the arguments of Cavanagh and others who criticize his lack of discussion of the role of consciousness in perception, cognition, and early vision itself. Pylyshyn argues that vision is not always rational and that this evidence supports his theory that a part of visual processing is cognitively impenetrable. He uses the example that, despite “knowing” that the two lines of the Müller-Lyer illusion are the same length, we persist in seeing them as two different lengths. Cavanagh argues that it can only be “the verbally responsive, conscious part of the person that ‘knows’ the lines are equal” (371) because the visual system would report that the two lines are different lengths. He concludes that Pylyshyn’s above argument has linked cognition and its processes solely to consciousness and reportable knowing (371). While Pylyshyn does make a strong argument against Cavanagh that what we “know” cannot be understood simply as what we are conscious of or what we can explain, I do believe that Cavanagh is making arguments that are worthy of and relevant to the discussion. Can we truly talk about perception and cognition without also talking about consciousness? Would the processes of early vision be considered unconscious or conscious?

    I too believe that the nature of consciousness, and its relationship to perception and cognition, warrants more study, mostly because I found myself confused and bombarded by differing opinions as I read this piece. Cavanagh states that, “cognition is [not] restricted to conscious events” (371), but Noe & Thompson say, “if the output of early vision were conscious, it would be cognitively impenetrable” (407). Are all conscious thoughts and actions therefore considered cognition but not all aspects of cognition considered conscious? As Porter already stated, it would have been nice to have Pylyshyn step in and discuss these opinions, but he was not interested.

    Finally, I also liked Pylyshyn’s connection of early vision to Rosenfeld’s proposed generalized model. He concludes that the computations of shape-classes by early vision operate precisely as Rosenfeld’s model, which is by “rapidly extract[ing] from our visual images features that have sufficiently rich descriptions to be useful for indexing into long term memory” (409). In other words, I benefited from the explanation that the role of early vision is to provide us with shape-classes that reduce the time it takes for us to recognize an object once the output from early vision moves into our post-perceptual vision and becomes cognitively penetrable. Despite this initial agreement, Rosenfeld believes that this process could “involve cognitive penetration into the early vision system itself” (409), a statement that Pylyshyn dismisses based on empirical evidence that he does not go on to describe. This leaves me with a couple questions. Are early vision and post-early vision really as distinct as Pylyshyn describes? Do you not need past memories or representations formed by cognition to create shape-classes? Could this be evidence that somehow these two parts of visual processing permeate and influence each other?

  5. Shifting gears somewhat…

    I was particularly intrigued by Bruce, Langton, and Hill’s discussion of face perception, which appears early in the Open Peer Commentary section. If there indeed exists a primitive face-detecting system, is this simply an extension of what Bermúdez terms non-conceptual computation? (Representations of one’s environment are non-conceptual if they do not require use of classificatory/recognitional abilities. So, an infant’s extending his/her arm towards an object, for instance, requires no cognitive concept of distance.)

    In the context of face detection, then, is any face merely reducible to such non-conceptual parameters as position, contour, color, size, etc.? That seems unconvincing, as it’s unlikely that all possible variations of faces, including those we’ve never encountered, would be “programmed” into our early vision system. (Right?) Yet minutes-old infants can track face-like patterns with their eyes—without possessing any beliefs or real knowledge of the world “out there.” (This observation is consistent with the evolutionary arguments Pylyshyn alludes to in the target article.)

    I’m having a lot of trouble reconciling these two points. Perhaps there exists a primitive face-detecting system, but it is updated with experience? That would mean that either EV is indeed cognitively penetrable, or that the face-detection system is not located in EV. What do you guys think?

  6. While I believe Pylyshyn generally provided a respectable and thorough response to his peer commentators, I struggled with his claim that he has “very little to say to those who would rather focus on borderline cases and on the problem of providing necessary and sufficient conditions for the distinction” (402). As I interpreted this, the distinction he is speaking about is between stages of vision that are cognitively penetrable, and those that are not. As borderline cases often bring up fine-detail questions that help to think more deeply about the differences between the categories, aren’t borderline cases, such as the case of the blind spot that was discussed in Tuesday’s class, critical for determining whether or not there is any stage of vision that is not cognitively penetrable?

    Like Steven and Marisa, I find his assertion that shape identification during the early vision stage can occur without any sort of cognition quite troubling. As Peterson suggests on page 389, “early vision does seem impenetrable by beliefs and inferences, but it seems quite penetrable by some types of knowledge and perceptual goals.” Some of her experiments demonstrated that early vision involved in processes of depth segmentation required the knowledge and memory of the structure of familiar objects. Therefore, I question Pylyshyn’s argument that early vision is completely impenetrable. If this is the case, what kinds of knowledge and cognition are influential in the stage of early vision?

    Finally, Pylyshyn is very quick to dismiss the neuroimaging data presented by Rhodes and Kalish, which show activation of visual areas and are offered as empirical evidence that vision is penetrable. While I believe that the time at which this activity begins and ends needs to be more clearly defined according to the stages of vision that Pylyshyn presents, I think it is important to consider these data, as neuroimaging is one of the few ways in which we can observe neurons firing. It is also important to ask: how does neuronal firing relate to cognition? Is it cognition?

  7. I’m left feeling a rather naïve reader. I was at first overwhelmed and convinced by Pylyshyn’s target article, then dismayed and defeated by his detractors, and again swept away by his resounding dismissals of them. What a roller coaster. However, there were two issues I do not think Pylyshyn adequately addressed.

    First, Pylyshyn’s statement “The only conclusion about consciousness I am committed to is that we can make nothing of it for the purposes of building a scientific theory because we have no idea what it is” (p 407) seemed a bit cavalier to me. How can he be unwilling to even engage in an exploration of the topic, given how stridently he dismissed arguments against his theory that questioned the empirical decidability of aspects of his thesis, saying “I find that entire line of reasoning to be both otiose and tiresome” (p 401)? Is science knowable through research or not? Seems like he can’t say one is worth exploring and the other is unknowable.

    I do think distinctions of consciousness and perception are relevant to the discussion, and that more study is warranted, as called for by Cavanagh: “There are undoubtedly profound differences between vision and cognition but Pylyshyn has not identified differences in process, only differences in access to knowledge. What is needed is a description of the specific procedures which are unavailable to unconscious vision.” (p. 371). Perhaps I am an “optimistic Natural Epistemologist,” but I think we should continue to push science to find these answers, and believe that we will inch ever closer to understanding. Again, I wonder what additional scientific research has unveiled even since these articles were published in 1999.

    Second, I agreed with Peterson’s assertion that cognition comes in many varieties (knowledge, beliefs, goals/intentions, inferences) and that more study is needed to clarify cognition’s effects on stages of perception (p 389). Pylyshyn dismisses Peterson’s argument by saying she agrees with him, and indeed that he used her work to develop his hypothesis. But he fails to acknowledge that her argument wasn’t about whether or not cognition affected focal attention but about the boundaries between EV and pre-perceptual focusing, as well as recognition that different types of cognition (knowledge, beliefs, goals/intentions, and inferences) may have different degrees of influence all along the path of vision from pre-perceptual to output.

  8. In the target article, Pylyshyn contends that the information provided by early vision is independent of cognitive factors. While Rosenfeld agrees that this is reasonable for “general” or “default” information (useful for such tasks as gaze control, motor control, and navigation), he argues that more specialized information (needed for recognizing objects) is cognitively penetrable. Rosenfeld argues that this specialized information cannot be “found solely through cognitive control of focal attention and stimuli identification” and that “even the initial detectability of the features may depend on expectations” (391-392). Therefore, cognitive penetration may occur in the early vision system.

    Like Steven mentioned in his post, I too am curious about where the boundary between early vision and later cognitive processing lies. In the summary paper, Annie and Chelsea mention, “Hollingworth and Henderson propose that object recognition involves the pre-semantic matching of the information from early vision and memory which can occur in early vision. Rosenfeld and Sanocki suggest that early vision may have a built-in memory database already embedded that allows for ease of rapid recognition.” Does early vision construct a “shape-class” or “shape-category”? Furthermore, is it this organization that may make a “built-in memory database” and rapid object recognition possible? If so, is this memory database accessed during early vision?

    I am particularly interested in Rosenfeld’s commentary because he uses model-based computer vision systems to argue the need for the construction of these shape classes in early vision. From my understanding of computer science, without creating these classes, it would be nearly computationally impossible to look up all visual information in a memory database. These classes reduce the number of objects that need to be looked up and compared to stored descriptions of “known” objects. However, is important information needed for object recognition lost when these classes are constructed?

  9. The flaw in Pylyshyn’s argument is not that what he is proposing is entirely fallacious, but instead just the time frame in which he is proposing cognitive processing occurs. While the timing of the process is inadequate, I am curious as to how we can concede that because one element of perception is cognitively impenetrable, there exist drawbacks with knowledge-based theories of perception.

    First, while I would like to believe in Pylyshyn’s differentiation between early vision and cognitive processing, Hollingworth and Henderson clearly assert that what we consider to be post-cognitive processes are capable of occurring during early vision. The authors communicate, “We propose that a constructed object description is matched to stored descriptions pre-semantically, with a successful match allowing access to semantic information about that object type” (381). If we concede that a post-cognitive process is occurring during the early vision stage, are we also acknowledging that the early vision stage is not purely physical, and is possibly even subjective? Does this mean that even early vision may be cognitively penetrable?

    Furthermore, even if the above argument is not valid, what makes the cognitive impenetrability of a sensation equal to the cognitive impenetrability of perception? Moore proposes, “This is because “early vision” – the only aspect of visual processing that was defended as cognitively impenetrable – is only one component of visual perception, and it is not necessarily the one with which knowledge-based theories are concerned. Thus, no matter how successful an argument concerning the cognitive impenetrability of early vision may be, it would fail to undermine the basic tenets of most knowledge-based theories of perception” (385). Why do we reduce knowledge-based theories of perception when there is no clear evidence that early vision is perception?
