Visual Comprehension
How do comprehenders build up overall meaning representations of visual real-world events? This question was examined by recording event-related potentials (ERPs) while participants viewed short, silent movie clips depicting everyday events. In two experiments, contextually inappropriate information presented in the movie endings evoked an anterior negativity. This effect was similar to the N400 component, whose amplitude has previously been reported to correlate inversely with the strength of the semantic relationship between the context and the eliciting stimulus in word and static picture paradigms. However, a second, somewhat later ERP component, a posterior late positivity, was evoked specifically when target objects presented in the movie endings violated goal-related requirements of the action constrained by the scenario context (e.g., an electric iron, which lacks a sufficiently sharp edge, was used in place of a knife in a bread-cutting scenario). These findings suggest that comprehension of the visual real world might be mediated by two neurophysiologically distinct semantic integration mechanisms. The first mechanism, reflected by the anterior N400-like negativity, maps the incoming information onto the connections of various strengths between concepts in semantic memory. The second mechanism, reflected by the posterior late positivity, evaluates the incoming information against the discrete requirements of real-world actions. We suggest that there may be a tradeoff between these mechanisms in their utility for integrating across people, objects, and actions during event comprehension, in which the first mechanism is better suited for familiar situations, and the second mechanism is better suited for novel situations.
The origins of impaired adaptive functioning in schizophrenia remain poorly understood. Behavioral disorganization may arise from an abnormal reliance on common combinations between concepts stored in semantic memory. Avolition-apathy may be related to deficits in using goal-related requirements to flexibly plan behavior. The authors recorded event-related potentials (ERPs) in 16 medicated patients with schizophrenia and 16 healthy controls in a novel video paradigm presenting congruous or incongruous objects in real-world activities. All incongruous objects were contextually inappropriate, but the incongruous scenes varied in comprehensibility. Psychopathology was assessed with the Scales for the Assessment of Positive and Negative Symptoms (SAPS/SANS) and the Brief Psychiatric Rating Scale. In patients, an N400 ERP, thought to index activity in semantic memory, was abnormally enhanced to less comprehensible incongruous scenes, and larger N400 priming was associated with disorganization severity. A P600 ERP, which may index flexible object-action integration based on goal-related requirements, was abnormally attenuated in patients, and its smaller magnitude was associated with the SANS rating of impersistence at work or school (goal-directed behavior). Thus, distinct neurocognitive abnormalities may underlie disorganization and goal-directed behavior deficits in schizophrenia.
Just as syntax differentiates coherent sentences from scrambled word strings, the comprehension of sequential images must also use a cognitive system to distinguish coherent narrative sequences from random strings of images. We conducted experiments analogous to two classic studies of language processing to examine the contributions of narrative structure and semantic relatedness to processing sequential images. We compared four types of comic strips: (1) Normal sequences with both structure and meaning, (2) Semantic Only sequences (in which the panels were related to a common semantic theme, but had no narrative structure), (3) Structural Only sequences (narrative structure but no semantic relatedness), and (4) Scrambled sequences of randomly ordered panels. In Experiment 1, participants monitored for target panels in sequences presented panel-by-panel. Reaction times were slowest to panels in Scrambled sequences, intermediate in both Structural Only and Semantic Only sequences, and fastest in Normal sequences. This suggests that both semantic relatedness and narrative structure offer advantages to processing. Experiment 2 measured ERPs to all panels across the whole sequence. The N300/N400 was largest to panels in both the Scrambled and Structural Only sequences, intermediate in Semantic Only sequences and smallest in the Normal sequences. This implies that a combination of narrative structure and semantic relatedness can facilitate semantic processing of upcoming panels (as reflected by the N300/N400). Also, panels in the Scrambled sequences evoked a larger left-lateralized anterior negativity than panels in the Structural Only sequences. This localized effect was distinct from the N300/N400, and appeared despite the fact that these two sequence types were matched on local semantic relatedness between individual panels. These findings suggest that sequential image comprehension uses a narrative structure that may be independent of semantic relatedness. Altogether, we argue that the comprehension of visual narrative is guided by an interaction between structure and meaning.
Constituent structure has long been established as a central feature of human language. Analogous to how syntax organizes words in sentences, a narrative grammar organizes sequential images into hierarchic constituents. Here we show that the brain draws upon this constituent structure to comprehend wordless visual narratives. We recorded neural responses as participants viewed sequences of visual images (comic strips) in which blank images either disrupted individual narrative constituents or fell at natural constituent boundaries. A disruption of either the first or the second narrative constituent produced a left-lateralized anterior negativity effect between 500 and 700 ms. Disruption of the second constituent also elicited a posteriorly distributed positivity (P600) effect. These neural responses are similar to those associated with structural violations in language and music. These findings provide evidence that comprehenders use a narrative structure to comprehend visual sequences and that the brain engages similar neurocognitive mechanisms to build structure across multiple domains.
Background: People with schizophrenia process language in unusual ways, but the causes of these abnormalities are unclear. In particular, it has proven difficult to empirically disentangle explanations based on impairments in the top-down processing of higher-level information from those based on the bottom-up processing of lower-level information. Methods: To distinguish these accounts, we used visual world eye-tracking, a paradigm that measures spoken language processing during real-world interactions. Participants listened to and then acted out syntactically ambiguous spoken instructions (e.g., “tickle the frog with the feather”, which could either specify how to tickle a frog, or which frog to tickle). We contrasted how 24 people with schizophrenia and 24 demographically matched controls used two types of lower-level information (prosody and lexical representations) and two types of higher-level information (pragmatic and discourse-level representations) to resolve the ambiguous meanings of these instructions. Eye-tracking allowed us to assess how participants arrived at their interpretation in real time, while recordings of participants’ actions measured how they ultimately interpreted the instructions. Results: We found a striking dissociation in participants’ eye movements: the two groups were similarly adept at using lower-level information to immediately constrain their interpretations of the instructions, but only controls showed evidence of fast top-down use of higher-level information. People with schizophrenia, nonetheless, did eventually reach the same interpretations as controls. Conclusions: These data suggest that language abnormalities in schizophrenia partially result from a failure to use higher-level information in a top-down fashion, to constrain the interpretation of language as it unfolds in real time.