Computational model
The extent to which language processing involves prediction of upcoming inputs remains a question of ongoing debate. One important data point comes from DeLong et al. (2005), who reported that an N400-like event-related potential correlated with a probabilistic index of upcoming input. This result is often cited as evidence for gradient probabilistic prediction of form and/or semantics before the bottom-up input becomes available. However, a recent multi-lab study reported a failure to find these effects (Nieuwland et al., 2017). We review the evidence from both studies, including differences between their designs and analysis approaches. Building on over a decade of research on prediction since DeLong et al.’s (2005) original study, we also begin to spell out the computational nature of the predictive processes that one might expect to correlate with ERPs evoked by a functional element whose form depends on an upcoming predicted word. For paradigms with this type of design, we propose an index of anticipatory processing, Bayesian surprise, and apply it to the updating of semantic predictions. We motivate this index both theoretically and empirically. We show that, for studies of the type discussed here, Bayesian surprise can be closely approximated by another, more easily estimated information-theoretic index: the surprisal (or Shannon information) of the input. We re-analyze the data from Nieuwland and colleagues using surprisal rather than raw probabilities as an index of prediction. We find that surprisal is gradiently correlated with the amplitude of the N400, even in the data shared by Nieuwland and colleagues. Taken together, our review suggests that the evidence from both studies is compatible with anticipatory semantic processing. We do, however, emphasize the need for future studies to further clarify the nature and degree of form prediction, as well as its neural signatures, during language comprehension.
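To make the relationship between the two indices concrete, the following minimal sketch (with hypothetical cloze values, not data from either study) computes both quantities for a toy a/an paradigm. Bayesian surprise is taken as the Kullback-Leibler divergence from the prior distribution over candidate nouns to the posterior after observing the article; surprisal is the Shannon information of the article itself. When the article’s form is fully determined by the predicted noun, as assumed here, the two quantities coincide, illustrating the approximation described above.

```python
import numpy as np

# Hypothetical cloze distribution over candidate nouns given the context.
nouns = ["kite", "airplane", "umbrella"]
prior = np.array([0.6, 0.3, 0.1])  # p(noun | context)

# Deterministic form likelihood: "a" precedes consonant-initial nouns,
# "an" precedes vowel-initial ones ("kite" -> "a"; the others -> "an").
p_a_given_noun = np.array([1.0, 0.0, 0.0])

# Marginal probability of the article "a", and its surprisal in bits.
p_a = np.sum(prior * p_a_given_noun)
surprisal = -np.log2(p_a)

# Posterior over nouns after observing "a" (Bayes' rule).
posterior = prior * p_a_given_noun / p_a

# Bayesian surprise: KL divergence from prior to posterior.
eps = 1e-12  # guard against log(0) for ruled-out candidates
bayesian_surprise = np.sum(posterior * np.log2((posterior + eps) / (prior + eps)))

print(f"surprisal of 'a':  {surprisal:.3f} bits")
print(f"Bayesian surprise: {bayesian_surprise:.3f} bits")
```

Under these assumptions both values equal log2(1/0.6), roughly 0.74 bits; with a noisier form likelihood the two indices diverge, but remain closely related.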
When semantic information is activated by a context prior to new bottom-up input (i.e. when a word is predicted), semantic processing of that incoming word is typically facilitated, attenuating the amplitude of the N400 event-related potential (ERP) – a direct neural measure of semantic processing. N400 modulation is observed even when the context is a single semantically related “prime” word. This so-called “N400 semantic priming effect” is sensitive to the probability of encountering a related prime-target pair within an experimental block, suggesting that participants may be adapting the strength of their predictions to the predictive validity of their broader experimental environment. We formalize this adaptation using a Bayesian learning model that estimates and updates the probability of encountering a related versus an unrelated prime-target pair on each successive trial. We find that our model’s trial-by-trial estimates of target word probability account for significant variance in the amplitude of the N400 evoked by target words. These findings suggest that Bayesian principles contribute to how comprehenders adapt their semantic predictions to the statistical structure of their broader environment, with implications for the functional significance of the N400 component and the predictive nature of language processing.
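A minimal sketch of such a trial-by-trial learner is given below, assuming a Beta-Bernoulli formulation with a flat Beta(1, 1) prior; the published model and its parameterization may differ. Before each trial, the learner’s current belief yields a predictive probability that the upcoming pair is related; the observed outcome then updates that belief.

```python
import numpy as np

def trial_estimates(outcomes, a0=1.0, b0=1.0):
    """Beta-Bernoulli learner: before each trial, record the current
    predictive probability that the upcoming prime-target pair is
    related, then update the Beta(a, b) belief with the outcome."""
    a, b = a0, b0
    estimates = []
    for related in outcomes:
        estimates.append(a / (a + b))  # predictive p(related) for this trial
        if related:
            a += 1.0  # one more related pair observed
        else:
            b += 1.0  # one more unrelated pair observed
    return np.array(estimates)

# Example: a simulated block in which 80% of pairs are related.
rng = np.random.default_rng(0)
outcomes = rng.random(40) < 0.8
print(trial_estimates(outcomes)[:10].round(3))
```

The resulting per-trial estimates can then serve as a single-trial regressor against N400 amplitudes, which is the analytic logic described in the abstract above.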
The N400 event-related brain potential is elicited by each word in a sentence and offers an important window into the mechanisms of real-time language comprehension. Since the 1980s, studies investigating the N400 have expanded our understanding of how bottom-up linguistic inputs interact with top-down contextual constraints. More recently, a growing body of computational modeling research has aimed to formalize theoretical accounts of the N400 to better understand the neural and functional basis of this component. Here, we provide a comprehensive review of this literature. We discuss “word-level” models that focus on the N400’s sensitivity to lexical factors and simple priming manipulations, as well as more recent “sentence-level” models that explain its sensitivity to broader context. We discuss each model’s insights and limitations in relation to a set of cognitive and biological constraints that have informed our understanding of language comprehension and the N400 over the past few decades. We then review a novel computational model of the N400 that is based on the principles of predictive coding and can accurately simulate both word-level and sentence-level phenomena. In this predictive coding account, the N400 is conceptualized as the magnitude of lexico-semantic prediction error produced by incoming words during the process of inferring their meaning. Finally, we highlight important directions for future research, including a discussion of how these computational models can be expanded to explain language-related ERP effects outside the N400 time window, as well as variation in N400 modulation across different populations.
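The core idea of that account can be illustrated with a deliberately simplified, one-layer sketch (hypothetical semantic feature vectors; the published model involves a multi-level hierarchical architecture). The inferred meaning starts at the top-down lexico-semantic prediction and is iteratively revised toward the bottom-up input; the error accumulated during settling stands in for the N400.

```python
import numpy as np

def lexico_semantic_prediction_error(predicted, observed, steps=20, lr=0.2):
    """One-layer predictive-coding sketch: the state is initialized to
    the top-down prediction and iteratively revised to reduce its error
    against the bottom-up semantic input. Returns the total (summed
    absolute) prediction error over settling, an N400 stand-in."""
    state = predicted.copy()
    total_error = 0.0
    for _ in range(steps):
        error = observed - state   # bottom-up prediction error
        total_error += np.sum(np.abs(error))
        state += lr * error        # revise the inferred meaning
    return total_error

# Hypothetical semantic feature vectors: a strongly predicted word yields
# a small residual error; an unpredicted word yields a large one.
context_prediction = np.array([1.0, 0.8, 0.0, 0.1])
expected_word      = np.array([1.0, 0.9, 0.0, 0.0])
unexpected_word    = np.array([0.0, 0.1, 1.0, 0.9])

print("expected:  ", round(lexico_semantic_prediction_error(context_prediction, expected_word), 3))
print("unexpected:", round(lexico_semantic_prediction_error(context_prediction, unexpected_word), 3))
```

In this toy setting the unexpected word produces an error roughly an order of magnitude larger than the expected one, mirroring the qualitative N400 contrast the predictive coding account is designed to capture.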