Surprising production: Developing probabilistic production models to enable the prediction of reading times based on readers' expectations in situated language comprehension

Research Seed Capital (RiSC) project, Baden-Württemberg Ministry of Science (MWK-BW) & young scientists funding excellence initiative, match RiSC, Federal Ministry of Education and Research (BMBF) as part of the Excellence Strategy of the German Federal and State Governments, University of Tübingen

August 2020 – December 2023

Summary

The proposed project combines two influential lines of research from computational psycholinguistics and experimental pragmatics in order to develop a processing model of the semantico-pragmatic interpretation of contextually embedded language. These two lines are expectation-based models of human language processing on the one hand, and rational models of efficient and effective communication on the other. The project investigates the quantitative relation between processing difficulty during reading and the production probabilities of sentence continuations in visual contexts. The theoretical link between these two quantities is the information content of a sentence continuation in context, that is, the negative logarithm of its probability in that context. The project goals will be pursued in three steps. In the first step, formal models of production probabilities will be developed and trained on empirical data. These models predict the probability with which speakers choose a particular sentence continuation to describe a given context. Models of this kind will be developed for three carefully chosen test cases. In the second step, the test case whose model fits the empirical data best will be used in a pilot reading time study that measures the time needed to read each region of a sentence. In this study, the sentences will be presented after the same type of visual context that was used to train the production models. The reading times obtained will then be related to the predicted production probabilities in order to investigate the relationship between the two quantities. In the third step, these conclusions will be refined in a further study that manipulates both the ordering and the production probabilities of the individual words in a sentence.
This third step is risky because it depends, among other things, on the successful completion of the first two. At the same time, it promises to be of special theoretical significance, as it may be one of the first steps towards a formal expectation-based model of incremental, word-by-word semantico-pragmatic interpretation. The results of all the studies described will lay the foundation for a proposal for third-party funding and for a publication in a renowned scientific journal, both to be written in the final two phases of the project. The planned proposal is intended to continue and refine the approach described here.
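The link between production probabilities and information content mentioned in the summary can be illustrated with a minimal sketch. It is not the project's model: it simply estimates production probabilities as relative frequencies of the continuations that speakers produced for one context and converts them to information content in bits. The function names and the example continuations are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def production_probabilities(productions):
    """Estimate production probabilities as relative frequencies.

    `productions` is a list of sentence continuations produced by
    speakers for one visual context (hypothetical toy data here).
    """
    counts = Counter(productions)
    total = sum(counts.values())
    return {continuation: n / total for continuation, n in counts.items()}


def information_content(probability):
    """Information content (surprisal) in bits: -log2 of the probability."""
    return -math.log2(probability)


# Hypothetical continuations collected for a single visual context.
productions = [
    "the red ball",
    "the red ball",
    "the blue ball",
    "the green cube",
]

for continuation, p in production_probabilities(productions).items():
    bits = information_content(p)
    print(f"{continuation}: p={p:.2f}, information content={bits:.2f} bits")
```

In this toy data set, the continuation produced by half of the speakers carries 1 bit of information, while each continuation produced by a quarter of the speakers carries 2 bits. Expectation-based accounts would lead one to expect longer reading times for the less probable continuations; how exactly reading times relate to these values is the empirical question the project addresses.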

People

Output

Writing

Master's theses

Talk

Poster