On June 1st 2021, GdR LIFT is organising a one-day online seminar on the interactions between formal and computational linguistics.
In particular, the seminar will discuss the role of symbolic methods in natural language processing systems and the contribution of computational methods to theoretical linguistics.
The seminar aims to bring together members of diverse scientific communities from around the world to share their different perspectives.
Attendance is free, and the seminar will be held on Zoom and Gather.Town.
Please register as soon as possible using the following form so we can send you your login information: [here]
The times are in the Central European Summer Time zone (UTC+2).
- 10:30-11:30 Juan Luis Gastaldi, ETH Zürich: [tba]
- 11:30-12:30 Koji Mineshima, Keio University (18:30-19:30 UTC+9): [tba]
- 12:30-14:00 Lunch break & meetup on Gather.Town
- 14:00-15:00 Maud Pironneau, Druide informatique (8:00-9:00 UTC-4): “Once Upon a Time, Linguists, Computer Scientists and Disruptive Technologies” [abstract]
- 15:00-16:00 Marie-Catherine de Marneffe, Ohio State University (9:00-10:00 UTC-4): [tba]
- 16:00-16:30 Meetup on Gather.Town
- 16:30-17:30 Jacob Andreas, MIT (10:30-11:30 UTC-4): “Language models as world models” [abstract]
- 17:30-18:30 Olga Zamaraeva, University of Washington (8:30-9:30 UTC-7): “Assembling Syntax: Modeling wh-Questions in a Grammar Engineering Framework” [abstract]
- 18:30-19:30 Meetup on Gather.Town
- Jacob Andreas (MIT, MA, USA)
Title: Language models as world models
Abstract: Neural language models, which place probability distributions over sequences of words, produce vector representations of words and sentences that are useful for language processing tasks as diverse as machine translation, question answering, and image captioning. These models’ usefulness is partially explained by the fact that their representations robustly encode lexical and syntactic information. But the extent to which language model training also induces representations of *meaning* remains a topic of ongoing debate. I will describe recent work showing that language models—trained on text alone, without any kind of grounded supervision—build structured meaning representations that are used to simulate entities and situations as they evolve over the course of a discourse. These representations can be linearly decoded into logical representations of world state (e.g. discourse representation structures). They can also be directly manipulated to produce predictable changes in generated output. Together, these results suggest that (some) highly structured aspects of meaning can be recovered by relatively unstructured models trained on corpus data.
- Juan Luis Gastaldi (ETH Zürich, Switzerland)
- Marie-Catherine de Marneffe (Ohio State University, OH, USA)
- Koji Mineshima (Keio University, Tokyo, Japan)
- Maud Pironneau (Druide informatique, Québec, Canada)
Title: Once Upon a Time, Linguists, Computer Scientists and Disruptive Technologies
Abstract: At Druide informatique, we have been devising writing assistance software for over 25 years. We create text correctors, dictionaries, and writing guides for everyone and every type of written document, available first in French and more recently in English. As of 2021, more than 1 million people use Antidote, our flagship product. Consequently, we possess extensive experience in language technologies and we know how to make linguists and computer scientists work together. This knowledge can be seen as both historical and paradigm-shifting: historical in that Antidote for French was created back in 1993, at that time using symbolic rules; paradigm-shifting through the use of disruptive technologies and applications for different languages. Add to this complexity constant societal evolution, a dash of language politics, rational or not, and an inherent linguistic conservatism: now you have a portrait of the important themes in our work. This presentation will cover our successes as well as our failures across this field of possibilities.
- Olga Zamaraeva (University of Washington, WA, USA)
Title: Assembling Syntax: Modeling wh-Questions in a Grammar Engineering Framework
Abstract: Studying syntactic structure is one of the ways to learn about the range of variation in human languages. But without computational aid, assembling the complex and fragmented hypotheses about different syntactic phenomena quickly becomes intractable. Fully explicit formalisms like HPSG allow us to encode our hypotheses about syntax and associated compositional semantics on the computer. We can then test these hypotheses rigorously, showing a clear area of their applicability, which can grow over time. In this talk, I will present my recent work on modeling the syntactic structure of constituent (wh-)questions for an HPSG-based grammar engineering framework called the Grammar Matrix. The Matrix includes implemented syntactic analyses which are automatically tested as a system on test suites from diverse languages. The framework helps speed up grammar development and is intended to make implemented grammar artifacts possible for many languages of the world, particularly for endangered languages. In computational linguistics, formalized syntactic representations produced by such grammars play a crucial role in creating annotations which are then used for evaluating NLP system performance. The grammars were also shown to be useful in applications such as grammar coaching, and advancing this line of research can contribute to educational and revitalization efforts.