LREC26 Tutorial

Abstract

Understanding and representing meaning has long been a central challenge in Natural Language Processing (NLP). This tutorial offers a comprehensive overview of semantic representation, from the logical foundations of formal semantics to modern graph-based and hybrid frameworks, and explores how each of these approaches captures meaning.

Introduction

Understanding meaning lies at the core of NLP. Since the 1960s, semantic representation has evolved along two main lines: formal, logic-based models and graphical or data-driven approaches.

Logic-based approaches emerged from the desire to define semantics as model-theoretic truth conditions. Their lineage culminates in Richard Montague’s work, which showed how natural language could be mapped into formal logic using compositional rules. These approaches offer precision and compositionality but require significant formal expertise.
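As a minimal sketch of what compositional, model-theoretic interpretation looks like, the following Python fragment evaluates a few sentences against a toy model. Everything in it (the domain, the predicate extensions, the lexicon, and the helper names `interpret_iv` and `interpret_tv`) is an illustrative assumption, not Montague's actual fragment: words denote entities or curried functions, and function application is the only composition rule.

```python
# Toy model: a domain of entities and the extension of each predicate.
# All names and facts below are hypothetical, chosen only for illustration.
DOMAIN = {"john", "mary"}
SLEEPS = {"john"}               # entities that sleep in this model
LOVES = {("john", "mary")}      # (lover, loved) pairs

# Lexicon: proper names denote entities, verbs denote (curried) functions,
# and "everyone" denotes a generalized quantifier over the domain.
LEXICON = {
    "John": "john",
    "Mary": "mary",
    "sleeps": lambda x: x in SLEEPS,
    "loves": lambda obj: lambda subj: (subj, obj) in LOVES,
    "everyone": lambda pred: all(pred(x) for x in DOMAIN),
}

def combine(subject, predicate):
    """Combine a subject with a predicate by function application.

    A quantified subject takes the predicate as its argument;
    an entity-denoting subject is the argument of the predicate.
    """
    if callable(subject):
        return subject(predicate)
    return predicate(subject)

def interpret_iv(subj, verb):
    """Truth value of 'SUBJ VERB' for an intransitive verb."""
    return combine(LEXICON[subj], LEXICON[verb])

def interpret_tv(subj, verb, obj):
    """Truth value of 'SUBJ VERB OBJ' for a transitive verb."""
    verb_phrase = LEXICON[verb](LEXICON[obj])   # the verb applies to its object first
    return combine(LEXICON[subj], verb_phrase)

print(interpret_iv("John", "sleeps"))           # True in this model
print(interpret_iv("everyone", "sleeps"))       # False: Mary does not sleep
print(interpret_tv("John", "loves", "Mary"))    # True in this model
```

The truth value of each sentence is computed only from the denotations of its parts and the way they are combined, which is the sense of compositionality at stake here.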

At the same time, graphical representations — such as John Sowa’s conceptual graphs — sought to make semantics more accessible, using nodes and edges to depict conceptual relations. While intuitive, these formalisms often struggle with expressing complex semantic phenomena such as quantification, scope, negation, and implicature.

In parallel, quantitative paradigms emerged, especially distributional semantics, which represents words and sentences as high-dimensional vectors. Such models capture similarity but fail to express structured phenomena. Large language models (LLMs) have further complicated the picture: they encode semantic regularities, but in opaque, continuous spaces.
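The contrast between similarity and structure can be sketched with hand-made vectors. The three-dimensional values below are invented for illustration and come from no trained model, and averaging word vectors is only one simple (assumed) way to build a sentence vector. Cosine similarity separates related from unrelated words, but the averaged representation cannot tell who bites whom.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for this example.
VECTORS = {
    "dog":   [0.9, 0.1, 0.0],
    "cat":   [0.8, 0.2, 0.1],
    "man":   [0.1, 0.9, 0.0],
    "bites": [0.0, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def sentence_vector(words):
    """Average the word vectors; word order is discarded by construction."""
    dims = len(VECTORS[words[0]])
    # math.fsum is exactly rounded, so the result does not depend on word order.
    return [math.fsum(VECTORS[w][i] for w in words) / len(words) for i in range(dims)]

# Similar words are close, dissimilar words are not.
print(cosine(VECTORS["dog"], VECTORS["cat"]) > cosine(VECTORS["dog"], VECTORS["bites"]))

# Two sentences with opposite meanings receive the same representation.
print(sentence_vector(["dog", "bites", "man"]) == sentence_vector(["man", "bites", "dog"]))
```

Both comparisons print `True`: the vectors capture that "dog" is closer to "cat" than to "bites", yet the predicate-argument structure of the sentence is lost.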

This tutorial revisits this rich landscape and introduces hybrid frameworks such as DRT, SDRT, RST, AMR, UMR, and YARN, which aim to balance expressivity, accessibility, and computational tractability. It also discusses how these representations inform model development and evaluation, and how the linearisation of graph structures remains an open challenge.
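To make the linearisation issue concrete, here is a small sketch that serialises an AMR-style graph for "The boy wants to go" into a bracketed, PENMAN-like string. The function `linearise` and the data layout are assumptions made for this example; it is a simplified depth-first traversal, not the official PENMAN tooling, and the concept labels only follow AMR conventions illustratively.

```python
# An AMR-style graph: variables mapped to concepts, plus labelled edges.
# The concept labels are illustrative and follow AMR naming conventions loosely.
INSTANCES = {"w": "want-01", "b": "boy", "g": "go-02"}
EDGES = [
    ("w", ":ARG0", "b"),   # the boy is the wanter
    ("w", ":ARG1", "g"),   # what is wanted is the going
    ("g", ":ARG0", "b"),   # the boy is also the goer (re-entrancy)
]

def linearise(root, instances, edges):
    """Depth-first serialisation into a bracketed, PENMAN-like string.

    A node is expanded the first time it is reached; later mentions
    only repeat its variable, which is how re-entrancy is encoded.
    """
    visited = set()

    def visit(var):
        if var in visited:
            return var
        visited.add(var)
        parts = [f"{var} / {instances[var]}"]
        for source, role, target in edges:
            if source == var:
                parts.append(f"{role} {visit(target)}")
        return "(" + " ".join(parts) + ")"

    return visit(root)

print(linearise("w", INSTANCES, EDGES))
# (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-02 :ARG0 b))

# The same graph, with its edges listed in another order, yields another string.
print(linearise("w", INSTANCES, list(reversed(EDGES))))
# (w / want-01 :ARG1 (g / go-02 :ARG0 (b / boy)) :ARG0 b)
```

The two outputs describe exactly the same graph, which illustrates one reason linearisation is delicate: a sequence model sees different strings for identical meanings unless a canonical ordering is imposed.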

Resources

Outline

Part 1: What Is Semantics? (Or What Could It Be?)

This introduction outlines what semantics means from theoretical and computational perspectives. It sets the stage for understanding how meaning is represented, combined, and interpreted, from traditional linguistic theories to modern data-driven approaches. We will explore how formal models, lexical resources, and LLMs each capture aspects of meaning, and what is at stake when these representations are used in NLP systems.

By contrasting symbolic and statistical perspectives, this session aims to highlight how semantics interacts with syntax, cognition, and pragmatics, and how these layers contribute to our understanding of meaning in context.

We will explore the key ideas and figures that shaped the field — from Frege’s compositionality to Montague’s integration of logic and syntax — and discuss how representational choices influence the balance between expressive power and computational feasibility.

Part 2: Graph-based and Intuitive Representations

This section presents visual and graph-based methods for representing meaning, making semantic structures clearer for humans and machines.
These representations bridge the gap between formal logic and intuitive understanding by depicting entities, relations, and events as interconnected nodes and edges.

We will explore how conceptual graphs, frame semantics, and semantic networks have shaped structured meaning representations, and discuss their limits in handling complex linguistic phenomena such as quantification, negation, and implicature.

The session will highlight the construction of multilingual annotated datasets, which involves collecting, aligning, and labeling data across languages to support cross-linguistic semantic analysis. It will also focus on ongoing efforts to bridge formal semantics and machine learning evaluation paradigms.

Technical Requirements

There are no specific technical requirements.

Reading List

  • Amblard M., Pogodalla S., Modeling the Dynamic Effects of Discourse: Principles and Frameworks (2014).
  • Partee B., Ter Meulen A., Wall R., Mathematical Methods in Linguistics (1990).
  • Jurafsky D., Martin J., Speech and Language Processing, 3rd edition (2024), Ch. 20–24.
  • Sowa J., Conceptual Structures: Information Processing in Mind and Machine (1984).
  • Abend O., Rappoport A., The state of the art in semantic representation (2017).
  • Asher N., Lascarides A., Logics of Conversation (2003).
  • Banarescu L., Bonial C., Cai S., Georgescu M., Griffitt K., Hermjakob U., Knight K., Koehn P., Palmer M., Schneider N., Abstract Meaning Representation for Sembanking (2013).
  • Bos J., A survey of computational semantics: Representation, inference and knowledge in wide-coverage text understanding (2011).
  • Kamp H., Reyle U., From Discourse to Logic (1993).
  • Pavlova S., Amblard M., Guillaume B., YARN is All You Knit (2024).

Presenters

Maxime Amblard is a Professor at the University of Lorraine and Director of the Master’s program in Natural Language Processing at IDMC. His research centers on formal and computational semantics, discourse representation, and the links between logic, language, and meaning construction. With extensive experience in modeling semantic phenomena and integrating formal representations into NLP systems, he contributes to advancing both theoretical and applied approaches to meaning.

His expertise covers theoretical linguistics, computational modeling, and the creation of resources for semantic annotation and interpretation. Author of numerous works on dynamic semantics, discourse structure, and logical models of language understanding, he actively teaches and supervises students in France and abroad. He also engages in interdisciplinary collaborations connecting linguistics, computer science, and cognitive science, promoting wider access to formal semantics within the research community.

Bruno Guillaume is a researcher at Inria, the French National Institute for Computer Science. He specialises in modelling the syntax and semantics of natural language. He has developed graph rewriting-based approaches for the syntax-semantics interface and is the author of a book on the application of graph rewriting to natural language processing (NLP). He is the developer of tools for graph-based queries and transformations of linguistic structures. He is involved in international initiatives such as Universal Dependencies and UniDive.

Bruno Guillaume has taught several courses on corpus construction and lexical resources to international master’s students. He has also taught an ESSLLI course on treebanking and corpus exploration.