Mitigating One-Sided Biases in Natural Language Understanding Datasets

Speaker:
Yonatan Belinkov - CS-Lecture
Date:
Sunday, 16.6.2019, 10:30
Place:
Room 601 Taub Bld.
Affiliation:
Harvard School of Engineering and Applied Sciences (SEAS)

Many natural language understanding (NLU) tasks consist of identifying the relationship between two objects, such as a paragraph and a question (reading comprehension), an image and a question (visual question answering), or a premise and a hypothesis (natural language inference). These tasks supposedly require a deep understanding of the information in the two objects and inference of the relationship between them. However, recent work has demonstrated that many NLU datasets contain one-sided biases: artifacts that allow models to achieve non-trivial performance by considering only one of the two objects. For instance, in natural language inference (NLI), models trained only on the hypothesis significantly outperform majority baselines without ever learning whether a premise entails a hypothesis, and in visual question answering (VQA) many questions can be answered without looking at the image.

This state of affairs poses a significant challenge to the NLU community: how should we handle such biases? In this talk, I will present a strategy for training models that are more robust to such biases and transfer better across datasets. The key idea is to encourage the models not to ignore the other object in the relationship (such as the premise in NLI). In practice, this results in an adversarial game between two subnetworks, one learning the full task and one the one-sided task. I will demonstrate the effects of this approach in the context of NLI and VQA by analyzing the learned representations and evaluating the ability of the proposed models to generalize to other datasets.

This talk is meant to be an informal event and will be followed by meetings with interested students.

Short Bio:
============
Yonatan Belinkov is a Postdoctoral Fellow at the Harvard School of Engineering and Applied Sciences (SEAS) and the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). His research focuses on interpretability and robustness in neural network models for natural language processing. He received SM and PhD degrees from MIT in 2014 and 2018, respectively, and prior to that a BSc in Mathematics and an MA in Arabic Studies, both from Tel Aviv University. He will be joining the Department of Computer Science at the Technion in Fall 2020.
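The adversarial objective described in the abstract can be illustrated with a toy example. The code below is a minimal sketch, not the speaker's actual implementation: the function names, the scalar probabilities, and the weight `lam` are all illustrative assumptions. The idea is that the shared representation is trained to minimize the full-task loss while making life hard for the one-sided (e.g. hypothesis-only) branch, which can be expressed as subtracting the bias branch's loss from the combined objective:

```python
import math

def cross_entropy(p, y):
    # Negative log-likelihood of the correct label for a binary
    # classifier that outputs probability p for class 1.
    return -math.log(p if y == 1 else 1.0 - p)

def adversarial_loss(p_full, p_bias, y, lam=0.5):
    # Hypothetical combined objective for the shared encoder:
    # minimize the full-task loss (model sees both objects) while
    # *maximizing* the loss of the one-sided branch (model sees only
    # one object), weighted by lam:
    #     L = L_full - lam * L_bias
    return cross_entropy(p_full, y) - lam * cross_entropy(p_bias, y)

# When the one-sided branch is confident (low L_bias), the subtracted
# term is small and the total loss grows, pushing the encoder toward
# representations the one-sided model cannot exploit.
```

In a real system the two branches would be neural subnetworks trained jointly (for instance with a gradient-reversal layer), but the scalar version above captures the direction of the two loss terms.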
