Events

Events and Lectures at the Henry and Marilyn Taub Faculty of Computer Science

How Gender Debiasing in Natural Language Processing Models Affects Internal Model Representations, and Why It Matters
Hadas Orgad (M.Sc. Seminar Lecture)
Wednesday, 23.03.2022, 10:00
Zoom Lecture: 98412403331
Advisor: Dr. Yonatan Belinkov
Studies of gender bias in natural language processing (NLP) commonly focus either on extrinsic bias, which is measured by model performance on a specific task, or on intrinsic bias, which is measured on a model's internal representations. However, the relationship between extrinsic and intrinsic bias is relatively unknown. In this work, we illuminate this relationship by measuring both quantities together: we debias a model during downstream fine-tuning, which reduces extrinsic bias, and measure the effect on intrinsic bias using information-theoretic probing. Through experiments on two tasks and multiple bias metrics, we show that our intrinsic bias metric is a better indicator of debiasing than the standard metric, and that it can also expose cases of superficial debiasing. Our framework provides a comprehensive perspective on bias in NLP models, which can be applied to deploy NLP systems in a more informed manner.