Events
Events and lectures at the Henry and Marilyn Taub Faculty of Computer Science
Tuesday, 15.01.2019, 14:30
How can we effectively capture the information expressed in multiple texts? How can we
allow people, as well as computer applications, to easily explore it? The current
semantic NLP pipeline typically ends at the single sentence or text level, putting the
burden on applications to consolidate and present related information across multiple
texts. Further, semantic representations, which may provide the basis for text
consolidation, are often based on non-trivial schemata which require expert annotation,
making it a huge effort to create large scale corpora for training.
In this talk, I will outline a research program whose goals are to represent
consolidated information conveyed in multiple texts and to communicate it effectively
to users. This program builds upon three largely unexplored research lines. First, we aim
to establish a "natural" semantic representation for individual texts, which is based
solely on crowdsourceable natural language expressions rather than on pre-specified
schemata. To that end, we follow and extend the recent Question-Answer Semantic Role
Labeling (QA-SRL) approach, through which we decompose sentence information into
question-answer pairs, each representing an individual statement. Second, we are
developing approaches for consolidating information structures of different texts,
which requires a substantial extension of cross-text coreference detection. The goal is
to yield a consolidated structure that may be seen as an "open" analogue of
traditional knowledge graphs, representing real-world elements and statements relating
them. Third, we are developing a framework for interactive exploration of multi-text
information, while addressing the challenging task of systematic and replicable
evaluation of such interactive methods. I will provide an overview of the framework and
its three research lines and illustrate the different types of evolving research tasks.
Short Bio:
Ido Dagan holds B.Sc. (Summa Cum Laude) and Ph.D. degrees in Computer Science from the
Technion, Israel. He conducted his Ph.D. research in collaboration with the IBM Haifa
Scientific Center, where he was a research fellow in 1991. During 1992-1994 he was a
Member of Technical Staff at AT&T Bell Laboratories. During 1994-1998 he was at
the Department of Computer Science of Bar Ilan University, to which he returned in
2003. During 1998-2003 he was co-founder and CTO of a text categorization startup
company, FocusEngine, and VP of Technology at LingoMotors, a Cambridge, Massachusetts
company which acquired FocusEngine.