Skip to content (access key 's')
Logo of Technion
Logo of CS Department
Logo of CS4People
Events

The Taub Faculty of Computer Science Events and Talks

The Complexity of Relational Queries over Extractions from Text
event speaker icon
Liat Peterfreund (Ph.D. Thesis Seminar)
event date icon
Monday, 08.04.2019, 13:30
event location icon
Taub 601
event speaker icon
Advisor: Prof. Benny Kimelfeld
Information Extraction (IE) is the task of extracting structured information from textual data. We explore a programming paradigm that is supported by several IE systems where relations extracted by atomic extractors undergo a relational manipulation. In our efforts toward achieving a better understanding of IE systems, we study queries in the framework of document spanners that was introduced by Fagin et al. A spanner is a function that extracts from a document (string) a relation over text intervals, called spans, using either atomic extractors or a relational algebra query on top of these extractors. Evaluating IE queries is a computational problem that involves both the atomic extractors, the relational algebra expression and the document. We investigate the complexity of this problem from various angles, each keeping some components fixed and regarding the rest as input. In addition, we study the expressive power of IE queries that involve recursion ands the implications of supporting partial answers in the flavor of SPARQL.