The Taub Faculty of Computer Science Events and Talks
Neta Friedman (M.Sc. Thesis Seminar)
Thursday, 29.02.2024, 15:00
Advisor: Prof. Benny Kimelfeld
We study the problem of computing an embedding of the tuples of a relational database
in a manner that is extensible to dynamic changes of the database. In this problem,
the embedding should be stable in the sense that the embeddings of existing tuples
should not change when newly inserted tuples are embedded (as database applications might
already rely on existing embeddings); at the same time, the embedding of all tuples, old
and new, should retain high quality. This task is challenging since inter-dependencies
among the embeddings of different entities are inherent in state-of-the-art embedding
techniques for structured data.
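The stability requirement can be made concrete with a minimal sketch. It is not the method presented in the talk: the function names and the placement rule for new tuples (averaging the vectors of already-embedded tuples they reference) are assumptions chosen only to keep the example short. The point is the invariant that existing vectors are never modified.

```python
# Hypothetical illustration of the stability requirement; not the method from the talk.

def extend_embedding(embedding, new_tuples, neighbors):
    """Return an embedding that also covers new_tuples, leaving old vectors untouched.

    embedding:  dict mapping tuple id -> list of floats (already-trained vectors)
    new_tuples: iterable of ids of newly inserted tuples
    neighbors:  dict mapping new tuple id -> ids of already-embedded tuples it references
    """
    dim = len(next(iter(embedding.values())))
    # Copy the old vectors; they are treated as frozen.
    extended = {tid: list(vec) for tid, vec in embedding.items()}
    for tid in new_tuples:
        refs = [embedding[n] for n in neighbors.get(tid, []) if n in embedding]
        if refs:
            # Assumed placeholder rule: average the vectors of the referenced old tuples.
            extended[tid] = [sum(v[i] for v in refs) / len(refs) for i in range(dim)]
        else:
            # No known neighbors: fall back to the zero vector.
            extended[tid] = [0.0] * dim
    return extended


def is_stable(old, new):
    """True if every previously embedded tuple keeps exactly the same vector."""
    return all(new.get(tid) == vec for tid, vec in old.items())
```

A real method would replace the averaging rule with a learned one; the check in is_stable is what a downstream application relying on existing embeddings would need to hold.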
We study two approaches to solving the problem. The first is an adaptation of
Node2Vec to dynamic databases. The second is the FoRWaRD algorithm (Foreign
Key Random Walk Embeddings for Relational Databases), which draws on embedding
techniques for general graphs and knowledge graphs and inherently exploits the
schema and its key and foreign-key constraints. We evaluate the embedding algorithms
using a collection of downstream tasks of column prediction over geographical and
biological domains. We find that in the traditional static setting, our two embedding
methods achieve comparable results that are competitive with the state of the art for
the specific applications. In the dynamic setting, we find that the FoRWaRD algorithm
generally outperforms the alternatives and runs faster than them; moreover, it suffers
only a mild reduction in quality even when more than half of the database consists of
tuples inserted after the initial training of the embedding.
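To give a sense of what a random walk guided by foreign keys might look like, here is a minimal sketch over a toy database. It is not the FoRWaRD algorithm itself, whose details are not given in this abstract: the data, the DB and FOREIGN_KEYS structures, and both function names are hypothetical, and the sketch only shows walking between tuples along foreign-key references in both directions.

```python
import random

# Toy database: table name -> {primary key -> row}. Entirely hypothetical data.
DB = {
    "country": {"IL": {"name": "Israel"}},
    "city": {
        1: {"name": "Haifa", "country": "IL"},
        2: {"name": "Tel Aviv", "country": "IL"},
    },
}
# Foreign keys: (referencing table, column) -> referenced table.
FOREIGN_KEYS = {("city", "country"): "country"}


def fk_neighbors(table, key):
    """Tuples reachable from (table, key) by one foreign-key step, in either direction."""
    out = []
    row = DB[table][key]
    for (src, col), dst in FOREIGN_KEYS.items():
        if src == table and row.get(col) in DB[dst]:
            # Forward step: follow the reference held by this tuple.
            out.append((dst, row[col]))
        if dst == table:
            # Backward step: find the tuples that reference this one.
            out.extend((src, k) for k, r in DB[src].items() if r.get(col) == key)
    return out


def random_walk(start, length, rng):
    """Walk up to `length` foreign-key steps from `start`, a (table, key) pair."""
    walk = [start]
    for _ in range(length):
        nbrs = fk_neighbors(*walk[-1])
        if not nbrs:
            break  # isolated tuple: the walk ends early
        walk.append(rng.choice(nbrs))
    return walk
```

Calling random_walk(("city", 1), 4, random.Random(0)) alternates between city and country tuples. In graph-embedding pipelines of this kind, such walks typically serve as training sequences for the embedding model; how the walks are actually used in the talk's methods is not specified here.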