Tuesday, 11.8.2020, 11:30
Zoom Lecture: https://technion.zoom.us/j/99647504267
The performance of optical character recognition (OCR) systems has improved significantly in the deep learning era. This is especially true for handwritten text recognition (HTR), where each author has a unique style, unlike printed text, where the variation is smaller by design. That said, deep-learning-based HTR is limited, as in every other task, by the number of training examples. Gathering data is challenging and costly, and the labeling that follows, which is our focus here, is even more so. One possible approach to reducing the burden of data annotation is semi-supervised learning. In addition to labeled data, semi-supervised methods use unlabeled samples to improve performance over fully supervised ones. Consequently, such methods may adapt to unseen images at test time.
We present ScrabbleGAN, a semi-supervised approach to synthesizing handwritten text images that are versatile in both style and lexicon. ScrabbleGAN relies on a novel generative model that can generate images of words of arbitrary length. We show how to operate our approach in a semi-supervised manner, gaining the aforementioned benefits, such as a performance boost over state-of-the-art supervised HTR. Furthermore, our generator can manipulate the resulting text style. This allows us to change, for instance, whether the text is cursive or how thin the pen stroke is.
Sharon is an applied computer vision scientist at Amazon Web Services, working on the AWS Rekognition API. She holds an MSc degree in Electrical Engineering from Tel-Aviv University and a BSc in Physics from the Hebrew University. Prior to her MSc, Sharon served for 6 years in the Air Force as an Electronic Warfare researcher and team leader, after graduating from the Talpiot Program.