Technion, CS Department

Pixel Club: SCATTER: Selective Context Attentional Scene Text Recognizer
Speaker: Ron Litman (Amazon)
Date: Tuesday, 25.8.2020, 11:30
Location: Zoom lecture
Scene Text Recognition (STR), the task of recognizing text against complex image backgrounds, is an active area of research. Current state-of-the-art (SOTA) methods still struggle to recognize text written in arbitrary shapes. In this paper, we introduce a novel architecture for STR, named Selective Context ATtentional Text Recognizer (SCATTER). SCATTER uses a stacked block architecture with intermediate supervision during training, which paves the way to successfully training a deep BiLSTM encoder and thus improves the encoding of contextual dependencies. Decoding is done with a two-step 1D attention mechanism. The first attention step re-weights visual features from a CNN backbone together with contextual features computed by a BiLSTM layer. The second attention step, similar to previous papers, treats the features as a sequence and attends to the intra-sequence relationships. Experiments show that the proposed approach surpasses SOTA performance on irregular text recognition benchmarks by 3.7% on average.
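
The two-step 1D attention described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names (`selective_gate`, `sequence_attention`), the sigmoid gate in step one, the additive scoring in step two, and all dimensions are assumptions made for illustration, and random numbers stand in for the CNN and BiLSTM outputs.

```python
import math
import random


def matvec(x, W):
    """Multiply row vector x (length n) by matrix W (n rows, m columns)."""
    return [sum(xi * wi for xi, wi in zip(x, col)) for col in zip(*W)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def selective_gate(visual, contextual, W, b):
    """Step 1 (assumed form): concatenate visual and contextual features
    per position, then re-weight each channel with a sigmoid gate."""
    gated = []
    for v_t, h_t in zip(visual, contextual):
        d_t = v_t + h_t  # list concatenation: [visual ; contextual]
        logits = [z + b_j for z, b_j in zip(matvec(d_t, W), b)]
        gate = [1.0 / (1.0 + math.exp(-z)) for z in logits]
        gated.append([d * g for d, g in zip(d_t, gate)])
    return gated


def sequence_attention(features, query, Wf, Wq, v):
    """Step 2 (assumed form): additive 1D attention over the sequence
    for a single decoding step; returns the glimpse and the weights."""
    q_proj = matvec(query, Wq)
    scores = []
    for f_t in features:
        hidden = [math.tanh(a + q) for a, q in zip(matvec(f_t, Wf), q_proj)]
        scores.append(sum(h * v_k for h, v_k in zip(hidden, v)))
    weights = softmax(scores)
    channels = len(features[0])
    glimpse = [sum(w * f_t[j] for w, f_t in zip(weights, features))
               for j in range(channels)]
    return glimpse, weights


def rand_matrix(rows, cols, rng):
    """Hypothetical helper: a rows x cols matrix of small random values."""
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
T, Cv, Ch, Dq, A = 10, 4, 3, 5, 6   # illustrative sizes only
C = Cv + Ch

visual = rand_matrix(T, Cv, rng)      # stand-in for CNN features
contextual = rand_matrix(T, Ch, rng)  # stand-in for BiLSTM features

gated = selective_gate(visual, contextual, rand_matrix(C, C, rng), [0.0] * C)
glimpse, weights = sequence_attention(
    gated,
    [rng.gauss(0.0, 0.5) for _ in range(Dq)],  # stand-in decoder state
    rand_matrix(C, A, rng),
    rand_matrix(Dq, A, rng),
    [rng.gauss(0.0, 0.5) for _ in range(A)],
)
```

In this sketch the first step decides, per position and per channel, how much of the combined visual and contextual signal to keep, while the second step decides which positions in the sequence matter for the character currently being decoded.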

Short Bio:
I am a computer vision researcher at Amazon, focusing on text recognition tasks (scene text, OCR and handwriting recognition). Before Amazon, I worked as a data scientist at a behavioral biometrics company. During my master's in Statistics at Tel Aviv University, my research focused on addressing survival analysis as a ranking problem and solving it using techniques from similarity learning. Link to the paper -