The Taub Faculty of Computer Science Events and Talks

Pixel Club: Viewpoint Estimation - Insights & Model
Gilad Divon (EE, Technion)
Wednesday, 11.04.2018, 11:30
EE Meyer Building 1061
This thesis addresses the problem of estimating the viewpoint of an object in a given image, where the objects belong to several known categories. Convolutional Neural Networks (CNNs) were recently applied to this problem, leading to large improvements in state-of-the-art results. Two major approaches have been pursued: a regression approach, which handles the continuous values of viewpoints naturally, and a classification approach, which discretizes the space of viewpoints. We follow the second approach and present five key insights that should be taken into consideration when designing a CNN that solves the problem. These insights concern all three components of any network: the architecture, the training data, and the loss function. Based on these insights, the thesis proposes a network in which (i) the architecture jointly solves detection, classification, and viewpoint estimation, using the most advanced CNN for performing the two former tasks; (ii) new types of data are added and trained on, in order to address the shortage of labeled data; specifically, we propose to utilize both flipped images and video clips; and (iii) a novel loss function is proposed, which takes into account both the geometry of the problem and the new types of data. Our network improves the state-of-the-art results for this problem on PASCAL3D+ by 9.8%. The influence of each component is rigorously analyzed.

*M.Sc. student under the supervision of Prof. Ayellet Tal.