Pixel Club: Multidimensional Image Representation and Processing Motivated by Human Vision

Speaker:
Shai Furman (Electrical Engineering, Technion)
Date:
Tuesday, 25.1.2011, 11:30
Location:
Room 1061, Meyer Building, Faculty of Electrical Engineering

A biological model of visual information representation is adopted. Images are represented accordingly in a multidimensional space that incorporates the well-investigated dimensions of intensity, color and spatio-temporal frequency. The model is extended to incorporate additional, less investigated dimensions such as curvature, size and depth (for example, from binocular disparity). Along these dimensions, and along others that are yet to be discovered, the human visual system (HVS) enhances and emphasizes important image attributes by adaptation and nonlinear filtering.

The nonlinear Automatic Gain Control (AGC) model of processing along the visual dimensions is presented together with its biological foundation. A biologically motivated artificial neural network (ANN) implementation is presented as an example. The model is analyzed for its signal-to-noise ratio (SNR) characteristics. Several inputs and responses are considered and implemented along the visual dimensions of curvature, size and depth. The results are compared with those of psychophysical experiments, exhibiting good reproduction of visual illusions. Finally, examples of applications of the AGC model in image processing and computer vision are presented. These include high dynamic range (HDR) images, enhanced edge detection and completion of curves interrupted by occlusion.

Implementing the generic neural AGC model along all visual dimensions constitutes a universal, parsimonious and unified model that proposes how our visual system processes visual information along its various dimensions, before the later stage of sequential “visual routines” is implemented. This approach may lead to the development of a metric for calculating the distance between images, and may facilitate the execution of important tasks such as recognition and classification.
