Events

Events and Talks at the Henry and Marilyn Taub Faculty of Computer Science

Speaker: Rémi Gribonval (INRIA)
Date: Tuesday, 19.10.2010, 14:30
Location: Room 337, Taub Building for Computer Science
Should penalized least squares regression be interpreted as Maximum A Posteriori estimation?

Penalized least squares regression is often used for signal denoising and inverse problems, and is commonly interpreted in a Bayesian framework as Maximum A Posteriori (MAP) estimation, the penalty function being the negative logarithm of the prior. For example, the widely used quadratic program (with an $\ell^1$ penalty) associated with the LASSO / Basis Pursuit Denoising is very often regarded as the MAP estimator under a Laplacian prior in the context of additive white Gaussian noise (AWGN) reduction. The objective of this talk is to highlight the fact that, while this is *one* possible Bayesian interpretation, there can be other equally acceptable Bayesian interpretations. Therefore, solving a penalized least squares regression problem with penalty $\phi(x)$ need not be interpreted as assuming a prior $C \cdot \exp(-\phi(x))$ and using the MAP estimator. In particular, I will show that for *any* prior $p_X(x)$, the conditional mean estimator can be interpreted as a MAP estimator with some prior $C \cdot \exp(-\phi(x))$. Vice versa, for *certain* penalties $\phi(x)$, the solution of the penalized least squares problem is indeed the *conditional mean*, with a certain prior $p_X(x)$. In general we have $p_X(x) \neq C \cdot \exp(-\phi(x))$.
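
For concreteness, here is the standard identification behind the common MAP reading, in the simplest denoising setting (a textbook derivation, not taken verbatim from the talk): with observations $y = x + n$, $n \sim \mathcal{N}(0, \sigma^2 I)$, and a prior $p_X(x) = C \cdot \exp(-\phi(x))$,
$$\hat{x}_{\mathrm{MAP}}(y) = \arg\max_x \, p_{X|Y}(x \mid y) = \arg\min_x \left[ \frac{1}{2\sigma^2} \|y - x\|^2 + \phi(x) \right],$$
which is exactly a penalized least squares problem with penalty $\phi$. The point of the talk is that reading this correspondence backwards, from a given penalty to a unique prior, is not justified.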

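A minimal numerical sketch of the gap between the two Bayesian readings, for a single scalar observation under a Laplace prior (my own illustrative example and parameter values, not the speaker's; requires numpy and scipy):

    import numpy as np
    from scipy.integrate import quad

    # Scalar model: y = x + n, with x ~ Laplace(0, b) and n ~ N(0, sigma^2).
    sigma, b = 1.0, 1.0          # noise std and Laplace scale (illustrative values)
    y = 2.0                      # a single noisy observation

    # MAP estimate under the Laplace prior: soft-thresholding at sigma^2 / b.
    x_map = np.sign(y) * max(abs(y) - sigma**2 / b, 0.0)

    # Conditional mean E[X | Y = y], computed by numerical quadrature.
    def unnormalized_posterior(x):
        return np.exp(-(y - x)**2 / (2 * sigma**2)) * np.exp(-abs(x) / b)

    num, _ = quad(lambda x: x * unnormalized_posterior(x), -50, 50)
    den, _ = quad(unnormalized_posterior, -50, 50)
    x_mmse = num / den

    # The two estimates differ: under the same Laplace prior, the MAP and the
    # conditional mean are not the same estimator.
    print(f"MAP: {x_map:.3f}   conditional mean: {x_mmse:.3f}")
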
If time allows, I will also discuss recent joint work with Volkan Cevher (EPFL) and Mike Davies (University of Edinburgh) on "compressible priors", in connection with sparse regularization in linear inverse problems such as compressed sensing. I will show in particular that Laplace-distributed vectors cannot be considered typically "compressible": they are not sufficiently well approximated by sparse vectors to be recovered from a low-dimensional random projection by, e.g., $\ell^1$ minimization. This may come as a surprise, considering that $\ell^1$ minimization is associated with MAP estimation under the Laplace prior!
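
A short numerical sketch of this compressibility claim (an assumed illustrative setup, not the speaker's experiments): the relative $\ell^2$ error of the best $k$-term approximation of an i.i.d. Laplace vector decays slowly with $k$, whereas a heavier-tailed i.i.d. Student-t vector, used here purely for contrast, is well approximated by its few largest entries.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    laplace_vec = rng.laplace(size=n)
    heavy_vec = rng.standard_t(df=1.5, size=n)   # heavier-tailed, for contrast

    def relative_kterm_error(x, k):
        # Relative ell_2 error of the best k-term (sparse) approximation of x.
        energy = np.sort(x**2)[::-1]
        return np.sqrt(energy[k:].sum() / energy.sum())

    for frac in (0.01, 0.05, 0.10):
        k = int(frac * n)
        print(f"k/n = {frac:4.2f}: Laplace {relative_kterm_error(laplace_vec, k):.2f}, "
              f"Student-t {relative_kterm_error(heavy_vec, k):.2f}")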