Events
Events and lectures at the Henry and Marilyn Taub Faculty of Computer Science
Abraham (Adi) Wyner - SPECIAL GUEST LECTURE - TECHNION MACHINE LEARNING SEMINAR
Monday, 13.11.2017, 12:30
There is a large literature explaining why AdaBoost is a successful
classifier. The literature on AdaBoost focuses on classifier margins and
boosting's interpretation as the optimization of an exponential
likelihood function. These existing explanations, however, have been
criticized as incomplete. A random forest is another popular
ensemble method for which there is substantially less explanation in the
literature. We introduce a novel perspective on AdaBoost and random
forests that proposes that the two algorithms work for essentially
similar reasons. While both classifiers achieve similar predictive
accuracy, random forests cannot be conceived as a direct optimization
procedure. Rather, a random forest is a self-averaging, interpolating
algorithm that fits the training data without error but is nevertheless
somewhat smooth. We show that AdaBoost has the same property. We
conjecture that both AdaBoost and random forests succeed because of
this mechanism. We provide a number of examples and some theoretical
justification to support this explanation. In the process, we question
the conventional wisdom that suggests that boosting algorithms for
classification require regularization or early stopping and should be
limited to low complexity classes of learners, such as decision stumps.
We conclude that boosting should be used like random forests: with large
decision trees and without direct regularization or early stopping.
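
The interpolation property mentioned above (fitting the training data without error when boosting is run with no early stopping) can be illustrated with a minimal sketch. This is not the speaker's code or experimental setup: the six-point data set, the function names, and the use of one-dimensional decision stumps are all assumptions made for brevity, whereas the abstract recommends large decision trees. The third and fourth points carry labels that break the simple left/right pattern, so no single stump classifies every point correctly.

```python
import math

# Hypothetical toy data: labels follow "x > 2.5" except at x = 2 and x = 3.
XS = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
YS = [-1, -1, 1, -1, 1, 1]


def stump_predict(threshold, sign, x):
    """Decision stump: predicts `sign` to the right of the threshold, `-sign` otherwise."""
    return sign if x > threshold else -sign


def best_stump(xs, ys, weights):
    """Exhaustively pick the stump with the lowest weighted training error."""
    points = sorted(set(xs))
    # Candidate thresholds: one below all points, then midpoints between neighbours.
    thresholds = [points[0] - 1.0] + [(a + b) / 2 for a, b in zip(points, points[1:])]
    best = None
    for threshold in thresholds:
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(threshold, sign, x) != y)
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best


def adaboost(xs, ys, rounds):
    """Plain AdaBoost with a fixed number of rounds and no early stopping."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, sign = best_stump(xs, ys, weights)
        err = min(max(err, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, threshold, sign))
        # Up-weight misclassified points, down-weight the rest, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(threshold, sign, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the weighted vote of all stumps."""
    score = sum(alpha * stump_predict(t, s, x) for alpha, t, s in ensemble)
    return 1 if score > 0 else -1


def training_error(ensemble, xs, ys):
    """Fraction of training points the ensemble misclassifies."""
    return sum(predict(ensemble, x) != y for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    for rounds in (1, 100):
        print(rounds, training_error(adaboost(XS, YS, rounds), XS, YS))
```

On this toy set a single stump necessarily leaves at least one point misclassified, while 100 rounds drive the training error to zero. The sketch only demonstrates interpolation on the training set; it says nothing about the smoothness or test accuracy that the talk is concerned with.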
Speaker (Brief) Bio:
Abraham (Adi) Wyner is Professor and Chair of the Undergrad Program in
Statistics at the University of Pennsylvania's Wharton School. Before
arriving at the University of Pennsylvania in 1999, he was Assistant
Professor of Statistics at the University of California, Berkeley. His
research is in machine learning, discrete time series, and information
theory, and in the application of statistics to environmental sciences,
neuroscience, and sports.