New York Academy of Sciences - Machine Learning Symposium
October 27th, 7 World Trade Center, 10 am - 4 pm
SPECIAL PROMOTION: Event FREE for Members & Non-Members
http://www.nyas.org/machinelearning
Steering Committee
* Corinna Cortes, Ph.D. , Google, Inc.
* Haym Hirsh, Ph.D. , Rutgers University
* Tony Jebara, Ph.D., Columbia University
* Michael L. Littman, Ph.D., Rutgers University
* Mehryar Mohri, Ph.D., Courant Institute of Mathematical Sciences
* David Waltz, Ph.D., Columbia University
This is the first symposium on Machine Learning at the New York Academy
of Sciences. The New York Academy of Sciences launched a new initiative
in the physical sciences and engineering areas in 2006 and the Machine
Learning Symposium is a key event planned for the fall.
The primary goal of this symposium is to build a community of scientists
in machine learning from the NYC area's academic, government, and
industrial institutions by convening and promoting the exchange of ideas
in a neutral setting.
Program
09:30 a.m. Poster set-up & Continental Breakfast
10:00 a.m. Opening Remarks
10:15 a.m. Manfred Warmuth, University of California - Santa Cruz
11:00 a.m. Lise Getoor, University of Maryland
11:45 p.m. Spotlight Graduate Student Talks
01:30 p.m. Robert Schapire, Princeton University
02:15 p.m. Michael Kearns, University of Pennsylvania
03:00 p.m. Vladimir Vapnik, NEC-Columbia (tentative)
03:45 p.m. Student Award Winner Announcement & Closing Remarks
Invited Speaker Abstracts
Manfred K. Warmuth, University of California - Santa Cruz -
"Leaving the Span"
When linear models are too simple then the following ``kernel trick'' is
commonly used: Expand the instances into a high-dimensional feature
space and use any algorithm whose linear weight vector in feature space
is a linear combination of the expanded instances. Linear models in
feature space are typically non-linear in the original space and
seemingly more powerful. Also dot products can still be computed
efficiently via the use of a kernel function. However we discuss a
simple sparse linear problem that is hard to learn with any algorithm
that uses a linear combination of the embedded training instances as its
weight vector, no matter what embedding is used. We show that these
algorithms are inherently limited by the fact that after seeing k
instances only a weight space of dimension k can be spanned.
Surprisingly the same problem can be efficiently learned using the
exponentiated gradient (EG) algorithm: Now the component-wise logarithms
of the weights are essentially a linear combination of the training
instances. This algorithm enforces ``additional constraints'' on the
weights (all must be non-negative and sum to one) and in some cases
these constraints alone force the rank of the weight space to grow as
fast as 2^k. (Joint work with S.V.N. Vishwanathan)
Lise Getoor, University of Maryland -
"SRL: Statistical Relational Learning"
A key challenge for machine learning is mining richly structured
datasets describing objects, their properties, and links among the
objects. We'd like to be able to learn models which can capture both the
underlying uncertainty and the logical relationships in the domain.
Links among the objects may demonstrate certain patterns, which can be
helpful for many practical inference tasks and are usually hard to
capture with traditional statistical models. Recently there has been a
surge of interest in this area, fueled largely by interest in web and
hypertext mining, but also by interest in mining social networks,
security and law enforcement data, biological networks and
epidemiological records. Statistical Relational Learning (SRL) is a
newly emerging research area which attempts to represent, reason and
learn in domains with complex relational and rich probabilistic
structure. In this talk, I'll begin with a short SRL overview. Then,
I'll describe some of my group's recent work, including our work on
entity resolution in relational domains. Joint work with students:
Indrajit Bhattacharya, Mustafa Bilgic, Louis Licamele and Prithviraj
Sen.
Robert Schapire, Princeton University -
"Maximum Entropy and Species Distribution Modeling"
Modeling the geographic distribution of a plant or animal species is a
critical problem in conservation biology: to save a threatened species,
one first needs to know where it prefers to live, and what its
requirements are for survival. From a machine-learning perspective, this
is an especially challenging problem in which the learner is presented
with no negative examples and often only a tiny number of positive
examples. In this talk, I will describe the application of
maximum-entropy methods to this problem, a set of decades-old techniques
that happen to fit the problem very cleanly and effectively. I will
describe a version of maxent that we have shown enjoys strong
theoretical performance guarantees that enable it to perform effectively
even with a very large number of features. I will also describe some
extensive experimental tests of the method. This talk includes joint
work with Steven Phillips, Miroslav Dudik and Robert Anderson.
Michael Kearns, University of Pennsylvania -
"Regret to the Best vs. Regret to the Average"
Beginning at least as early as the 1950s, the long and still growing
literature on no-regret learning establishes the following type of
result: On any sequence of T trials in which the predictions of K
experts are observed, it is possible to maintain a dynamically weighted
prediction whose cumulative regret to the best single expert in
hindsight (that is, after the full sequence has been revealed) is only
O(\sqrt(T)), with absolutely no statistical assumptions on the data. The
algorithms achieving such remarkable guarantees generally agressively
reward winning experts and punish losing experts by updating weights in
a multiplicative rather than additive manner, in contrast to more
"traditional" learning algorithms such as the Perceptron algorithm. Is
there any price paid for such agressive updating?
In this work we simultaneously analyze an algorithm's regret to the best
expert and its regret to a more modest benchmark --- the average regret
of the experts, which can be achieved with no learning at all by simply
leaving the weights uniform. It would be natural to hope for algorithms
that can simultaneously achieve regret O(\sqrt(T)) to the best and
little or no regret to the average, thus providing a kind of "safety
net" on top of the traditional bounds. We describe the following
striking sequence of results:
* Any algorithm that achieves only O(\sqrt(T)) cumulative regret
to the best expert must, in the worst case, suffer regret
Omega(\sqrt(T)) to the average.
* For a wide class of algorithms, including standard
multiplicative update rules, the product of the cumulative regret to the
best and the cumulative regret to the average is Omega(T), and this
bound can be achieved by multiplicative updates.
* We give a new algorithm that, for any small constant c > 0,
achieves regret O(T^{1/2 + c}) to the best expert while suffering only
constant regret to the average (that is, with no dependence on T). This
algorithm works in phases that gradually move from more agressive but
still additive updates to multiplicative updates.
This talk describes joint work with Eyal Even-Dar, Yishay Mansour, and
Jenn Wortman.
POSTER ABSTRACTS
Phaedra Agius, Barry N. Kreiswirth, Steve Naidich, Kristin Bennett; New
York University Medical Center -
"Learning Stapphlococcus Aureus types using the spa gene and novel
distance measures"
Stuart Andrews, Tony Jebara; Columbia University -
"Predicting the edges of a network"
Suhrid Balakrishnan, David Madigan; Rutgers University -
"Decision Trees for Functional Variables"
Hila Becker, Marta Arias; Columbia University -
"An online Learning System for the Prediction of Electricity
Distrubution Feeder Failures"
Charles Bergeron, Kristin Bennett, Curt Breneman; Resnsselaer
Polytechnic Institute -
"Minimum noise fractions and kernel partial least squares applied to
small biological datasets: a winning combination"
Wei Chu, Columbia University; Vikas Sindhwani, University of Chicago;
Zoubin Ghahramani, University of Cambridge; S. Sathiya Keerthi, Yahoo!
Research -
" Relational Learning with Guassian Processes"
David Hammond, Courant Institute of Mathematical Sciences -
"A machine learning approach to adaptive combination of image denoising
methods"
Andrew G. Howard, Tony Jebara; Columbia University -
"Isotonic Support Vector Machines"
Bert Huang, Tony Jebara; Columbia University -
"Loopy Belief Propagation for Bipartite Maximum Weight B-Matching"
Jeffrey C. Jackson, Dusquene University; Homin K. Lee, Rocco A.
Servedio, Andrew Wan; Columbia University -
"Learning random monotone DNF"
Minyoung Kim, Vladimir Pavlovic; Rutgers University -
"Discriminative Mixture Models"
Jon R. Landers, Anshul Kundaje, Christina Leslie, Chris H. Wiggins;
Columbia University -
"Reverse Engineering via Ranking"
Lev Reyzin, Yale University; Robert E. Schapire, Princeton University -
"How Boosting the Margin Can also Boost Classifier Complexity"
Pannagadatta K. Shivaswamy; Columbia University -
"Robust Classification and Regression for Missing Values"
Alexander L. Strehl, Lihong Li, Rutgers University; Eric Wiewiora,
University of California, San Diego; John Langford, Yahoo! Research;
Michael L. Littman, Rutgers University -
"PAC Model-Free Reinforcement Learning"
Thomas J. Walsh, Carlos Diuk Wasser; Rutgers University -
"Using Classifiers to Transfer Knowledge"
Log In
<https://www.nyas.org/_membership/login.asp?redirect_location=http://www
.nyas.org/events/eventDetail.asp?eventID%3D7587%26date%3D10%2F27%2F2006%
252010%3A00%3A00%2520AM> to add this event to your calendar.
_______________________________________________
Colloquium mailing list
Colloquium@cs.columbia.edu
https://lists.cs.columbia.edu/cucslists/listinfo/colloquium
This archive was generated by hypermail 2b30 : Tue Jun 09 2009 - 05:00:13 EDT