Machine Learning Seminar

Fall 2014

• Mehryar Mohri - NYU Courant
• Date: 09/16/14
• Title: Deep Boosting
• Abstract:
We present a new ensemble learning algorithm, DeepBoost, which can use as base classifiers a hypothesis set containing deep decision trees, or members of other rich or complex families, and succeed in achieving high accuracy without overfitting the data. The key to the success of the algorithm is a capacity-conscious criterion for the selection of the hypotheses. We give new data-dependent learning bounds for convex ensembles expressed in terms of the Rademacher complexities of the sub-families composing the base classifier set, and the mixture weights assigned to each sub-family. Our algorithm directly benefits from these guarantees since it seeks to minimize the corresponding learning bound. We give a full description of our algorithm, including the details of its derivation, and report the results of several experiments showing that its performance compares favorably to that of AdaBoost and Logistic Regression and their L_1-regularized variants.
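As a rough illustration of the capacity-conscious selection idea in the abstract, the sketch below scores each candidate hypothesis by its weighted training error plus a complexity penalty standing in for the Rademacher complexity of its sub-family. The function names, the penalty form, and the trade-off parameter `lam` are simplified assumptions, not the exact DeepBoost criterion.

```python
import numpy as np

def select_hypothesis(X, y, weights, hypotheses, complexities, lam=0.1):
    """Capacity-conscious selection sketch: pick the hypothesis whose
    weighted error plus a complexity penalty is smallest, so a deep
    tree is chosen only when its accuracy gain outweighs its capacity.
    Illustrative only, not the exact DeepBoost objective."""
    best, best_score = None, float("inf")
    for h, r in zip(hypotheses, complexities):
        err = np.sum(weights * (h(X) != y))   # weighted training error
        score = err + lam * r                  # penalize richer sub-families
        if score < best_score:
            best, best_score = h, score
    return best, best_score
```

On a toy sample, a deeper (higher-complexity) stump is still selected when its error advantage beats the penalty, which is the trade-off the abstract's learning bound formalizes.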
• Levent Sagun - NYU Courant
• Felix Yu - Columbia University
• Fall recess
• Date: 10/14/14
• Andres Munoz Medina - NYU Courant
• Date: 10/21/14 and 10/28/14
• Title: An introduction to Bandits
• Abstract:
I will present a tutorial on bandit algorithms. In the first session, I will discuss solutions to the original k-armed bandit problem. In the second session, I will talk about contextual bandits and their connections to supervised learning.
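For readers unfamiliar with the k-armed bandit setting mentioned above, here is a sketch of UCB1, one standard solution; it is offered as background, not necessarily one of the algorithms the tutorial will cover.

```python
import math
import random

def ucb1(pull, k, horizon, seed=0):
    """UCB1 for the k-armed bandit problem: play each arm once, then
    repeatedly pick the arm maximizing its empirical mean reward plus
    an exploration bonus sqrt(2 ln t / n_i)."""
    random.seed(seed)
    counts = [0] * k        # pulls per arm
    sums = [0.0] * k        # total reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1     # initialization: try every arm once
        else:
            arm = max(range(k),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        counts[arm] += 1
        sums[arm] += pull(arm)
    return counts
```

Run against two Bernoulli arms with a large gap in means, the algorithm concentrates its pulls on the better arm, which is the behavior the regret analysis quantifies.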
• Vitaly Kuznetsov - NYU Courant
• Giulia DeSalvo - NYU Courant
• Scott Yang - NYU Courant

Spring 2014

• Andres Munoz Medina - NYU Courant.
• Date: 02/04/14. Postponed: 02/11/14
• Title: Learning Theory and Algorithms for Revenue Optimization in Second-Price Auctions with Reserve.
• Abstract:
Second-price auctions with reserve play a critical role in the revenue of modern search engines and popular online sites, since the revenue of these companies often directly depends on the outcome of such auctions. The choice of the reserve price is the main mechanism through which the auction revenue can be influenced in these electronic markets. We cast the problem of selecting the reserve price to optimize revenue as a learning problem and present a full theoretical analysis dealing with the complex properties of the corresponding loss function. We further give novel algorithms for solving this problem and report the results of several experiments demonstrating their effectiveness.
• Slides
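To make the learning objective in the abstract above concrete: the revenue of a second-price auction with reserve is a simple but discontinuous function of the reserve price, which is what makes the corresponding loss hard to optimize. The sketch below states the mechanism and a naive grid-search baseline; both are illustrative, not the paper's algorithms.

```python
def revenue(bids, r):
    """Revenue of one second-price auction with reserve r: no sale if
    the top bid is below r; otherwise the winner pays the larger of
    the second bid and the reserve."""
    b = sorted(bids, reverse=True)
    b1 = b[0]
    b2 = b[1] if len(b) > 1 else 0.0
    if b1 < r:
        return 0.0              # reserve not met: zero revenue
    return max(b2, r)           # payment is max(second bid, reserve)

def empirical_best_reserve(auctions, candidates):
    """Pick the reserve maximizing total revenue on a sample of past
    auctions -- a brute-force stand-in for the learning problem the
    talk addresses."""
    return max(candidates,
               key=lambda r: sum(revenue(b, r) for b in auctions))
```

Note that `revenue` jumps discontinuously as r crosses each bid, so the empirical objective is piecewise and non-convex in r; this is the "complex property" of the loss the theoretical analysis must handle.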
• Liang Huang - CUNY
• Date: 02/18/14
• Title: Latent Variable Max-Violation Perceptron for Scalable Training in Machine Translation.
• Abstract:
While large-scale discriminative training has triumphed in many NLP problems, its definitive success on machine translation has been largely elusive. Most recent efforts along this line are not scalable (training on the small dev set with features from top ∼100 most frequent words) and overly complicated. We instead present a very simple yet theoretically motivated approach by extending the recent framework of “violation-fixing perceptron” to handle structured latent variables which correspond to MT derivations. Extensive phrase-based translation experiments on both Chinese-to-English and Spanish-to-English tasks show substantial gains in BLEU by up to +2.3/+2.0 on dev/test over MERT, thanks to 20M+ sparse features. This is the first successful effort of large-scale online discriminative training for MT.
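For context on the framework named in the abstract, the basic structured-perceptron update that violation-fixing generalizes can be sketched in a few lines; the sparse-dict feature representation and names here are assumptions for illustration, and the latent-derivation and max-violation machinery of the talk is deliberately omitted.

```python
def perceptron_update(w, feats_gold, feats_pred, lr=1.0):
    """Basic structured-perceptron update: move the weight vector
    toward the gold structure's features and away from the model's
    predicted structure's features. Features are sparse dicts
    mapping feature name -> count."""
    for f, v in feats_gold.items():
        w[f] = w.get(f, 0.0) + lr * v
    for f, v in feats_pred.items():
        w[f] = w.get(f, 0.0) - lr * v
    return w
```

Violation-fixing modifies where this update is applied during inexact (beam) search, updating at the point of maximal violation rather than at the full prediction.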
• Yoni Halpern - NYU.
• Date: 02/25/14
• Title: TBD.
• Date: 03/04/14.
• Title: Learning binary representations for fast similarity search in massive databases.
• Abstract:
Binary coding based Approximate Nearest Neighbor (ANN) search in huge databases has attracted much attention recently due to its fast query time and drastically reduced storage needs. There are several challenges in developing a good ANN search system. A fundamental question that comes up often is: how difficult is ANN search in a given dataset? In other words, which data properties affect the quality of ANN search and how? Moreover, for different application scenarios, different types of learning methods are appropriate. In this talk, I will discuss what makes ANN search difficult, and a variety of binary coding techniques for non-negative data, data that lives on a manifold, and matrix data.
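As a concrete instance of the binary coding idea in the abstract, the sketch below uses random-hyperplane (SimHash-style) codes with brute-force Hamming search; this is a common baseline, and the learned coding methods the talk surveys would replace the random projections. The function names are illustrative.

```python
import numpy as np

def make_encoder(dim, n_bits, seed=0):
    """Random-hyperplane binary encoder: each bit records which side
    of a random hyperplane a point falls on, so nearby points tend
    to share bits."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((dim, n_bits))
    return lambda X: (np.atleast_2d(X) @ W > 0).astype(np.uint8)

def hamming_nn(codes, query_code):
    """Index of the database item nearest to the query in Hamming
    distance (brute force, for clarity; real systems use bit tricks
    or multi-index hashing)."""
    d = np.count_nonzero(codes != query_code, axis=1)
    return int(np.argmin(d))
```

Because the codes are short bit strings, the database needs only n_bits/8 bytes per item and distances reduce to popcounts, which is the source of the fast query time and small storage noted in the abstract.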
• Maria Florina Balcan
• Date: 03/11/14.
• Title: Foundations for Learning in the Age of Big Data.
• Abstract:
With the variety of applications of machine learning across science, engineering, and computing in the age of Big Data, re-examining the underlying foundations of the field has become imperative. In this talk, I will describe new models and algorithms for important emerging paradigms, specifically, interactive learning and distributed learning. For active learning, where the algorithm itself can ask for labels of carefully chosen examples from a large pool of unannotated data with the goal of minimizing human labeling effort, I will present results giving computationally efficient, optimal label complexity algorithms. I will also discuss learning with more general forms of interaction, as well as unexpected implications of these results for classic supervised learning paradigms. For distributed learning, I will discuss a model that for the first time addresses the core question of what are the fundamental communication requirements for achieving accurate learning. Broadly, we consider a framework where massive amounts of data are distributed among several locations, and our goal is to learn a low-error predictor with respect to the overall distribution of data using as little communication, and as few rounds of interaction, as possible. We provide general upper and lower bounds on the amount of communication needed to learn a given class, as well as broadly-applicable techniques for achieving communication-efficient learning.
• Slides
• Spring Recess: No scheduled talk
• Date: 03/18/14.
• Vitaly Kuznetsov - NYU Courant.
• Date: 03/25/14.
• Title: TBD.
• Rachel Hodos - NYU Courant.
• Date: 04/15/14.
• Title: Metric Learning for Drug Repurposing.
• Abstract:
Sometimes standard distance (or similarity) metrics to compare data points are not well suited for a particular application. For example, while standard Euclidean distance treats each feature on equal footing, certain features may have more importance than others, e.g. in image classification. Metric learning is a field of machine learning that is aimed at learning more appropriate metrics for a given domain and task. I will give an introduction and review of metric learning, primarily based on the attached article by Kulis. I will focus on global, linear metric learning but will also discuss extensions to nonlinear and local metric learning techniques. I will cover mathematical formulations, as well as specific algorithms and applications.
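The global linear (Mahalanobis) metric mentioned in the abstract can be stated in a few lines: d_M(x, y) = sqrt((x - y)^T M (x - y)) for a positive semidefinite M, equivalent to Euclidean distance after the linear map L with M = L^T L. The matrix below is hand-picked to show feature reweighting; metric-learning algorithms would instead learn M from data.

```python
import numpy as np

def mahalanobis(x, y, M):
    """Distance under a PSD matrix M: sqrt((x - y)^T M (x - y)).
    With M = I this is plain Euclidean distance; a learned M can
    stretch important features and shrink unimportant ones."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(d @ M @ d))
```

For example, with M = diag(4, 0.25), a unit step along the first feature counts four times as much as one along the second, exactly the unequal footing the abstract describes for tasks like image classification.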
I'll describe a new algorithm for the contextual bandit learning problem, where the learner repeatedly takes an action in response to the observed context, observing the reward only for that action. The method assumes access to an oracle for solving cost-sensitive classification problems and achieves the statistically optimal regret guarantee with only $\tilde{O}(\sqrt{T})$ oracle calls across all $T$ rounds. Joint work with Alekh Agarwal, Satyen Kale, John Langford, Lihong Li, and Rob Schapire.