Pegasos

This week we will discuss the paper by Shai Shalev-Shwartz et al.:
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM

The paper presents an iterative algorithm for solving the optimization problem of Support Vector Machines (SVM). The authors show that the number of iterations needed to converge to a solution of accuracy ε is O(1/ε). In each iteration the algorithm takes a step based on a single sample or a small subset of samples (the mini-batch setting), so it has the flavor of stochastic gradient descent (in the single-sample case it is a stochastic sub-gradient descent algorithm). As a result, the convergence rate does not depend directly on the number of samples, which makes the algorithm attractive for large-scale problems.
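If you want a feel for the method before reading, here is a minimal sketch of the single-sample update in Python. The margin check, the 1/(λt) step size, and the optional projection follow my reading of the paper; the function name, hyperparameter values, and the uniform sampling loop are illustrative assumptions, not the authors' code.

```python
import numpy as np

def pegasos_sketch(X, y, lam=0.1, n_iters=1000, seed=0):
    """Rough sketch of single-sample Pegasos: stochastic sub-gradient descent
    on the SVM primal objective  (lam/2)*||w||^2 + average hinge loss.
    Assumes labels y are +1/-1."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.integers(n)            # pick one training sample uniformly at random
        eta = 1.0 / (lam * t)          # step size 1/(lambda * t)
        if y[i] * X[i].dot(w) < 1:     # margin violated: sub-gradient includes the data term
            w = (1.0 - eta * lam) * w + eta * y[i] * X[i]
        else:                          # margin satisfied: only the regularization term acts
            w = (1.0 - eta * lam) * w
        # optional projection onto the ball of radius 1/sqrt(lambda)
        norm = np.linalg.norm(w)
        if norm > 1.0 / np.sqrt(lam):
            w *= 1.0 / (np.sqrt(lam) * norm)
    return w
```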

If you don’t have much time this week, there is also a shorter ICML version available. I recommend the longer version: it presents a slight generalization of the algorithm and is generally less squeezed ;).

If you think you’ll be able to make it, please sign up.
Note that the discussion time slot has moved to Wednesday morning at 9:30.

Welcome!

Hi all,

With the opening of our new ML reading group, I’ve decided to start this blog to keep you up to date with the group’s upcoming discussions and to archive past ones.

Before each meetup, hopefully a full week in advance, I will post the subject for the reading group with a link to the reading material. If you think you can make it this week, please register. Walk-ins are also welcome, but registering will allow us to provide better hospitality (the right room size and snacks for all 🙂 ).

This site is also a place for pre- and post-meeting discussion of the papers, so comment away…
If you have any suggestions for future subjects, please comment on this post.

Best,
Maayan Harel

A unified framework for high-dimensional analysis

The subject of this week’s group will be the paper –
 “A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers”

Link to paper

Formulating an estimation problem as a regularized optimization problem is a useful and familiar idea in ML. The formulation combines two components: a loss function that measures how well the model fits the data, and a regularization function (examples of such regularized estimation problems include the Lasso, group Lasso, ridge regression, PCA, and matrix completion).
The authors present a general approach for establishing consistency and convergence rates for such regularized estimators in the high-dimensional setting,
that is, when the number of parameters is on the same scale as the number of samples!
The reading may require a bit of effort, but the diligent reader will be rewarded 🙂
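To fix notation ahead of the discussion, the generic estimator the paper analyzes can be written as below; the symbols are my own shorthand, and the Lasso instance is just one illustrative special case.

```latex
% Regularized M-estimator: empirical loss plus a weighted regularizer
\hat{\theta}_{\lambda_n} \in \arg\min_{\theta \in \mathbb{R}^p}
  \Big\{ \mathcal{L}_n(\theta) + \lambda_n \, \mathcal{R}(\theta) \Big\}

% Example (Lasso): squared loss with an l1 regularizer
\mathcal{L}_n(\theta) = \frac{1}{2n} \|y - X\theta\|_2^2,
\qquad
\mathcal{R}(\theta) = \|\theta\|_1
```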

Register for this week’s group