Dimension Reduction

Over the last few weeks we covered a few interesting subjects in dimensionality reduction. I would like to recap the papers and give some pointers for further reading.

In the first meeting we covered a comparative review, mostly of non-linear methods for dimensionality reduction (manifold learning) – http://www.iai.uni-bonn.de/~jz/dimensionality_reduction_a_comparative_review.pdf

Next, we dove into the details of one common manifold-learning method – Diffusion Maps. Papers: the original paper by Coifman (link) and an extension paper.
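For concreteness, here is a minimal sketch of the basic construction (Gaussian kernel, no density/alpha normalization – the papers cover further variants). The kernel scale `epsilon`, the diffusion time `t`, and the example data are illustrative assumptions, not values from the papers.

```python
# Minimal Diffusion Maps sketch: Gaussian kernel, no alpha-normalization.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def diffusion_map(X, n_components=2, epsilon=1.0, t=1):
    # Gaussian affinity between all pairs of points
    D2 = cdist(X, X, metric="sqeuclidean")
    K = np.exp(-D2 / epsilon)

    # Row sums give the degree matrix D; P = D^{-1} K is the Markov matrix.
    d = K.sum(axis=1)
    # Diagonalize the symmetric conjugate A = D^{-1/2} K D^{-1/2},
    # which has the same eigenvalues as P.
    A = K / np.sqrt(np.outer(d, d))
    eigvals, eigvecs = eigh(A)            # ascending order
    idx = np.argsort(eigvals)[::-1]       # sort descending
    eigvals, eigvecs = eigvals[idx], eigvecs[:, idx]

    # Convert back to right eigenvectors of P; column 0 is the trivial
    # constant eigenvector (eigenvalue 1), so the embedding starts at 1.
    psi = eigvecs / np.sqrt(d)[:, None]
    return (eigvals[1:n_components + 1] ** t) * psi[:, 1:n_components + 1]

# Illustrative usage: embed 200 random 3-D points into 2 diffusion coordinates
X = np.random.randn(200, 3)
Y = diffusion_map(X, n_components=2, epsilon=2.0)
print(Y.shape)  # (200, 2)
```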

Last, we talked about Random Projections. We looked at several different sources:

  • For some general theory see link. It gives the basic theorem and proofs, plus an application to fast computation of truncated PCA (which is highly relevant to the global dimensionality-reduction methods that use it to place points given a distance matrix).
  • For some practice see: “Random projection in dimensionality reduction: Applications to image and text data”. The paper shows how distances between images are better preserved by random projections than by PCA (note that PCA isn’t designed to preserve distances exactly). In particular it gives a feel for how these methods sometimes work well at much more reasonable dimensions than the basic theory predicts; see the small demonstration after this list.
  • A little more of both: “Experiments with Random Projection”, Dasgupta (Sections 3.2, 4.3). Shows a nice extra lemma about the eccentricity of Gaussians under random projections, and an application to simple classification of the MNIST handwritten digits data set.
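As a small demonstration of the distance-preservation claim, here is a minimal sketch of a plain Gaussian random projection; the data, the ambient dimension, and the target dimension k below are illustrative assumptions, not values taken from the papers.

```python
# Gaussian random projection: pairwise distances are approximately preserved.
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n, d, k = 500, 10_000, 200                     # n points in dimension d, projected to k

X = rng.standard_normal((n, d))
R = rng.standard_normal((d, k)) / np.sqrt(k)   # entries N(0, 1/k)
Y = X @ R                                      # projected data

orig = pdist(X)        # pairwise Euclidean distances before projection
proj = pdist(Y)        # and after
ratio = proj / orig

# For k on the Johnson-Lindenstrauss scale (~ log(n) / eps^2) the ratios
# concentrate around 1; in practice much smaller k often works well.
print(f"distance ratio: min={ratio.min():.3f}, max={ratio.max():.3f}")
```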

For some more reading on random projections, suggested by Odalric:

Random projections in ML:

  • Linear regression with Random Projections: http://jmlr.csail.mit.edu/papers/volume13/maillard12a/maillard12a.pdf
  • Random projection trees: Sanjoy Dasgupta and Yoav Freund. Random projection trees and low dimensional manifolds. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing, STOC ’08, pages 537–546, New York, NY, USA, 2008. ACM. (+ Dasgupta’s webpage)
  • Spectral clustering: Bin Zhao and Changshui Zhang. Compressed spectral clustering. In Proceedings of the 2009 IEEE International Conference on Data Mining Workshops, ICDMW ’09, pages 344–349, Washington, DC, USA, 2009. IEEE Computer Society.

For applications in signal processing:

  • Sparse recovery with Brownian sensing: http://books.nips.cc/papers/files/nips24/NIPS2011_1005.pdf
  • Compressive sensing: http://www-m15.ma.tum.de/foswiki/pub/M15/Allgemeines/OldNews/CSFornasierRauhut.pdf

More topics on random matrices, with some examples:

  • Random projection with sparse matrices: Dimitris Achlioptas. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. Journal of Computer and System Sciences, 66(4):671–687, June 2003. (A sketch of this construction appears after this list.)
  • The RIP property: R. Baraniuk, M. Davenport, R. DeVore, and M. Wakin. A simple proof of the restricted isometry property for random matrices. Constructive Approximation, 28(3):253–263, 2008.
  • Compressed Sensing and high-dimensional geometry: http://www-personal.umich.edu/~romanv/teaching/2012-13/709/CS-random-matrices.pdf
  • The notion of incoherence: E. Candes and J. Romberg. Sparsity and incoherence in compressive sampling. Inverse Problems, 23:969–985, 2007.
  • Singular values: http://djalil.chafai.net/Docs/sing.pdf
  • Fast Random projections (e.g.): https://web.math.princeton.edu/~amits/publications/LeanWalsh_published.pdf
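To make the sparse-matrix idea concrete, here is a minimal sketch of the Achlioptas “database-friendly” construction cited in the first item above: entries are sqrt(3) · {+1 w.p. 1/6, 0 w.p. 2/3, −1 w.p. 1/6}, so about two thirds of the matrix is zero and no Gaussian sampling is needed. The dimensions below are illustrative assumptions.

```python
# Achlioptas-style sparse random projection (binary-coin construction).
import numpy as np

rng = np.random.default_rng(0)
d, k = 10_000, 200

# Entries from {-1, 0, +1} with probabilities (1/6, 2/3, 1/6), scaled by
# sqrt(3) so each entry has unit variance and distances are preserved in
# expectation after dividing by sqrt(k).
R = rng.choice([-1.0, 0.0, 1.0], size=(d, k), p=[1/6, 2/3, 1/6]) * np.sqrt(3)

X = rng.standard_normal((50, d))
Y = (X @ R) / np.sqrt(k)      # sparse random projection of the data
print(Y.shape)                # (50, 200)
```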