Written by: Stephen Hsu
Primary Source: Information Processing
Donoho was a pioneer in thinking about sparsity in high dimensional statistical problems. The motivation for this came from real world problems in geosciences (oil exploration), encountered in Texas when he was still a student. Geophysicists were using Compressed Sensing long before the rigorous mathematical basis was established.
The figure below, from the earlier post Compressed Sensing and Genomes, exhibits the Donoho-Tanner phase transition.
For more discussion of our recent paper The human genome as a compressed sensor, see this blog post by my collaborator Carson Chow and another on the machine learning blog Nuit Blanche. One of our main points in the paper is that the phase transition between the regimes of poor and good recovery of the L1 penalized algorithm (LASSO) is readily detectable, and that the scaling behavior of the phase boundary allows theoretical estimates for the necessary amount of data required for good performance at a given sparsity. Apparently, this reasoning has appeared before in the compressed sensing literature, and has been used to optimize hardware designs for sensors. In our case, the sensor is the human genome, and its statistical properties are fixed. Fortunately, we find that genotype matrices are in the same universality class as random matrices, which are good compressed sensors.
The black line in the figure below is the theoretical prediction (Donoho 2006) for the location of the phase boundary. The shading shows results from our simulations. The scale on the right is L2 (norm squared) error in the recovered effects vector compared to the actual effects.
From Donoho’s autobiographical sketch, provided for the Shaw Prize:
During 2004-2010, Jared Tanner and I discovered the precise tradeoff between sparsity and undersampling, showing when L1-minimization can work successfully with random measurements. Our work developed the combinatorial geometry of sparse solutions to underdetermined systems, a beautiful subject involving random high-dimensional polytopes. What my whole life I thought of privately as ‘non-classical’ mathematics was absorbed into classical high-dimensional convex geometry. [ Discussed at ~ 1:38 in the video. ]
More about John Tukey, Donoho’s undergraduate advisor at Princeton.