David Donoho interview at HKUST

A long interview with Stanford professor David Donoho (academic web page) at the IAS at HKUST. Donoho was a pioneer in thinking about sparsity in high dimensional statistical problems. The motivation for this came from real world problems in geosciences (oil exploration), encountered in Texas when he was still a student. Geophysicists were using Compressed …

Regression Via Pseudoinverse

In my last post (OLS Oddities), I mentioned that OLS linear regression could be done with multicollinear data using the Moore-Penrose pseudoinverse. I want to tidy up one small loose end. Specifically, let X be the matrix of predictor observations (including a column of ones if a constant term is desired), let y be a vector of …
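
As a rough illustration of the idea in this excerpt, the sketch below fits OLS on deliberately multicollinear data with the Moore-Penrose pseudoinverse. It is not the post's own code: it assumes NumPy is available, and the names `X`, `y`, `beta_pinv`, the simulated data, and the `rcond` cutoff are all made up for the example.

```python
import numpy as np

# Hypothetical data: the fourth column is the sum of the second and third,
# so X has four columns but only rank 3 (perfect multicollinearity).
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2, x1 + x2])
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(scale=0.1, size=n)

# X'X is singular here, so the usual normal-equations solve is unavailable.
# The pseudoinverse still yields a least-squares solution (the one with
# the smallest Euclidean norm). rcond is an assumed cutoff for treating
# tiny singular values as zero.
beta_pinv = np.linalg.pinv(X, rcond=1e-10) @ y
fitted_pinv = X @ beta_pinv

# Reference fit: drop the redundant column and solve the normal equations.
X_full = X[:, :3]
beta_full = np.linalg.solve(X_full.T @ X_full, X_full.T @ y)
fitted_full = X_full @ beta_full

print("rank of X:", np.linalg.matrix_rank(X))
print("fitted values agree:", np.allclose(fitted_pinv, fitted_full))
```

The coefficient vectors differ between the two fits, because the collinear model has infinitely many least-squares solutions, but the fitted values should coincide since both project y onto the same column space.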

OLS Oddities

During a couple of the lectures in the Machine Learning MOOC offered by Prof. Andrew Ng of Stanford University, I came across two statements about ordinary least squares linear regression (henceforth OLS) that surprised me. Given that I taught regression for years, I was surprised that I could be surprised (meta-surprised?), but these two facts …

Producing Reproducible R Code

A tip in the Google+ Statistics and R community led me to the reprex package for R. Quoting the author (Professor Jennifer Bryan, University of British Columbia), the purpose of reprex is to [r]ender reproducible example code to Markdown suitable for use in code-oriented websites, such as StackOverflow.com or GitHub. Much has been written about …

Expert Prediction: hard and soft

Jason Zweig writes about Philip Tetlock’s Good Judgment Project below. See also Expert Predictions, Perils of Prediction, and this podcast talk by Tetlock. A quick summary: good amateurs (i.e., smart people who think probabilistically and are well read) typically perform as well as or better than area experts (e.g., PhDs in Social Science, History, Government; …

Colleges ranked by Nobel, Fields, Turing and National Academies output

This Quartz article describes Jonathan Wai’s research on the rate at which different universities produce alumni who make great contributions to science, technology, medicine, and mathematics. I think the most striking result is the range of outcomes: the top school outperforms good state flagships (R1 …

More Shiny Hacks

In a previous entry, I posted code for a hack I came up with to add vertical scrolling to the sidebar of a web-based application I’m developing in Shiny (using shinydashboard). Since then, I’ve bumped into two more issues, leading to two more hacks that I’ll describe here. First, I should point out that I’m using …

One Hundred Years of Statistical Developments in Animal Breeding

This nice review gives a history of the last 100 years in statistical genetics as applied to animal breeding (via Andrew Gelman). One Hundred Years of Statistical Developments in Animal Breeding (Annu. Rev. Anim. Biosci. 2015. 3:19–56 DOI:10.1146/annurev-animal-022114-110733) Statistical methodology has played a key role in scientific animal breeding. Approximately one hundred years of statistical …

Sparsity estimates for complex traits

Note the estimate of a few thousand to ten thousand causal SNP variants, consistent with my estimates for height and cognitive ability. Sparsity (number of causal variants), along with heritability, determines the amount of data necessary to “solve” a specific trait. See Genetic architecture and predictive modeling of quantitative traits. T1D looks like it could be cracked …

Decision Analytics and Teacher Qualifications

Disclaimers: This is a post about statistics versus decision analytics, not a prescription for improving the educational system in the United States (or anywhere else, for that matter). tl;dr. The genesis of today’s post is a blog entry I read on Spartan Ideas titled “Is Michigan Turning Away Good Teachers?” (Spartan Ideas is a “metablog”, curated …

IQ prediction from structural MRI

These authors use machine learning techniques to build sparse predictors based on grey/white matter volumes of specific regions. Correlations obtained are ~ 0.7 (see figure). I predict that genomic estimators of this kind will be available once ~ 1 million genomes and cognitive scores are available for analysis. See also Myths, Sisyphus and g. MRI-Based …

The Monty Hall Evolver

The Monty Hall problem is very famous (Wikipedia, NYT). It is so famous because it so easily fools almost everyone the first time they hear about it, including people with doctorate degrees in various STEM fields. There are three doors. Behind one is a big prize, a car, and behind the two others are goats. …
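
The excerpt cuts off before the rules are fully stated, so the following is only a minimal simulation sketch under the standard assumptions: the host knows where the car is, always opens a goat door other than the contestant's pick, and then offers a switch. The function names `play` and `win_rate` are illustrative and are not taken from the post.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumed rule: the host opens a goat door that is not the pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the probability of winning by repeated simulation."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Under these assumptions the estimates should land near 1/3 for staying and 2/3 for switching, which is the counterintuitive result that trips people up.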

Income, wealth, and IQ

I’m occasionally asked about financial returns to cognitive ability. As a rough rule of thumb, judging from the graphs below (obtained here), I would say: On average, an increase of IQ by one SD corresponds to ~ $30k per annum of additional income. (Somewhat less than 1 SD in income; the distribution is far from …

Rigorous inequalities

The Effects of an Anti-grade-Inflation Policy at Wellesley College. Journal of Economic Perspectives, 28(3): 189-204 (2014). DOI: 10.1257/jep.28.3.189. Average grades in colleges and universities have risen markedly since the 1960s. Critics express concern that grade inflation erodes incentives for students to learn; gives students, employers, and graduate schools poor information on absolute and relative …

Top 25 richest living comedians

It’s fairly common knowledge that comedy isn’t a terribly lucrative career. Not only do most comedians spend decades doing small-time standup hoping to be discovered, but most of those comedians never end up being discovered either. But what about the comedians that did hit it big? To provide some insight into what it takes to …

CBO Against Piketty?

This report using CBO (Congressional Budget Office) data claims that income inequality did not widen during the Great Recession (table above compares 2007 to 2011). After government transfer payments (taxes, entitlements, etc.) are taken into account, one finds that low income groups were cushioned, while high earners saw significant declines in income. … The CBO on …

Python usage survey 2014

Remember that Python usage survey that went around the interwebs late last year? Well, the results are finally out and I’ve visualized them below for your perusal. This survey has been running for two years now (2013-2014), so where we have data for both years, I’ve charted the results so we can see the changes …

Venture capital in the 1980s

Via Dominic Cummings (@odysseanproject), this long discussion of the history of venture capital, which emphasizes the now largely forgotten 1980s. VC in most parts of the developed world, even large parts of the US, resembles the distant past of the above chart. There is a big gap between Silicon Valley and the rest. Heat Death: …

Gender trouble in the valley

This NYTimes article looks at the gender disparity in technology career success within the Stanford class of 1994. NYTimes: In the history of American higher education, it is hard to top the luck and timing of the Stanford class of 1994, whose members arrived on campus barely aware of what an email was, and yet …

The most-viewed YouTube videos

Earlier this week, Google announced that Psy’s insanely viral YouTube music video, Gangnam Style, officially broke the 2,147,483,647 view barrier. What’s significant about that number, you ask? It’s the largest value that a signed 32-bit integer can hold. The folks at Google never suspected a video would exceed 2 billion views until Gangnam …
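
To make the arithmetic concrete, here is a small sketch showing where 2,147,483,647 comes from and what happens one step past it. It assumes the counter was a signed two's-complement 32-bit integer; the helper names `fits_int32` and `wrap_int32` are invented for illustration and say nothing about how YouTube actually stores view counts.

```python
import struct

# One bit holds the sign, leaving 31 bits for the magnitude.
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def fits_int32(n):
    """Return True if n can be packed into a signed 32-bit integer."""
    try:
        struct.pack("<i", n)
        return True
    except struct.error:
        return False


def wrap_int32(n):
    """Simulate two's-complement wrap-around of n into 32 signed bits."""
    return (n + 2**31) % 2**32 - 2**31


if __name__ == "__main__":
    print(INT32_MAX)                  # 2147483647
    print(fits_int32(INT32_MAX))      # True
    print(fits_int32(INT32_MAX + 1))  # False
    print(wrap_int32(INT32_MAX + 1))  # -2147483648
```

An unsigned 32-bit counter would top out at 4,294,967,295 instead, and a 64-bit counter pushes the ceiling far beyond any plausible view count.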

A Clever Control Supporting Claims that Exercise Improves Cognition

Many studies over the years have found that aerobic exercise improves cognitive performance. The following are just a sample of those published in the last 12 months: “Mind racing: The influence of exercise on long-term memory consolidation”, published in Memory, found that aerobic exercise improved the learning of procedures or written text whether the subjects …

20 years @15 percent: does Harvard discriminate against Asian-Americans?

This is the brief filed yesterday against Harvard, alleging discrimination against Asian-American applicants. A related suit was filed against UNC, with perhaps another to come against Wisconsin. Re: the graph above, note that Caltech has race-blind admissions. … Harvard is engaging in racial balancing. Over an extended period, Harvard’s admission and enrollment figures for …
