Scientists of Stature

The link below is to the published version of the paper we posted on bioRxiv in late 2017 (see blog discussion). Our results have since been replicated by several groups in academia and in Silicon Valley. bioRxiv article metrics: abstract views 31k, paper downloads 6k. Not bad! Perhaps that means the community understands now that genomic …

Risk, Uncertainty, and Heuristics

Risk = space of outcomes and probabilities are known. Uncertainty = probabilities not known, and even space of possibilities may not be known. Heuristic rules are contrasted with algorithms like maximization of expected utility. See also Bounded Cognition and Risk, Ambiguity, and Decision (Ellsberg). Here’s a well-known 2007 paper by Gigerenzer et al.: Helping Doctors and …

A Shiny interactive web application to quantify how robust inferences are to potential sources of bias (sensitivity analysis)

We are happy to announce the release of an interactive web application, Konfound-It, to make it easy to quantify the conditions necessary to change an inference. For example, Konfound-It generates statements such as “XX% of the estimate would have to be due to bias to invalidate the inference” or “an omitted variable would have to …

A person-in-context approach to student engagement in science (article in JRST)

Over the past few years, I have worked with Jennifer Schmidt and Patrick Beymer to explore student engagement in science using the Experience Sampling Method (ESM). Most recently, we used what scholars have referred to as a “person-in-context” approach, using both ESM and a person-oriented approach. A figure is helpful for conveying how the person-oriented approach can be used to …

Using Mplus from R with MplusAutomation

According to the Mplus website, the R package MplusAutomation serves three purposes: creating related groups of models, running batches, and extracting and tabulating model parameters and test statistics. Because modeling involves comparing related models, (partially) automating these steps is compelling. It can make it easier to use model results in subsequent analyses and can cut down on copying and pasting …

Epistemic Caution and Climate Change

I have not, until recently, invested significant time in trying to understand climate modeling. These notes are primarily for my own use; however, I welcome comments from readers who have studied this issue in more depth. I take a dim view of people who express strong opinions about complex phenomena without having understood the underlying …

prcr update

The R package for person-oriented analysis (prcr) is updated (it’s now version 0.1.4). In particular, it was not clear how to use the profile assignments (i.e., what cluster each response is in) in subsequent analyses. So, the update now returns two different representations of the profile assignments, or which profile is associated with each observation: …

History of Bayesian Neural Networks

This talk gives the history of neural networks in the framework of Bayesian inference. Deep learning is (so far) quite empirical in nature: things work, but we lack a good theoretical framework for understanding why or even how. The Bayesian approach offers some progress in these directions, and also toward quantifying prediction uncertainty. I was …

Announcing clustRcompaR v.0.1.0

Alex Lishinski and I worked on an R package over the last year or so. We are excited that it’s now available on CRAN. You can install the package using install.packages("clustRcompaR") (only needed the first time) and load it (more on its two functions below) using library(clustRcompaR). Here’s a description: Provides an interface …

Can Life emerge spontaneously?

It would be nice if we knew where we came from. Sure, Darwin’s insight that we are the product of an ongoing process that creates new and meaningful solutions to surviving in complex and unpredictable environments is great and all. But it requires three sine qua non ingredients: inheritance, variation, and differential selection. Three does …

Speed, Balding, et al.: “for a wide range of traits, common SNPs tag a greater fraction of causal variation than is currently appreciated”

I recently blogged about a nice lecture by David Balding at the 2015 MLPM (Machine Learning for Personalized Medicine) Summer School: Machine Learning for Personalized Medicine: Heritability-based models for prediction of complex traits. In that talk he discussed some results concerning heritability estimation and potential improvements over GCTA. A new preprint on bioRxiv has the …

Some R Resources

(Should I have spelled the last word in the title “ResouRces” or “resouRces”? The R community has a bit of a fascination with capitalizing the letter “r” as often as possible.) Anyway, getting down to business, I thought I’d post links to a few resources related to the R statistical language/system/ecology that I think may …

Machine Learning for Personalized Medicine: Heritability-based models for prediction of complex traits (David Balding)

Highly recommended talk by David Balding on modern approaches to heritability, relatedness, etc. in statistical genetics. (I listened at 1.5x normal speed, which worked for me.) MLPM (Machine Learning for Personalized Medicine) Summer School 2015, Monday 21st of September: “Heritability-based models for prediction of complex traits” by David Balding. Complex trait genetics has been revolutionised …

Over- and Underfitting

I just read a nice post by Jean-François Puget, suitable for readers not terribly familiar with the subject, on overfitting in machine learning. I was going to leave a comment mentioning a couple of things, and then decided that with minimal padding I could make it long enough to be a blog post. I agree …
