R makes my blood boil and it’s Stack Exchange’s fault

Written by: Anna Groves

Primary Source: Plant//People

[Anna writes…] I spend a lot of time on Stack Exchange. It’s an online forum where people ask questions about how to do statistics in the program R. Like Yahoo Answers for nerds.

A visit to Stack Exchange is just about the only guaranteed way to ruin my day.

When I have an R question and am perusing Stack Exchange, I am always left appalled at the rudeness and lack of empathy of the posters, further confused by their suggestions, and almost always still wanting the answer to my original question. Yes, I understand that often the question-ers haven’t given enough information, and that this must be terribly frustrating to the regular answer-ers, but come on. I usually find my exact question already posted, but with comments that are mostly “omg I can’t believe you didn’t give this information I can’t possibly help you even a little bit” and never “here’s a hint but I can help more if you tell me more.”

R people will be thinking right about now, “but every situation is unique, so we need 100% of the information before we can give perfect advice so that the data analysis is perfect.”

I contend that these people need to be more flexible and willing to answer the damn question even if the person asking it might be making mistakes otherwise. Is that too much to ask?

It’s okay to still correct the mistakes! But I get the feeling no one has considered that maybe the other people coming to the page actually want the question answered.

For instance, if someone posted a question like: “what’s the R code for a ANOVA?”

For which an appropriate answer might be: “aov(y ~ x)”

I bet you a million dollars that there would be 10 comments like:

I can’t possibly be expected to give you statistical advice when not provided with the details of your experimental design, hypotheses, and life story. You have written “a anova” which is grammatically incorrect. When you use an indefinite article before a word starting with a vowel, you absolutely must use ‘an’ rather than ‘a.’ Please use proper grammar when posting on this forum. The analysis of variance is only appropriate under certain circumstances. What are your degrees of freedom? Please fax me your research proposal. You’re clearly an idiot.

Too extreme? Then why does my blood boil whenever I spend more than about 5 minutes reading these threads?

For instance, I recently had some trouble with a mixed effect model. I’ll skip the details, but a few colleagues suggested that I try a non-parametric model.

I couldn’t remember what a non-parametric framework even looked like in R, but assumed it would be easily Google-able. I searched for something like “mixed effect non-parametric model R” and found a Stack Exchange page that seemed perfect.

Posted 2 years ago: “Is there any model that includes random effects with non-parametric data distribution?”

Bingo. This is literally a yes or no question that I wanted to know the answer to.

I went to the page, and the question was very well set up (to me). It said:

I have a non-parametric (by which I mean non-normal) data distribution. I tried several transformations, but none were helpful. Now, I want to find a model where I can include random effects with the non-normally distributed data. I know the Kruskal-Wallis test, but I couldn’t find any hints if I can include random effects. Does anyone know of an appropriate model?

I’m thinking, jackpot.

The poster goes on to describe their dataset and even includes some histograms to show their data distribution and some code.

Bring on the answers to my question.

Comment 1. If you give more information about your data (and research question or hypotheses), someone might point out parametric alternatives (such as a GLMM). That you have non-normal residuals (the distribution of data is not relevant here) doesn’t mean that you must use non-parametric methods.

Um, okay. It’s good advice… unfortunately the question I wanted answered was whether there is or is not a non-parametric option for a mixed effect model.

Comment 2. Please say more about your data. I do not believe that “non-parametric data distribution” actually means anything in statistics. Also note that the data do not have to be normally distributed for a standard model, only the residuals should be, but even those can be somewhat non-normal if you have enough data. Please clarify your situation, your data & your goals more fully.

Let’s remember that the original post said “I have a non-parametric (by which I mean non-normal) data distribution.” Although it is true that saying “non-parametric data” is incorrect, I think it is very clear that this person actually meant “non-normal.” Because they said that. It’s one thing to answer the question and gently correct the error. It’s another to point out the error and then disregard the question because of it. But okay… let’s keep going. Comments have been noted. Still no word about whether there’s a non-parametric option for a mixed effect model.

Next the original poster responded with more description of their data, as requested.

Now for comment 3. Data is data. The adjectives ‘parametric’ and ‘nonparametric’ apply to models or methodologies, not to data. Do you simply mean ‘not normal’ when you say ‘nonparametric’, or do you intend to imply something more than that? Can you describe your data in more detail? Is it Likert scale, for example? How are you assessing normality? With images, upload them somewhere (say, imgur.com, which stackexchange uses), and give us a link in your post, and someone will fix it for you so we can see your image(s) in your post.

It’s around this point when my blood begins to boil with a silent rage. The poster has already been chastised for saying “non-parametric data” and it’s already been clarified what they meant. Thank you for correcting them, again. And not answering the question, again. And what the fuck is a Likert scale? You really think that someone who said “non-parametric data” knows if their data is “Likert scale”?

The user then responded to all the requests from commenter 3 (including uploading stuff to imgur, etc… which was probably a lot of work.)

There are no further comments.

This page has been viewed 1477 times and not one of us found out if you can do a non-parametric mixed model in R. Nor does the original poster know how to analyze their data (which, in defense of the commenters, probably didn’t need a non-parametric model anyway BUT STILL.)

Please, people responding to questions on Stack Exchange, would it kill you to answer a question even if the person is wrong to be asking it?

And could you be less of a dick?

Anna Groves
Anna is a Ph.D. candidate in Lars Brudvig's lab in MSU's Department of Plant Biology and Ecology, Evolutionary Biology, and Behavior program. She studies prairie restoration ecology while aspiring to build a career sharing science with others.