# Stepwise Regression Code Revisited

I’ve added a few more tweaks to the stepwise regression code I published back in 2011. (If you wish, see here for the original post and here for a subsequent update.) The code does stepwise regression using F tests (or, equivalently, p-values of coefficients), which is a bit old-fashioned but apparently how it is …
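
As a rough illustration of the idea (not the code from the post), a single forward-selection step driven by F tests can be sketched with base R's `add1()`. The built-in `mtcars` data, the candidate predictors, and the 0.05 entry threshold are all assumptions made for this example.

```r
# One forward-selection step using F tests, on the built-in mtcars data.
# The candidate set and the 0.05 entry threshold are illustrative assumptions.
base_model <- lm(mpg ~ 1, data = mtcars)
candidates <- ~ wt + hp + qsec + drat

# add1() fits each one-term extension of the model and reports its F test.
step_table <- add1(base_model, scope = candidates, test = "F")
print(step_table)

# Find the candidate with the smallest p-value (row 1 is "<none>").
p_values <- step_table[["Pr(>F)"]][-1]
best <- rownames(step_table)[-1][which.min(p_values)]

# Add that term only if it clears the entry threshold.
if (min(p_values) < 0.05) {
  base_model <- update(base_model, as.formula(paste(". ~ . +", best)))
}
print(formula(base_model))
```

A full stepwise procedure would repeat this step, and would also call `drop1()` to check whether any term already in the model should be removed.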

# Learning R (for data analysis and data science): Where to start

A friend of a friend (also in educational research) posted that he was interested in learning R. I had a couple of ideas but knew that others might have better ideas. So, I posted (on Twitter) looking for recommendations and received some excellent talks, links, and other resources. Here they are, in Tweet form (you may have to …

# An R package for sensitivity analysis (konfound)

With Ran Xu and Ken Frank, I have worked on a Shiny interactive web application for sensitivity analysis as well as an R package for carrying out sensitivity analysis using R. That R package is now available on CRAN! A link to the CRAN page for it is here and the website for the …

# What “R” qualitative research methods?

I recently stumbled upon a post on R-bloggers entitled “Qualitative Research in R.” This title got me pretty excited, since I’m generally excited about most things R and since I recently helped teach a qualitative methods course, which has had me thinking about adding more ethnographic and other qualitative elements to my work. I’d also heard of …

# Using MPlus from R with MPlusAutomation

According to the MPlus website, the R package MPlusAutomation serves three purposes: creating related groups of models; running batches; and extracting and tabulating model parameters and test statistics. Because modeling involves comparing related models, (partially) automating these tasks is compelling. It can make it easier to use model results in subsequent analyses and can cut down on copying and pasting …

# A first pass at Latent Profile Analysis using MCLUST (in R)

Along with starting to use MPlus, I’ve become (more) interested in trying to find out how to carry out Latent Profile Analysis (LPA) in R, focused on two options: OpenMx and MCLUST. The two are very different: OpenMx is an option for general latent variable modeling (i.e., it can be used to specify a wide range of latent …

# Updated Stepwise Regression Function

Back in 2011, when I was still teaching, I cobbled together some R code to demonstrate stepwise regression using F-tests for variable significance. It was a bit unrefined, not intended for production work, and a few recent comments on that post raised some issues with it. So I’ve worked up a new and (slightly) improved …

# In what months are educational psychology jobs posted?

Division 15 of the American Psychological Association sponsors the Ed Psych Jobs website, which is an excellent resource for Ed Psych job seekers. I thought it would possibly be helpful to see when jobs were posted in the past in order to have a better idea about when jobs may be posted this year. Ed Psych Jobs, Robots …

# Comparing estimates and their standard errors from mixed effects and linear models

Some background One reason to use mixed effects models is that they help to account for data with a complex structure, such as multiple responses (to questions, for example) from the same people, students grouped into classes, and measures collected over time. Often, the way they account for these complex structures is in terms of …

# Using characteristics of rail-trails to predict how they are rated

Catching up I wrote a blog post (one that, to be honest, I liked a lot) on what the best rail-trails are in Michigan (here). A friend and colleague at MSU, Andy, noticed that paved trails seemed to be rated higher, and this as well as my friend and colleague Kristy’s comment about how we …

# Helpful resources for principal components analysis in R

I’m currently working on my dissertation proposal, which has meant exploring principal components analysis. I’ve worked with PCA before, but it’s been a couple of years, so I’m trying to refresh my memory, improve my understanding, and get this proposal moving! Along the way, I’ve found (and been recommended) some helpful resources that I thought …

# prcr update

The R package for person-oriented analysis (prcr) is updated (it’s now version 0.1.4). In particular, it was not clear how to use the profile assignments (i.e., what cluster each response is in) in subsequent analyses. So, the update now returns two different representations of the profile assignments, or which profile is associated with each observation: …

# Presentation on an Introduction to R for Data Analysis

I had an opportunity to present on an Introduction to R for Data Analysis to the School of Criminal Justice (at MSU). The presentation is organized into five sections: Background; Wrangling, Plotting, and Modeling; Essential Functionality; Advanced Functionality; and Additional Resources. A link to the presentation is here.

# Common Core and NGSS are not on the news

How often are curricular standards mentioned on TV news? With my friend Patrick, I was curious about using the newsflash package for something education-related. We came up with the idea of looking at mentions of the Common Core State Standards (for Mathematics and English Language Arts / Literacy) and the Next Generation Science Standards (for …

# The Internet Archive’s Television News Archive and Newsflash

The Internet Archive’s Television News Archive is a cool way to search closed captions from TV shows. Here’s a bit more information on it: The Internet Archive’s Television News Archive, GDELT’s Television Explorer allows you to keyword search the closed captioning streams of the Archive’s 6 years of American television news and explore macro-level trends …

# prcr: An R Package for Person-Centered Analysis

I’m excited to share that prcr (0.1.0), an R package for person-centered analysis, is now available on CRAN via install.packages("prcr"). Person-centered analyses focus on clusters, or profiles, of observations, and their change over time or differences across factors. The package is designed to be “low threshold but high ceiling”, in that you can do all …

# R makes my blood boil and it’s Stack Exchange’s fault

[Anna writes…] I spend a lot of time on Stack Exchange. It’s an online forum where people ask questions about how to do statistics in the program R. Like Yahoo Answers for nerds. A visit to Stack Exchange is just about the only guaranteed way to ruin my day. When I have an R question …

# How much do we spend weekly on Groceries? Figuring out using R and Mint (Updated)

We started using Mint to keep track of our spending. One of the best features of Mint is the ability to see past patterns of spending (and to use that information to not spend quite as much on, well, coffee, …

# Announcing clustRcompaR v.0.1.0

Alex Lishinski and I worked on an R package over the last year or so. We are excited that it’s now available on CRAN. You can install the package using install.packages('clustRcompaR') (only needed the first time) and load it (more on its two functions below) using library(clustRcompaR). Here’s a description: Provides an interface …

# MIP Models in R with OMPR

A number of R libraries now exist for formulating and solving various types of mathematical programs in R (or formulating them in R and solving them with external solvers). For a comprehensive listing, see the Optimization and Mathematical Programming task view on CRAN. I decided to experiment with Dirk Schumacher’s OMPR package for R. OMPR …

# Interactive R Plots with GGPlot2 and Plotly

I refactored a recent Shiny project, using Hadley Wickham’s ggplot2 library to produce high quality plots. One particular feature the project requires is the ability to hover over a plot and get information about the nearest point (generally referred to as “hover text” or a “tool tip”). There are multiple ways to turn static ggplots …

# Formatting in a Shiny App

I’ve been updating a Shiny (web-based interactive R) application, during the course of which I needed to make a couple of cosmetic fixes. Both proved to be oddly difficult. Extensive use of Google (I think I melted one of their cloud servers) eventually turned up enough clues to get both done. I’m going to record …

# Some R Resources

(Should I have spelled the last word in the title “ResouRces” or “resouRces”? The R community has a bit of a fascination about capitalizing the letter “r” as often as possible.) Anyway, getting down to business, I thought I’d post links to a few resources related to the R statistical language/system/ecology that I think may …

# prcr: R package for person-centered analysis

I have been working on an R package for person-centered analysis. Here is a bit of background and information on how to install it (since it is very much in “beta”, it is only available via the method described below, and not through the typical install.packages() approach). Background prcr is an R package for person-centered …

# Accessing R Objects By Name

At a recent R user group meeting, the discussion at one point focused on two of the possibly lesser known (or lesser appreciated?) functions in the base package: get and assign. The former takes a string argument and fetches the object whose name is contained in the string. The latter does the opposite, assigning an …
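
A minimal sketch of the two functions in action follows. The variable name `total_sales` and its values are made up for illustration.

```r
# assign() binds a value to a name supplied as a string;
# get() fetches the object whose name is held in a string.
var_name <- "total_sales"            # hypothetical variable name
assign(var_name, c(10, 20, 30))      # same effect as: total_sales <- c(10, 20, 30)

retrieved <- get(var_name)           # same effect as: retrieved <- total_sales
print(sum(retrieved))

# get() works for functions as well as data objects.
f <- get("mean")
print(f(retrieved))
```

Both functions also accept an `envir` argument for working in an environment other than the current one, which matters when they are called inside functions.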

# Coin Flipping

I don’t recall the details, but in a group conversation recently someone brought up the fact that if you flip a fair coin repeatedly until you encounter a particular pattern, the expected number of tosses needed to get HH is greater than the expected number to get HT (H and T denoting head and tail …
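
The claim is easy to check by simulation. The sketch below is one way to do it; the helper name `tosses_until`, the seed, and the number of replications are arbitrary choices for this example. In theory the expected number of tosses is 6 for HH and 4 for HT, so the simulated means should land close to those values.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Flip a fair coin until `pattern` (e.g. "HH") appears; return the toss count.
tosses_until <- function(pattern) {
  target <- strsplit(pattern, "")[[1]]
  k <- length(target)
  recent <- character(0)   # the most recent tosses, at most k of them
  n <- 0
  repeat {
    n <- n + 1
    recent <- tail(c(recent, sample(c("H", "T"), 1)), k)
    # identical() also checks length, so short histories never match.
    if (identical(recent, target)) return(n)
  }
}

mean_hh <- mean(replicate(20000, tosses_until("HH")))
mean_ht <- mean(replicate(20000, tosses_until("HT")))
print(c(HH = mean_hh, HT = mean_ht))
```

The asymmetry comes from what happens after a miss: when chasing HT, a failed attempt after an H still leaves an H in hand, whereas when chasing HH a T wipes out all progress.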

# Lessons learned when Web scraping #GorafiESR tweets

I’ve posted in the past about Web scraping Twitter user profiles, but I took some time last week to tackle something else that I’ve been thinking about: Scraping the tweets themselves. Web scraping tweets is a nifty trick, but it doesn’t necessarily have an obvious application right off the bat. I wound up doing it …

A couple of weeks ago, I spent some time discussing how to use R and Web scraping to retrieve information on Twitter users’ locations, as stored in their profiles. I’ve since updated the code to scrape not only locations, but also names, descriptions, personal websites, join dates, number of tweets, number of users following, number …

# Plotting Twitter users’ locations in R (part 4: the results)

After a week of going through the ins and outs of my experience with the #educattentats hashtag, it’s time to see some results! Yesterday, I discussed how I used the ggmap package to turn the text string “locations” in users’ Twitter profiles into latitude and longitude coordinates. Once you have those, plotting them on a …

# Plotting Twitter users’ locations in R (part 3: turning text to coordinates)

In yesterday’s post, I described how I scraped Twitter users’ profiles to collect the locations (as text strings) that they list in those profiles. This was a fantastic leap forward for my eventual goal of indicating on a map the location of everyone …

# Plotting Twitter users’ locations in R (part 2: geotags vs. Web scraping vs. API)

Yesterday, I mentioned discovering the French hashtag #educattentats that was created in the wake of the 13 November terrorist attacks. As far as I can tell, I discovered the hashtag shortly after it was created, so it’s been interesting to see how use of the hashtag has grown in the hours, days, and weeks since. …

# Plotting Twitter users’ locations in R (part 1: teachers, Twitter, and terrorism)

Yesterday, I posted for the first time in French, which was kind of exciting for me but less exciting for those of you who don’t read French. The good news is that yesterday’s post was, in effect, a prelude to a series of posts I’ll be doing this week (in English) about a French Twitter …

# Introduction to R

Alex Lishinski and I put together a presentation on an “Introduction to R” for the Educational Psychology and Educational Technology Student Research Group. Here is a link to the slides.

# Producing Reproducible R Code

A tip in the Google+ Statistics and R community led me to the reprex package for R. Quoting the author (Professor Jennifer Bryan, University of British Columbia), the purpose of reprex is to [r]ender reproducible example code to Markdown suitable for use in code-oriented websites, such as StackOverflow.com or GitHub. Much has been written about …

# More Shiny Hacks

In a previous entry, I posted code for a hack I came up with to add vertical scrolling to the sidebar of a web-based application I’m developing in Shiny (using shinydashboard). Since then, I’ve bumped into two more issues, leading to two more hacks that I’ll describe here. First, I should point out that I’m using …

# Autocorrupt in R

You know that “autocomplete” feature on your smart phone or tablet that occasionally (or, in my case, frequently) turns into an “autocorrupt” feature? I just ran into it in an R script. I wrote a web-based application for a colleague that lets students upload data, run a regression, ponder various outputs and, if they wish, …

# Shiny Hack: Vertical Scrollbar

I bumped into a scrolling issue while writing a web-based application in Shiny, using the shinydashboard package. Actually, there were two separate problems. The browser apparently cannot discern page height. In Firefox and Chrome, this resulted in vertical scrollbars that could scroll well beyond the bottom of a page. That’s mildly odd, but not a …

# Tabulating Prediction Intervals in R

I just wrapped up (knock on wood!) a coding project using R and Shiny. (Shiny, while way cool, is incidental to this post.) It was a favor for a friend, something she intends to use teaching an online course. Two of the tasks, while fairly mundane, generated code that was just barely obscure enough to …

# Alternative Versions of R

Fair warning: most of this post is specific to Linux users, and in fact to users of Debian-based distributions (e.g., Debian, Ubuntu or Mint). The first section, however, may be of interest to R users on any platform. An alternative to “official” R By “official” R, I mean the version of R issued by the …

# Hell Hath No Fury Like a Dependency Scorned

There is a recent package for the R statistics system named “Radiant”, available either through the CRAN repository or from the author’s GitHub site. It runs R in the background and lets you poke at data in a browser interface. If you are looking for a way to present data to end-users and let them …

# Parsing Months in R

As part of a recent analytics project, I needed to convert strings containing (English) names of months to the corresponding cardinal values (1 for January, …, 12 for December). The strings came from a CSV file, and were translated by R to a factor when the file was read. The factor had more than 12 …
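
One common way to do this kind of conversion in base R (not necessarily the approach the post ends up with) is to match the strings against the built-in `month.name` vector. The sample factor below is invented for illustration.

```r
# A factor of month names, as R might produce when reading a CSV file.
# The values here are made up for this example.
months_raw <- factor(c("March", "January", "December", "March"))

# Pitfall: as.integer() on a factor returns the internal level codes,
# which follow alphabetical order of the levels, not calendar order.
level_codes <- as.integer(months_raw)

# Convert to character first, then look each name up in month.name.
month_num <- match(as.character(months_raw), month.name)

print(level_codes)  # 3 2 1 3
print(month_num)    # 3 1 12 3
```

Any string that is not an exact English month name (stray whitespace, different capitalization, abbreviations) comes back as `NA`, which makes bad input easy to spot.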

# Tips and Tricks for Getting Started with R

All statistical software has a learning curve, and compared to SPSS, R has taken me more time to learn the basics. However, since learning the basics, R seems easier to use than SPSS. Here are 10 tips and tricks (and some resources) I found helpful for getting started with R: Use RStudio, a separate interface that is installed …

# Learning R

I have recently dedicated myself to learning R, a programming language and environment focused largely on statistical analysis and computing. The benefit of using R over other statistical computing packages is that it is free, open-source, and has a hugely active community around its use. R can be used cross-platform (PCs, Macs, and Linux) …

# Histogram Abuse

Consider the following Trellis display of histograms of a response variable (Z), conditioned on two factors (X,Y) that can each be low, medium or high: The combined sample size for the nine plots is 10,000 observations. Would you agree with the following assessments? Z seems to be normally distributed for medium values of either X …

# Testing Regression Significance in R

I’ve come to like R quite a bit for statistical computing, even though as a language it can be rather quirky. (Case in point: The anova() function compares two or more models using analysis of variance; if you want to fit an ANOVA model, you need to use the aov() function.) I don’t use it …
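
The quirk mentioned in the parenthetical can be seen side by side in a short sketch. The built-in `mtcars` data and the particular model formulas are assumptions chosen for illustration.

```r
# anova() compares nested fitted models with an F test.
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_large <- lm(mpg ~ wt + hp, data = mtcars)
comparison <- anova(fit_small, fit_large)
print(comparison)

# aov() fits an ANOVA model: here, a one-way ANOVA of mpg across
# cylinder counts, with cyl treated as a categorical factor.
anova_fit <- aov(mpg ~ factor(cyl), data = mtcars)
print(summary(anova_fit))
```

In the first call the F test asks whether adding `hp` improves on the model with `wt` alone; in the second, `summary()` produces the familiar ANOVA table for the group means.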
