Programming Language Breakdown for the HealthCare.gov Website

Written by: Randy Olson

Primary Source: Randal S. Olson

Late last year, The New York Times published an article quoting a specialist working on the HealthCare.gov website:

According to one specialist, the Web site contains about 500 million lines of software code. By comparison, a large bank’s computer system is typically about one-fifth that size.

This astronomically large number drew intense criticism over the following months, especially in the wake of HealthCare.gov’s failed initial launch. In particular, a number of software engineering experts questioned whether any software engineering team could realistically produce a code base that large. Despite this, the 500 million lines of code statistic has been uncritically cited worldwide.

Just today, a data visualization poking fun at this statistic made it to the front page of the subreddit /r/dataisbeautiful. Apparently fed up with seeing this horrendously false statistic, one programmer on the HealthCare.gov software development team decided to put it to rest. This programmer performed an automated code count of the HealthCare.gov code base and estimated that the primary code base has only about 3.7 million lines of code. Below is the breakdown of programming languages for those 3.7 million lines of code.

[Figure: HealthCare.gov code count, broken down by programming language]
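For readers curious what an automated code count involves, here is a minimal sketch in Python. It is not the tool the HealthCare.gov programmer used, which the source does not name. The extension-to-language map and the function name `count_lines_by_language` are hypothetical, and counting non-blank lines is just one of several possible conventions.

```python
import os
from collections import Counter

# Hypothetical mapping from file extension to language name.
# A real counting tool would recognize many more file types.
EXTENSION_TO_LANGUAGE = {
    ".py": "Python",
    ".js": "JavaScript",
    ".java": "Java",
    ".html": "HTML",
    ".css": "CSS",
    ".xml": "XML",
}


def count_lines_by_language(root):
    """Return a Counter mapping language name to non-blank line count under root."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            language = EXTENSION_TO_LANGUAGE.get(extension)
            if language is None:
                continue  # skip files whose type we do not recognize
            path = os.path.join(dirpath, name)
            # Ignore undecodable bytes so one odd file does not abort the count.
            with open(path, encoding="utf-8", errors="ignore") as handle:
                counts[language] += sum(1 for line in handle if line.strip())
    return counts
```

Calling `count_lines_by_language("path/to/checkout")` on a source tree and then iterating over `.most_common()` on the result gives a per-language breakdown, largest first. This sketch counts comment lines as code, so its totals would differ from those of a tool that separates comments out.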

The programmer clarified:

this doesn’t include parts of the system used for administrative tasks.

and

the total number of lines of code controlling the entire system could be anywhere from 5 – 15 million lines of code.

So there you go — as many of us guessed all along, the 500 million lines of code statistic was utterly bogus. Let’s share this information and put that bad statistic to rest.

Randy Olson is a Computer Science graduate research assistant at Michigan State University in Dr. Chris Adami’s lab specializing in artificial intelligence, artificial life, and evolutionary computation. He runs a research blog where he writes about Python, scientific computing, evolution, and AI. Randy is an ardent advocate of open science and regularly travels the U.S. to teach researchers scientific computing skills at Software Carpentry workshops.