Why R is better than Excel for teaching statistics

This was the topic of a recent conversation on the Australian and New Zealand R mailing list. Here is an edited list of some of the comments made.

  • R is free.
  • R is well-documented.
  • R runs (really well) on *nix as well as Windows and Mac OS.
  • R is open-source. Trust in the R software is evident from its support among distinguished statisticians. However, the R user need not rely on trust, as the source code for R is freely available for public scrutiny.
  • R has a much broader range of statistical packages for doing specialist work.
  • R has an enthusiastic user base who can offer helpful advice for free.
  • R creates far better graphics than Excel.
  • R has certain data structures, such as data frames, that can make analysis more straightforward than in Excel.
  • R is better for doing complex jobs.
  • R is a better educational tool as it uses standard statistical vocabulary rather than home-baked terminology.
  • R is easier to learn, use, and script than Excel.
  • R lets students work easily with scripts, which makes their work reproducible.
  • R is intended to lead students towards programming; Excel is designed to keep people away from programming and encourages them to rely on someone else doing their programming (and often their thinking) for them.
  • Excel is known to be inaccurate whereas R is thoroughly tested. For a critique of Excel, see McCullough & Heiser (2008).
  • The statistical package available in Excel is very limited in capability and should only be used by experienced applied statisticians who can work out when its output should be ignored.
  • While R takes a while to learn, it provides a broad range of possible analyses and does not constrain users to a very limited set of methods (as is the case for Excel).
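
Several of the points above (data frames, standard statistical vocabulary, scripted and reproducible work) can be illustrated with a short script. This is only a minimal sketch: the data set, the column names, and the choice of a simple linear regression are all made up for illustration and are not taken from the mailing list discussion.

```r
# Hypothetical example data: exam scores for two teaching groups.
scores <- data.frame(
  group = rep(c("A", "B"), each = 5),
  hours = c(2, 4, 6, 8, 10, 1, 3, 5, 7, 9),
  score = c(52, 58, 65, 71, 80, 48, 55, 60, 70, 74)
)

# Mean score per group, computed on named columns of the data frame.
group_means <- aggregate(score ~ group, data = scores, FUN = mean)

# Simple linear regression of score on hours studied.
fit <- lm(score ~ hours, data = scores)
slope <- unname(coef(fit)["hours"])

print(group_means)
print(summary(fit))
```

Because the whole analysis lives in a script, rerunning the file reproduces every number, which is harder to guarantee when the steps are spread across spreadsheet cells.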

  • Steve

    I think R is pretty difficult for people without programming experience. In fact, I still find it challenging in some ways, even though I’m a software developer and have used R for a few years. Regardless, it is my tool of choice for creating graphs and performing statistical analyses. I can’t imagine trying to do that in Excel. What about commercial stats programs like SAS? I’ve never used them but have wondered if they’re worth the investment.

  • Jason

    How about SAS’s accuracy? Is it trustworthy compared to R? Has anybody done the analysis?

  • Interesting overview.

    I somehow feel that if the other side (when/why is Excel better than R) were presented here, I would feel better about it, but a good read nonetheless.

    Cheers,
    Tal

  • John

    Excel is superior when your statistical analysis consists of simple addition, subtraction, division or multiplication and your data fits on one or two spreadsheets. (^__^)

  • – Excel does not need documentation
    – 500,000 observations will crash R; 1,000,000 will crash Excel
    – Installing R packages is complicated; Excel plugins (including the Excel R plugin or data analysis pack) are easy to install
    – Excel is available on all Windows computers
    – R requires you to learn a programming language
    – You can share interactive Excel spreadsheets with top management; you can only share static R charts with top management
    – You can implement sophisticated analyses in Excel, such as hidden decision trees or constrained logistic regression (without using Macros / Cubes / VBA / Pivot tables), in a way that is easier than R

    • – Excel DOES need documentation. For example, there is no information provided about how it computes sample quantiles, what algorithm it uses for ill-conditioned regressions, etc. And the results from Excel do not match those from other software, so some explanation is required.
      – Really? Try x <- matrix(rnorm(1e7),1e6,10). Works fine for me.
      – Complicated? In R you have to click the menu item "Packages/Install packages" and then click the package you want to install. Three mouse clicks isn't so hard, is it?
      – R is available on all Windows computers too, plus *nix and Macs.
      – To do anything serious or non-standard in Excel you need some VBA, also a programming language.
      – I don't know how to do hidden decision trees or constrained logistic regression in Excel, so I can't comment on how easy it is. But there are packages for both in R that are easy to install and use.

      I'm sure there are good reasons to use Excel for some tasks, but you haven't provided them here.
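
      As a rough sketch of the first two points in this reply: the matrix call is the one quoted above, while the small vector v and the variable names are made up for illustration. R's quantile() documents nine sample quantile definitions, selected with the type argument (type = 7 is the default), so the algorithm in use is always explicit.

      ```r
      # One million rows by ten columns of simulated standard normal data.
      x <- matrix(rnorm(1e7), nrow = 1e6, ncol = 10)
      print(dim(x))

      # Hypothetical small sample used to compare two quantile definitions.
      v <- c(1, 2, 3, 4, 10)

      # Lower quartile under the default definition (type = 7).
      q7 <- unname(quantile(v, probs = 0.25, type = 7))

      # Lower quartile under an alternative definition (type = 6).
      q6 <- unname(quantile(v, probs = 0.25, type = 6))

      print(c(type7 = q7, type6 = q6))
      ```

      The two definitions give different answers on the same data, which is why undocumented quantile algorithms make results hard to compare across software.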

    • Jason

      Excel does not need documentation? You gotta be kidding me.

    • 1/2th Asian-student

      Are you kidding? R is used for genetic sequencing. If 500,000 observations crash R, then you are using a pretty outdated computer without enough RAM. Of course, this is 6 years later, so the technology has improved. But we’re talking genomics, proteomics, and metabolomics, all of which rely on R. It’s the weapon of choice for a bioinformatician.

  • You can make MBA students feel they ought to learn the software if you use Excel to teach; with anything else they respond poorly to learning an “exotic” tool they will “never use again”.

    That said, when was the last time there was an upgrade of the statistics capabilities in Excel? It’s such a money maker for Microsoft they could easily afford to improve the clunky, error-prone interface. Do they care?

  • I think the research world needs to take another look at Excel, and especially Excel 2010. I often run into general criticisms of Excel that are no longer valid given the robust character of Excel 2010, especially when supported with industry-standard add-in tools such as StatTools, XLStat, StatistiXL, and many others. Moreover, virtually all of the leading “standalone” analytics platforms, such as MiniTab, SPSS, and JMP, interface with Excel. For business statistics, Excel is the platform of choice and I’ll stake my reputation on that claim (http://www.mckibbinusa.com). Thank you for the opportunity to comment…

  • Dason Kurkiewicz

    It might be the case that Excel 2010 is better (I wouldn’t know) but the Microsoft group has consistently shown that they don’t care about any sort of quality in their statistical offerings. The argument of “Look! It’s FINALLY at an ok level!” doesn’t hold with me because R has consistently been a good product. Plus it’s free and open source.

    The general population might feel more comfortable with Excel but that’s really just because it’s the only thing they’ve ever used. Change isn’t bad.
