A blog by Rob J Hyndman 


Why R is better than Excel for teaching statistics

Published on 5 October 2010

This was the topic of a recent conversation on the Australian and New Zealand R mailing list. Here is an edited list of some of the comments made.

  • R is free.
  • R is well-documented.
  • R runs (really well) on *nix as well as Windows and Mac OS.
  • R is open-source. Trust in the R software is evident from its support among distinguished statisticians. However, the R user need not rely on trust, as the source code for R is freely available for public scrutiny.
  • R has a much broader range of statistical packages for doing specialist work.
  • R has an enthusiastic user base who can offer helpful advice for free.
  • R creates far better graphics than Excel.
  • R has certain data structures, such as data frames, that can make analysis more straightforward than in Excel.
  • R is better for doing complex jobs.
  • R is a better educational tool as it uses standard statistical vocabulary rather than home-baked terminology.
  • R is easier to learn, use, and script than Excel.
  • R allows students to work easily with scripts, which makes their work reproducible.
  • R is intended to lead students towards programming; Excel is designed to keep people away from programming and encourages them to rely on someone else doing their programming (and often their thinking) for them.
  • Excel is known to be inaccurate whereas R is thoroughly tested. For a critique of Excel, see McCullough & Heiser (2008).
  • The statistical package available in Excel is very limited in capability and should only be used by experienced applied statisticians who can work out when its output should be ignored.
  • While R takes a while to learn, it provides a broad range of possible analyses and does not constrain users to a very limited set of methods (as is the case for Excel).
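
The points about data frames, scripts, and reproducibility can be illustrated with a short script. What follows is only a minimal sketch on simulated data: the names `scores`, `group`, and `score` are hypothetical and were not part of the mailing-list discussion. It uses only base R and the standard `stats` functions `aggregate()` and `t.test()`.

```r
# Hypothetical example: simulated test scores for two teaching groups.
# All names and numbers here are made up for illustration.
set.seed(1)  # fixing the seed makes the simulated data reproducible

scores <- data.frame(
  group = rep(c("A", "B"), each = 10),
  score = c(rnorm(10, mean = 70, sd = 5), rnorm(10, mean = 75, sd = 5))
)

# Group means, computed directly from the data frame.
group_means <- aggregate(score ~ group, data = scores, FUN = mean)
print(group_means)

# Two-sample t-test (Welch by default), specified with a model formula.
result <- t.test(score ~ group, data = scores)
print(result)
```

Because the whole analysis lives in one script, rerunning the file reproduces the same data, summary, and test, which is the sense of "reproducible" intended in the list above.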

Further comments on this theme are available at the following sites:


10 Comments
  • Steve

    I think R is pretty difficult for people without programming experience. In fact, I still find it challenging in some ways, even though I’m a software developer and have used R for a few years. Regardless, it is my tool of choice for creating graphs and performing statistical analyses. I can’t imagine trying to do that in Excel. What about commercial stats programs like SAS? I’ve never used them but have wondered if they’re worth the investment.

  • Jason

    How about SAS’s accuracy? Is it trustworthy compared to R? Has anybody done the analysis?

  • http://www.r-statistics.com Tal Galili

    Interesting overview.

    I somehow feel that if the other side (when/why is Excel better than R) were to be presented here, I would feel better about it, but a good read nonetheless.

    Cheers,
    Tal

  • John

    Excel is superior when your statistical analysis consists of simple addition, subtraction, division or multiplication and your data fits on one or two spreadsheets. (^__^)

  • http://www.analyticbridge.com Vincent Granville

    - Excel does not need documentation
    - 500,000 observations will crash R; 1,000,000 will crash Excel
    - Installing R packages is complicated; Excel plugins (including the Excel R plugin or data analysis pack) are easy to install
    - Excel is available on all Windows computers
    - R requires you to learn a programming language
    - You can share interactive Excel spreadsheets with top management; you can only share static R charts with top management
    - You can implement sophisticated analyses in Excel, such as hidden decision trees or constrained logistic regression (without using Macros / Cubes / VBA / Pivot tables), in a way that is easier than R

    • http://robjhyndman.com Rob J Hyndman

      - Excel DOES need documentation. For example, there is no information provided about how it computes sample quantiles, what algorithm it uses for ill-conditioned regressions, etc. And the results from Excel do not match those from other software, so some explanation is required.
      - Really? Try x <- matrix(rnorm(1e7),1e6,10). Works fine for me.
      - Complicated? In R you have to click the menu item “Packages/Install packages” and then click the package you want to install. Three mouse clicks isn’t so hard, is it?
      - R is available on all Windows computers too, plus *nix and Macs.
      - To do anything serious or non-standard in Excel you need some VBA, also a programming language.
      - I don’t know how to do hidden decision trees or constrained logistic regression in Excel, so I can’t comment on how easy it is. But there are packages for both in R that are easy to install and use.

      I’m sure there are good reasons to use Excel for some tasks, but you haven’t provided them here.
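
Two of the points in this reply can be checked directly at the R prompt. The sketch below is illustrative only: the vector `y` is made-up data, and the memory report depends on the machine. It builds the 1,000,000 x 10 matrix mentioned above, then shows that `quantile()` exposes its documented sample-quantile definitions through the `type` argument (values 1 to 9), so the method in use is never hidden.

```r
# Allocate a 1,000,000 x 10 matrix of standard normal draws (10 million values).
x <- matrix(rnorm(1e7), nrow = 1e6, ncol = 10)
print(dim(x))

# Approximate memory used by the matrix.
print(object.size(x), units = "Mb")

# Made-up data to compare quantile definitions.
y <- c(1, 2, 3, 4, 10)

# First quartile under each of the nine documented definitions.
q25 <- sapply(1:9, function(t) quantile(y, probs = 0.25, type = t, names = FALSE))
print(q25)
```

The nine values in `q25` need not agree with one another, which is exactly why documentation of the chosen definition matters when comparing results across software.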

  • http://mikekr.blogspot.com zbicyclist

    You can make MBA students feel they ought to learn the software if you use Excel to teach; with anything else they respond poorly to learning an “exotic” tool they will “never use again”.

    That said, when was the last time there was an upgrade of the statistics capabilities in Excel? It’s such a money maker for Microsoft they could easily afford to improve the clunky, error-prone interface. Do they care?

  • http://wjmc.blogspot.com William J McKibbin

    I think the research world needs to take another look at Excel, and especially Excel 2010. I often run into general criticisms of Excel that are no longer valid given the robust character of Excel 2010, especially when supported with industry-standard add-in tools such as StatTools, XLStat, StatistiXL, and many others. Moreover, virtually all of the leading “standalone” analytics platforms, such as MiniTab, SPSS, and JMP, interface with Excel. For business statistics, Excel is the platform of choice and I’ll stake my reputation on that claim (http://www.mckibbinusa.com). Thank you for the opportunity to comment…

  • http://iguessihadtoputsomething.html Dason Kurkiewicz

    It might be the case that Excel 2010 is better (I wouldn’t know) but the Microsoft group has consistently shown that they don’t care about any sort of quality in their statistical offerings. The argument of “Look! It’s FINALLY at an ok level!” doesn’t hold with me because R has consistently been a good product. Plus it’s free and open source.

    The general population might feel more comfortable with Excel but that’s really just because it’s the only thing they’ve ever used. Change isn’t bad.