Posts tagged journals

Twenty rules for good graphics

One of the things I repeatedly include in referee reports, and in my responses to authors who have submitted papers to the International Journal of Forecasting, is comments designed to improve the quality of the graphics. Recently someone asked on stats.stackexchange.com about best practices for producing plots. So I thought it might be helpful to collate some of the answers given there and add a few comments of my own taken from things I’ve written for authors.

The fol­low­ing “rules” are in no par­tic­u­lar order.

  1. Use vec­tor graph­ics such as eps or pdf. These scale prop­erly and do not look fuzzy when enlarged. Do not use jpeg, bmp or png files as these will look fuzzy when enlarged, or if saved at very high res­o­lu­tions will be enor­mous files. Jpegs in par­tic­u­lar are designed for pho­tographs not sta­tis­ti­cal graphics.
  2. Use read­able fonts. For graph­ics I pre­fer sans-serif fonts such as Hel­vetica or Arial. Make sure the font size is read­able after the fig­ure is scaled to what­ever size it will be printed.
  3. Avoid clut­tered leg­ends. Where pos­si­ble, add labels directly to the ele­ments of the plot rather than use a leg­end at all. If this won’t work, then keep the leg­end from obscur­ing the plot­ted data, and make it small and neat.
  4. If you must use a leg­end, move it inside the plot, in a blank area.
  5. No dark shaded backgrounds. Excel always adds a nasty dark gray background by default, and I’m always asking authors to remove it. Graphics print much better with a white background. The ggplot package for R also uses a gray background (although it is lighter than the Excel default). I don’t mind the ggplot version so much as it is used effectively with white grid lines. Nevertheless, even the light gray background doesn’t lend itself to printing or photocopying. White is better.
  6. Avoid dark, dom­i­nat­ing grid lines (such as those pro­duced in Excel by default). Grid lines can be use­ful, but they should be in the back­ground (light gray on white or white on light gray).
  7. Keep the axis lim­its sen­si­ble. You don’t have to include a zero (even if Excel wants you to). The defaults in R work well. The basic idea is to avoid lots of white space around the plot­ted data.
  8. Make sure the axes are scaled prop­erly. Another Excel prob­lem is that the hor­i­zon­tal axis is some­times treated cat­e­gor­i­cally instead of numer­i­cally. If you are plot­ting a con­tin­u­ous numer­i­cal vari­able, then the hor­i­zon­tal axis should be prop­erly scaled for the numer­i­cal variable.
  9. Do not for­get to spec­ify units.
  10. Tick inter­vals should be at nice round numbers.
  11. Axes should be prop­erly labelled.
  12. Use linewidths big enough to read. 1pt lines tend to dis­ap­pear if plots are shrunk.
  13. Avoid over­lap­ping text on plot­ting char­ac­ters or lines.
  14. Fol­low Tufte’s prin­ci­ples by remov­ing chart junk and keep­ing a high data-ink ratio.
  15. Plots should be self-explanatory, so include detailed captions.
  16. Use a sen­si­ble aspect ratio. I think width:height of about 1.6 works well for most plots.
  17. Pre­pare graph­ics in the final aspect ratio to be used in the pub­li­ca­tion. Dis­torted fonts look awful.
  18. Use points not lines if ele­ment order is not relevant.
  19. When prepar­ing plots that are meant to be com­pared, use the same scale for all of them. Even bet­ter, com­bine plots into a sin­gle graph if they are related.
  20. Avoid pie-charts. Espe­cially 3d pie-charts. Espe­cially 3d pie-charts with explod­ing wedges. I promise all my stu­dents an instant fail if I ever see any­thing so appalling.
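Two of the rules above (tick intervals at nice round numbers, and a width:height ratio of about 1.6) are simple arithmetic, so here is a rough sketch of how they could be computed. It is only an illustration in plain Python: the function names, the target of about five ticks and the 1, 2, 5 family of "round" steps are my own assumptions, not the algorithm used by R or any other plotting package.

```python
import math


def nice_step(lo, hi, target_ticks=5):
    """Return a tick interval equal to 1, 2 or 5 times a power of ten.

    The 1/2/5 choice and the default of about five ticks are assumptions
    made for this sketch.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    raw = (hi - lo) / target_ticks               # ideal but usually ugly step
    magnitude = 10 ** math.floor(math.log10(raw))
    for multiple in (1, 2, 5, 10):
        step = multiple * magnitude
        if step >= raw:                          # first round step that is big enough
            return step
    return 10 * magnitude                        # safety net for rounding error


def nice_ticks(lo, hi, target_ticks=5):
    """Return the round tick positions that fall inside [lo, hi]."""
    step = nice_step(lo, hi, target_ticks)
    first = math.ceil(lo / step) * step          # first multiple of step at or above lo
    count = int(math.floor((hi - first) / step + 1e-9)) + 1
    return [round(first + i * step, 10) for i in range(count)]


def figure_height(width, ratio=1.6):
    """Height that gives a width:height ratio of `ratio` (rule 16)."""
    return width / ratio


if __name__ == "__main__":
    print(nice_ticks(0, 100))      # data running from 0 to 100
    print(nice_ticks(3.2, 17.9))   # data with awkward limits
    print(figure_height(8))        # height for a figure 8 units wide
```

The tick positions stay inside the data range, which is in keeping with rule 7: there is no need to stretch the axis to zero just to get round numbers.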

The clas­sic books on graph­ics are:

These are both highly rec­om­mended. (If you can’t see the books above, turn off your ad-blocker.)


The falling standard of English in research

It seems that most journals no longer do any serious copy-editing, and the standard of English is falling. Today I was reading an article from the European Journal of Operational Research, which is supposedly a good OR journal (current impact factor over 2). Take this for an example from the first page of this paper:

If the learned pat­terns are unsta­ble, the learn­ing tools would pro­duce incon­sis­tent con­cepts. To over­come this dif­fi­cult sit­u­a­tion, we employed arti­fi­cial neural net­works (ANNs, NNs) for help­ing the learn­ing task. NNs have attracted a lot of atten­tion form aca­d­e­mic researchers and indus­trial prac­ti­tion­ers because of the pow­er­ful flex­i­ble non­lin­ear mod­el­ing capa­bil­ity ([Balestrassi et al., 2009], [Bellini and Figa-Talamanca, 2005] and [Qi and Zhang, 2001]). It is the main rea­son for their pop­u­lar­ity that the data dri­ven tools have less restric­tion when applying. Learning tools with the sta­ble train­ing base usu­ally have reli­able performances.

The paper con­tin­ues in this vein for ten pages, cul­mi­nat­ing in an equally remark­able conclusion:

With the sam­ple size grow­ing, the shadow set con­tains a large num­ber of func­tional, vir­tual data, instead of whole real data. It would pos­sess less pop­u­la­tion rep­re­sen­ta­tion then. Before estab­lish­ing the the­o­ret­i­cal basis, we used the trial-and-error way for the expe­di­ent expla­na­tion and con­cluded that the vir­tual data size should be 10 at most in this case.

How did that get past the associate editor, editor, copy-editor and typesetter? Did everyone really think it was ok, or did the paper get published without any of them actually reading it properly? It sounds like something out of an automatic translation program such as Google Translate, although I suspect that Google Translate may do rather better.

The des­per­ate rush to pub­lish as much and as often as pos­si­ble has led to a del­uge of badly expressed sen­tences, cob­bled together to look like an arti­cle, but often express­ing lit­tle of value.

Of course, one of the rea­sons for the rise of barely read­able Eng­lish is the increas­ing num­ber of papers writ­ten by researchers whose first lan­guage is not Eng­lish. I feel for them—I couldn’t write a sin­gle sen­tence in any other lan­guage. How­ever, there are ser­vices avail­able to help. In fact, in the sub­mis­sion guide­lines for Else­vier jour­nals, authors are advised to visit www.elsevier.com/wps/find/authorshome.authors/languagepolishing if they need assistance.

For authors who can’t afford to use such ser­vices, and even for authors whose Eng­lish is pass­able, jour­nals need copy-editors. Unfor­tu­nately, it seems the prob­lem is often the poor qual­ity of the work done by copy-editors employed by the jour­nal publishers.

One of the first things I did when I took over as Editor-in-Chief at the Inter­na­tional Jour­nal of Fore­cast­ing was replace the copy-editing team employed by Else­vier (the same group respon­si­ble for the above para­graphs) and install my own copy-editor who can at least rec­og­nize bad Eng­lish when she sees it. Fur­ther­more, I con­vinced Else­vier that they should pay for her. As a result, I think our pub­lished papers are now of a much higher qual­ity than they were a few years ago. Hope­fully that means they are read more, cited more and have greater impact. I wish other jour­nals would do the same.


Should you make your working papers public?

There seem to be two points of view on this, with different practices in different disciplines.

  1. Some researchers do not make their work pub­lic until after it has been accepted for pub­li­ca­tion in a jour­nal. Until that time, drafts of papers are only cir­cu­lated to close con­fi­dants and usu­ally marked “Do not distribute”.
  2. Work­ing papers are pub­lished on web sites and in web repos­i­to­ries (such as arXiv or RePEc) as soon as they are fin­ished, at about the same time they are sub­mit­ted to a journal.

Because I work with peo­ple in lots of dif­fer­ent fields, I come across both of these prac­tices. In the first sit­u­a­tion, I don’t post the work­ing paper on my web­site until all coau­thors agree, which is not until the paper is accepted at a jour­nal. In the sec­ond sit­u­a­tion, I post the work­ing paper on my web­site (and usu­ally also on RePEc) as soon as possible.

I don’t like the secrecy model at all, but it is hard to convince coauthors who have been trained under that process to change. Different justifications are given for keeping things secret, depending on who I ask. Here are some of them, each followed by my thoughts on why the stated reason makes little sense.

  1. It pre­vents rival research groups know­ing what you are up to, and so allows you to stay one step ahead of every­one else. Of course, if every­one does this, then it is just as likely that your rival researchers are ahead already in ways you don’t know about. The result is that there is slower progress because there is not a free flow of infor­ma­tion between research groups. Also, since you don’t know what every­one else is doing, you are more likely to miss some­thing impor­tant that some­one else is work­ing on and waste a lot of time in the process. The most effi­cient pro­ce­dure is for infor­ma­tion to be shared as quickly and com­pletely as pos­si­ble. Yes, that helps your rivals, but it also helps you, and it helps progress in research.
  2. It pre­vents other researchers steal­ing your ideas before they are pub­lished. Pre­sum­ably the fear is that the work­ing paper will be leaked and some­one will copy the ideas and pub­lish it under their own name. There is a sim­ple solu­tion to this: pub­lish the work­ing paper under your own name with a date on it, prefer­ably in a pub­lic repos­i­tory. Then there is no motive for steal­ing the idea because it will eas­ily be shown that you did it first. Keep­ing work­ing papers secret makes it more likely that some­one will steal your ideas, not less likely.
  3. The work­ing paper may change sub­stan­tially before pub­li­ca­tion. That is true, but so what? Every­one knows that a work­ing paper is sub­ject to revi­sion before pub­li­ca­tion. It should be seen as an advance draft to sig­nal to every­one what you have done, and to enable them to start cit­ing it. There is the prob­lem of embar­rass­ing mis­takes being made pub­lic. Wait­ing until a jour­nal accepts the paper reduces the like­li­hood of embar­rass­ing mis­takes, but it doesn’t remove it entirely. Every­one who has pub­lished more than a hand­ful of papers will have writ­ten papers that con­tain errors, even with the ref­er­ee­ing process. If you are wor­ried about never mak­ing a pub­lic mis­take, you prob­a­bly shouldn’t be involved in research.
  4. Hav­ing a pub­lished work­ing paper may be against the jour­nal rules. I don’t know of any jour­nal that won’t pub­lish a paper if it has appeared in work­ing paper form. Most jour­nals not only explic­itly allow it, but also allow the work­ing paper to con­tinue to appear online even after the paper has appeared in a journal.
  5. The ref­er­ees will know who wrote it. This is true. A ref­eree can use Google to dis­cover the authors of a pub­lished work­ing paper. But does that really mat­ter? The blind ref­er­ee­ing model is based on the assump­tion that ref­er­ees will give bet­ter assess­ments if they don’t know who the authors are. I’m not sure that is true, and I haven’t seen any empir­i­cal evi­dence to sup­port it. Any­way, I don’t care if the ref­er­ees know that I am the author of the papers they are reviewing.

On the other hand, there are good rea­sons to have your work­ing papers dis­trib­uted widely and early.

  1. It increases your citations. The more widely the paper is distributed the more likely people are to cite it. Further, public repositories such as arXiv and RePEc are free, so a lot more people have access to the papers stored there than to the papers published in journals that require expensive subscriptions. If the paper is only being published (and made public) a couple of years after the ideas have been developed, it is likely that research has moved on and your paper is not so relevant and therefore not so citable.
  2. It pre­vents other researchers steal­ing your ideas because the ideas are dated and doc­u­mented ear­lier, as explained above.
  3. It allows feed­back from a wider range of peo­ple. I get email from a lot of peo­ple who read my work­ing papers, and some of them have some use­ful com­ments that can lead to improve­ments in the paper. It would be too late if these com­ments were received after it was published.

Part of the rea­son for this post is to con­vince my coau­thors that the secrecy prac­tice is a bad idea, even if every­one does it in your field. The only way to change the sit­u­a­tion is to start pub­lish­ing work­ing papers, and try­ing to con­vince every­one else to do the same. I hope this post will help that happen.

Feel free to com­ment if you agree or dis­agree. I’m espe­cially inter­ested in any other rea­sons peo­ple have for and against pub­lish­ing work­ing papers.


Google scholar alerts

A cou­ple of weeks ago, Google scholar added a facil­ity to pro­vide email alerts on new arti­cles asso­ci­ated with spe­cific search queries. First do the search, then click the enve­lope at top left of screen. For exam­ple, here is a search on “expo­nen­tial smooth­ing” since 2000.

Note the enve­lope at the top marked New! Click it to get the fol­low­ing screen.

Those results show some of the flaws in Google Scholar — the dates are not always cor­rect (the first paper listed above appeared in 2004) and there are unre­solved duplicates.

Despite the problems, if you want to keep an eye out for new papers on particular topics, this looks like it could be useful. Unfortunately, there is no RSS feed available.


Saving web pages for later reading

Often I’ll come across a web­page that I want to read, but it is going to take more time than I have. For exam­ple, it might be an online research paper or a news­pa­per arti­cle, or it could be a lengthy blog arti­cle that you would like to read but you don’t want to sub­scribe to the whole blog. Of course, you could just book­mark the page, but book­mark col­lec­tions tend to grow wild and you might for­get to come back to it.

There is a very neat solu­tion to this prob­lem if you already use Google Reader (or some other feed reader) for online read­ing. Here are the required steps for set­ting it up.

  1. Set up an account on Instapa­per. This is a tool for sav­ing web pages for later reading.
  2. Go to http://www.instapaper.com/u and add the RSS feed (link at bot­tom right) to your Google Reader account.
  3. Drag the "Read Later" bookmark to your Bookmarks Bar.

Now every time you want to save something for later reading, just click the "Read Later" bookmark. A "Saved!" message will briefly appear in the corner of the page. The page will be saved to your Instapaper account, and so will automatically appear on Google Reader. Assuming you are in the habit of checking Google Reader every day or two, there's nothing else to remember.

You could read the page within your Instapaper account, but then you would have to remember to look at one more website, and I find it much simpler if everything I want to read turns up in Google Reader automatically. Apart from setting up the Instapaper account, you should never need to go back to the Instapaper website again.

It sometimes takes an hour or so for a page to turn up in your Google Reader account due to the frequency of refreshing the feed at Instapaper. But since you are saving it to read later, that is hardly a problem.

There are other services similar to Instapaper, including "Read it later". Presumably you could do something similar with these other services, but I haven’t tried them.


Using Google Reader

Google Reader is a fan­tas­tic way to keep track of new papers that are appear­ing in many dif­fer­ent jour­nals, and also to fol­low some of the inter­est­ing research blogs (and blogs on other top­ics) that are out there. Google Reader checks web­sites for you and lets you know of any new mate­r­ial that appears. Instead of you hav­ing to look at dozens of dif­fer­ent web­sites to dis­cover new infor­ma­tion, all you need to do is open up Google Reader and all the infor­ma­tion comes to you. In some ways it is like an email account, but where the mes­sages con­tain new addi­tions to web­sites that you are inter­ested in.

Google Reader is called an “RSS reader” because it reads RSS feeds. RSS stands for “Really Sim­ple Syn­di­ca­tion”. A web­site with an RSS feed makes it pos­si­ble to track addi­tions to the site with­out actu­ally vis­it­ing it your­self.  There are other RSS read­ers, but Google Reader is the most widely used. Recently Google Reader added a facil­ity so that it now also tracks sites that don’t have RSS feeds.
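To make the idea of "tracking additions without visiting the site" a little more concrete, here is a minimal sketch of what a feed reader does with an RSS feed. It assumes a simple RSS 2.0 document; the sample feed, the example.com addresses and the function names are all made up for illustration, and a real reader such as Google Reader certainly does a great deal more than this.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed, standing in for what a journal website might publish.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example journal</title>
    <link>http://example.com/</link>
    <item>
      <title>First paper</title>
      <link>http://example.com/papers/1</link>
    </item>
    <item>
      <title>Second paper</title>
      <link>http://example.com/papers/2</link>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml):
    """Return (title, link) pairs for every item in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


def new_items(feed_xml, seen_links):
    """Return only the items whose link has not been seen before."""
    return [(title, link) for title, link in list_items(feed_xml)
            if link not in seen_links]


if __name__ == "__main__":
    already_read = {"http://example.com/papers/1"}
    for title, link in new_items(SAMPLE_FEED, already_read):
        print(title, link)
```

The reader only has to remember which links it has already shown you; everything else comes from the feed each time it is checked.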

If you haven’t used it before, here’s how to get started.

  1. Go to www.google.com/reader and log in. If you already have a Google account (e.g., you’re a Gmail user), then just use your usual Google details. If you don’t have a Google account, then you will need to set one up.
  2. Click “Add subscription” and type the URL of any website you want to track.
  3. When you are reading a website that you would like to subscribe to, click the orange RSS button.
    A modern browser such as Firefox or Chrome will figure out that you want to subscribe to the RSS feed. If that doesn’t work, just copy the link address and paste it into the “Add subscription” box in Google Reader.

Each morn­ing I read through any­thing new on Google Reader includ­ing new research papers in jour­nals that I track, new arti­cles on some sta­tis­tics blogs that I fol­low, etc. In fact, I have over 500 sub­scrip­tions! I don’t read every arti­cle or it would take all day, but I do scan the head­lines and read what looks interesting.

It can take a while to col­lect all the sub­scrip­tions for jour­nals you might want to read. To make it easy, you can just piggy-back on my jour­nal col­lec­tion (which cov­ers all sta­tis­tics jour­nals, both fore­cast­ing jour­nals, plus a few econo­met­rics and demog­ra­phy jour­nals, as well as all sta­tis­ti­cal preprints on arxiv). Click here if you want to sub­scribe to all the same jour­nals as me.

If you are inter­ested in R, R-bloggers is very use­ful as it com­bines the posts from a large num­ber of blogs about R.  Just go to the site and click on the RSS feed icon and you will be able to add a sub­scrip­tion to your Google Reader account.

For those who like to keep up with LaTeX, the TeX com­mu­nity aggre­ga­tor does some­thing sim­i­lar for blog­gers writ­ing about LaTeX and related top­ics. Again, just click on the RSS feed icon.

Here is a list of sta­tis­tics research blogs. Check them out and sub­scribe to any­thing that takes your fancy.

This web­site has an RSS feed, as do my other web­sites. Just click the orange but­ton at the top-right of the page and select “Google Reader” and then you will receive any new posts I make in your Google Reader account.


Why referee?

There are sev­eral rea­sons why researchers should be will­ing to pro­vide ref­eree reports.

  1. You learn a lot. If the paper is in your area, then writing a referee report forces you to read it very carefully and engage closely with the research of other people in your field. There’s no better way of understanding what is going on in your field.
  2. You get bet­ter known by the research lead­ers in your area. It is essen­tial to your research career to develop an inter­na­tional rep­u­ta­tion for a high stan­dard of schol­ar­ship. Once known, you may get asked to sub­mit an invited paper to the jour­nal, become an asso­ciate edi­tor of the jour­nal, write a com­men­tary on another paper, etc. Oppor­tu­ni­ties will open up if you are known to be a good referee.
  3. You get to see the lat­est research before every­one else. Often, an author won’t release a work­ing paper or pre-print before the paper has gone through a round of ref­er­ee­ing (and some authors keep things to them­selves until a paper is accepted). So you get a head-start on every­one else if you ref­eree the paper. It might lead you to develop some of your own ideas and write a new paper that builds on the results.
  4. If you sub­mit papers to jour­nals your­self, and ben­e­fit from the ref­eree reports that you receive, then you should be will­ing to do the same for oth­ers. The whole sys­tem is built on researchers pro­vid­ing a mutu­ally ben­e­fi­cial ser­vice, and if you want to par­tic­i­pate in the sys­tem then you should be will­ing to con­tribute to it.

On the other hand, you do some­times need to be selec­tive. Politely decline if you think the paper is not close enough to your own inter­ests to be worth spend­ing time on, or if the jour­nal is not one you are likely to ever pub­lish in, or if you don’t feel capa­ble of under­stand­ing the paper well enough, or if you have already writ­ten three reports in the past three weeks. If you do say no, it is very help­ful if you can rec­om­mend some­one who would be suitable.


Writing a referee report

As an edi­tor, I like to see ref­eree reports com­pris­ing three sections:

  1. A gen­eral sum­mary of the paper and the con­tri­bu­tion it makes. You need to high­light here what is new and inter­est­ing about the paper, as well as give a sum­mary in a few sentences.
  2. The major prob­lems that need address­ing.  This is prob­a­bly the most impor­tant sec­tion of your report where you explain the main prob­lems. The edi­tor will read this very care­fully when decid­ing whether to accept, reject or invite a revi­sion, so you need to make sure that any prob­lems are clearly explained here. If you think the paper should be rejected, then you have to make a good case in this sec­ond sec­tion. On the other hand, if you think it is a great paper that deserves pub­li­ca­tion, please explain what is so good about it.
  3. Minor things such as typos or points of clar­i­fi­ca­tion. These are often less impor­tant issues, but need cor­rect­ing before publication.

Some ref­eree reports com­bine sec­tions 2 and 3 and that makes it much harder to fig­ure out what is impor­tant and what are minor com­ments. If the paper is def­i­nitely not worth pub­lish­ing, and you have explained some very seri­ous flaws in sec­tion 2, then it is accept­able not to doc­u­ment the more minor issues. In this case, you should explain to the edi­tor that you have cho­sen not to com­ment on more minor issues as you didn’t think it worthwhile.

Don’t include a rec­om­men­da­tion about whether to pub­lish or not in the report, but add it in your cov­er­ing note to the edi­tor. This is best as the edi­tor will make a deci­sion based on the com­ments from all the ref­er­ees and they may pro­vide con­flict­ing rec­om­men­da­tions. Also, it is awk­ward if all the ref­er­ees rec­om­mend one thing and the edi­tor decides dif­fer­ently. This doesn’t hap­pen very often, but I have some­times made a deci­sion that is con­trary to the advice of all referees.


Using DOIs

Almost all papers these days have a DOI and it is worth know­ing how to use them.

At the top or bot­tom of the first page of a paper, you will see some­thing like this:

doi:10.1016/j.csda.2006.07.028

This is a unique and per­ma­nent iden­ti­fier for the paper known as a “Dig­i­tal Object Iden­ti­fier”. The part before the for­ward slash (10.1016 in the exam­ple above) iden­ti­fies the nam­ing author­ity (in this case Else­vier) and the part after the for­ward slash (j.csda.2006.07.028) iden­ti­fies the par­tic­u­lar paper. In this case, the paper iden­ti­fier shows it is in the jour­nal Com­pu­ta­tional Sta­tis­tics and Data Analy­sis and that it first appeared online in 2006. How­ever, there is no sys­tem­atic pat­tern to these iden­ti­fiers, and other jour­nals use other ways of gen­er­at­ing identifiers.

One use for these identifiers is that they provide a quick way of finding the paper online. The URL http://dx.doi.org/xxx where xxx is the DOI will lead to the paper. For example, http://dx.doi.org/10.1016/j.csda.2006.07.028 gives the above paper.

A URL gen­er­ated in this way is usu­ally much shorter than other equiv­a­lent URLs, and is guar­an­teed not to change, even when the pub­lisher reor­ga­nizes their web­site. For some jour­nals, the URL of an arti­cle may change when the arti­cle moves from being online but not yet allo­cated to an issue, to being part of a print issue. But the DOI will remain the same regard­less. That is why I usu­ally pro­vide links of this form to my online pub­li­ca­tions from my website.
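If you have a list of DOIs and want links of this form, the construction is easy to automate. The following is a small sketch: it splits a DOI at the first forward slash into the naming-authority part and the paper part, and prepends the resolver address given above. The function names are my own, and the percent-encoding of unusual characters in the paper part is an assumption I have added for safety rather than something described in this post.

```python
from urllib.parse import quote

RESOLVER = "http://dx.doi.org/"   # resolver address used in the text above


def split_doi(doi):
    """Split a DOI into (naming authority prefix, paper identifier)."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):        # drop a leading "doi:" label
        doi = doi[4:]
    prefix, slash, suffix = doi.partition("/")  # split at the first forward slash
    if not slash or not prefix or not suffix:
        raise ValueError("not a DOI of the form prefix/suffix: %r" % doi)
    return prefix, suffix


def doi_url(doi):
    """Build the resolver URL for a DOI."""
    prefix, suffix = split_doi(doi)
    # Assumption: percent-encode anything unusual in the suffix, keeping slashes.
    return RESOLVER + prefix + "/" + quote(suffix, safe="/")


if __name__ == "__main__":
    print(split_doi("doi:10.1016/j.csda.2006.07.028"))
    print(doi_url("doi:10.1016/j.csda.2006.07.028"))
```

For the example DOI above, this reproduces the link http://dx.doi.org/10.1016/j.csda.2006.07.028.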


Statistics education journals

In many research uni­ver­si­ties, there can be a ten­sion that arises when great teach­ers don’t pub­lish much. I believe there is a place for excel­lent teach­ers who do lim­ited research within a strong research uni­ver­sity, but their con­tri­bu­tion is con­sid­er­ably enhanced if they share their teach­ing insights. There are at least three rep­utable research jour­nals for pub­lish­ing arti­cles on sta­tis­tics education:

In addi­tion, there is the less research-oriented (but cer­tainly not less use­ful) Teach­ing Sta­tis­tics.
