Seasonal mortality rates


24 May 2020

time series

The weekly mortality data recently published by the Human Mortality Database can be used to explore seasonality in mortality rates. Mortality rates are known to be seasonal due to temperatures and other weather-related effects (Healy 2003).


Download the data

We will first grab the latest data, using similar code to what I used in my recent post on “excess deaths”. However, this time we will keep the mortality rates rather than the numbers of deaths.

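library(tidyverse) # dplyr, tidyr and ggplot2
library(tsibble)   # yearweek(), as_tsibble(), fill_gaps()
library(feasts)    # STL(); also attaches fabletools for model() and components()

# The URL of the STMF csv file is blank in this copy of the post
stmf <- readr::read_csv("", skip=1)
mrates <- stmf %>%
  janitor::clean_names() %>%
  select(country_code:sex, r0_14:r_total) %>%
  # One row per country, week, sex and age group
  pivot_longer(r0_14:r_total,
    names_to = "age", values_to = "mxt",
    names_pattern = "[r_]*([a-z0-9_p]*)"
  ) %>%
  # Keep the total mortality rate for both sexes combined
  filter(age == "total", sex == "b") %>%
  mutate(
    country = recode(country_code,
      AUT = "Austria",
      BEL = "Belgium",
      DEUTNP = "Germany",
      DNK = "Denmark",
      ESP = "Spain",
      FIN = "Finland",
      GBRTENW = "England & Wales",
      ISL = "Iceland",
      NLD = "Netherlands",
      NOR = "Norway",
      PRT = "Portugal",
      SWE = "Sweden",
      USA = "United States"
    )
  ) %>%
  select(year, week, country, mxt)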
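
To see what the `names_pattern` regular expression keeps, here is a small base R sketch. Only `r0_14` and `r_total` appear in the code above; the other four column names are an assumption about what `clean_names()` produces for the age-specific rate columns.

```r
# Cleaned rate column names (the middle four are assumed, not taken from the post)
rate_cols <- c("r0_14", "r15_64", "r65_74", "r75_84", "r85p", "r_total")

# Same pattern as names_pattern: skip the leading "r" or "r_",
# then keep the captured age label
age <- sub("[r_]*([a-z0-9_p]*)", "\\1", rate_cols)
age
```

The `total` label is the one that the later `filter(age == "total", sex == "b")` step selects.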

First let’s plot the mortality rate against the week of the year for two countries with interesting data features.

England & Wales

mrates %>%
  filter(country == "England & Wales") %>%
  mutate(year = as.factor(year)) %>%
  ggplot(aes(x = week, y = mxt, group = year)) +
  geom_line(aes(col = year))

Here we see an annual seasonal pattern, with higher rates in winter, and also a few sudden dips in mortality rates. The latter are almost certainly due to recording discrepancies, where deaths are not recorded until the following week. Note that the dips are generally followed by a higher than usual mortality rate in the following week. Those between weeks 12 and 17 are probably due to Easter; bank holiday effects are seen in weeks 18-19, 22-23 and 35-36 (depending on the year in which the holiday falls); the Christmas effect is seen in week 52.

The second week of the year always has increased mortality. This is a reporting issue: delayed deaths from the previous week(s) are included in the statistics for the second week of the year.

Other than the obvious pandemic effect in 2020, this graph also shows increased mortality rates at the start of 2015 and 2018, and in the first half of March 2018. These are probably due to flu epidemics.

Spain


A similar plot for Spain shows a jump in mortality in weeks 31–34 (August) of 2003. This was due to an extreme heat wave (see Robine et al., 2008).

mrates %>%
  filter(country == "Spain") %>%
  mutate(year = as.factor(year)) %>%
  ggplot(aes(x = week, y = mxt, group = year)) +
  geom_line(aes(col = year))

Comparing seasonality across countries

We can compare the seasonal patterns from all countries by using an STL decomposition to estimate the seasonality. There will be some differences because some countries provide weekly data by date of registration (instead of the date of occurrence). This is why, for example, we see sudden dips in England & Wales but do not see similar dips in Germany. Information about the type of data available is in the metadata.

First we have to convert the data to a tsibble object.

mrates <- mrates %>%
  mutate(date = yearweek(paste0(year, "W", week))) %>%
  as_tsibble(index = date, key = country)

Now we estimate the seasonal components using an STL decomposition. The robust argument is used to prevent the unusual years affecting the results, and the seasonal window is set to periodic as we don’t expect the seasonal pattern to change over the last decade or two. STL decompositions are additive, but it would be more interpretable to look at the percentage increase in mortality rates across the year, so I will decompose the log rates and then compute the seasonal effect as a percentage increase (relative to the mean death rate) for each week of the year. There are a couple of missing values, so I will replace them with something small and the robust estimation should ignore them.
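
The reasoning behind decomposing the log rates can be written out explicitly. The symbols below ($m_t$ for the weekly mortality rate, and $T_t$, $S_t$, $R_t$ for the trend, seasonal and remainder components) are notation assumed here for illustration; they do not come from the STL output itself.

```latex
\log m_t = T_t + S_t + R_t
\quad\Longrightarrow\quad
m_t = e^{T_t}\, e^{S_t}\, e^{R_t},
\qquad
\text{pc\_increase}_t = 100\,\bigl(e^{S_t} - 1\bigr).
```

An additive seasonal term on the log scale therefore acts as a multiplier $e^{S_t}$ on the rate itself. For example, $S_t = 0.10$ corresponds to a rate about 10.5% above the baseline level, and $S_t = -0.10$ to a rate about 9.5% below it.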

stl_season <- mrates %>%
  fill_gaps(mxt = 0.0001) %>%
  model(STL(log(mxt) ~ season(window = "periodic"), robust = TRUE)) %>%
  components() %>%
  mutate(
    pc_increase = 100*(exp(season_year)-1),
    week = lubridate::week(date),
    year = lubridate::year(date)
  ) %>%
  filter(year == 2019) %>%
  as_tibble() %>%
  select(country, week, pc_increase)
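
As a rough check on what this step does, the same idea can be sketched with base R's `stats::stl()` on simulated data. This is not the `feasts::STL()` call used above, and the baseline rate, seasonal amplitude and noise level are all made-up values chosen only for illustration.

```r
# Synthetic weekly log mortality rates: 10 "years" of exactly 52 weeks each
set.seed(1)
n_years <- 10
week <- rep(1:52, times = n_years)
log_rate <- log(0.01) +                    # hypothetical baseline rate
  0.15 * cos(2 * pi * (week - 1) / 52) +   # winter peak around week 1
  rnorm(length(week), sd = 0.02)           # noise

# Robust STL with a periodic seasonal window, applied to the log rates
fit <- stl(ts(log_rate, frequency = 52), s.window = "periodic", robust = TRUE)
seasonal <- as.numeric(fit$time.series[, "seasonal"])

# Seasonal effect as a percentage, one value per week of the year
pc_increase <- 100 * (exp(seasonal[1:52]) - 1)
round(range(pc_increase), 1)
```

The simulated seasonal term ranges from -0.15 to 0.15 on the log scale, which corresponds to about -13.9% and +16.2%, so the estimated range should land near those values. Because the seasonal window is periodic, the estimated component repeats exactly from one year to the next, which is why taking a single year (2019 in the code above) is enough.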

The smoothed components can now be plotted.

stl_season %>%
  ggplot(aes(x = week, y = pc_increase, group = country)) +
  geom_smooth(aes(col = country), span = 0.4, se = FALSE, linewidth = .5)
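
With fewer than 1,000 points per group, `geom_smooth()` defaults to a loess fit, and `span` controls how much of the data each local fit uses. The following base R sketch shows roughly the same smoothing outside ggplot2; the seasonal curve and the noise are simulated, hypothetical values rather than the estimated components.

```r
# Hypothetical weekly seasonal effect (in percent) plus noise
set.seed(42)
week <- 1:52
pc_true <- 15 * cos(2 * pi * (week - 1) / 52)
pc_noisy <- pc_true + rnorm(52, sd = 2)

# Local regression using 40% of the points for each fit, as with span = 0.4
fit <- loess(pc_noisy ~ week, span = 0.4)
smoothed <- predict(fit)

plot(week, pc_noisy, xlab = "week", ylab = "pc_increase")
lines(week, smoothed)
```

A smaller `span` follows the weekly values more closely, while a larger one gives a smoother curve that can flatten genuine peaks.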

Most countries are very similar, probably due to them all being in the northern hemisphere and all having well-developed health services. The two that stand out as different from the rest are Portugal (in purple) and Iceland (in green). Let’s just plot these two with confidence intervals, along with Spain for comparison.

stl_season %>%
  filter(country %in% c("Iceland", "Portugal", "Spain")) %>%
  ggplot(aes(x = week, y = pc_increase, group = country)) +
  geom_smooth(aes(col = country), span = 0.4, linewidth = .5)

The Icelandic rates show less seasonality than other countries, but contain a dip in weeks 25–35 (mid-June to end of August). The Iceland mortality pattern is possibly due to the weather, with a very short summer and cold conditions for the rest of the year.

The Portuguese mortality rates are much higher in winter than other countries, and much lower in summer. This is strange, as Portugal has very similar weather to Spain but very different mortality rates. Some Twitter discussion suggested some of the increase in January could be delayed reporting, as few deaths are reported in the last week of the year. Healy (2003) suggested it was due to poor insulation in Portuguese houses, along with relatively high income poverty and inequality and relatively low public health expenditure compared to other European countries. However, it is not clear that this is still true 17 years after he wrote that article.