Gapminder data, minus the mess.
Gapminder is a goldmine of global development data — life expectancy, income, CO₂ emissions, literacy rates, and hundreds more indicators spanning centuries. The catch? Every sheet looks like this:
life expectancy years | 1800 | 1801 | 1802 | ...
----------------------|------|------|------|----
Afghanistan | 28.2 | 28.2 | 28.2 | ...
Albania | 35.4 | 35.4 | 35.4 | ...
...
Countries as rows, years as columns, the indicator name hiding in cell A1. Great for a spreadsheet. Terrible for R.
tidygapminder fixes that in one function call.
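
To make the reshaping concrete, here is a minimal base R sketch of the wide-to-long idea on a toy copy of the sheet above. It is an illustration, not the package's actual implementation. The names `wide`, `long`, `indicator`, and `years` are hypothetical, and the underscore-separated indicator column name is an assumption.

```r
# Toy stand-in for a Gapminder sheet: indicator name in the first header cell,
# one column per year. Values are copied from the excerpt above.
wide <- data.frame(
  `life expectancy years` = c("Afghanistan", "Albania"),
  `1800` = c(28.2, 35.4),
  `1801` = c(28.2, 35.4),
  `1802` = c(28.2, 35.4),
  check.names = FALSE
)

indicator <- names(wide)[1]  # the name hiding in cell A1
years <- names(wide)[-1]     # every remaining header is a year

# Stack the year columns: one row per country-year pair
long <- data.frame(
  country = rep(wide[[1]], times = length(years)),
  year = rep(as.integer(years), each = nrow(wide)),
  value = unlist(wide[years], use.names = FALSE)
)

# Assumption: name the value column after the indicator, spaces to underscores
names(long)[3] <- gsub(" ", "_", indicator)

# Sort by country, then year, and reset the row names
long <- long[order(long$country, long$year), ]
rownames(long) <- NULL
long
```

Each country-year pair ends up on its own row, which is the shape most R modelling and plotting functions expect.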
Development version
pak::pak("ebedthan/tidygapminder")
## Two functions. That's it.
### `tidy_index()`: one file at a time
Point it at a Gapminder `.csv`, `.xlsx`, or `.xls` file and get back a clean
tibble:
```{r}
library(tidygapminder)
csv_path <- system.file("extdata/life_expectancy_years.csv", package = "tidygapminder")
tidy_index(csv_path)
```

Three columns: country, year, and the indicator, ready to filter, plot, or model.
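
As a rough idea of what "ready to filter" means, the sketch below works on a hand-built data frame standing in for the output. The object names and the indicator column name `life_expectancy_years` are assumptions for illustration, so check the names of your own result before copying this.

```r
# Hand-built stand-in for a tidied indicator (values from the excerpt above).
# The indicator column name is an assumption, not confirmed package output.
life <- data.frame(
  country = rep(c("Afghanistan", "Albania"), each = 3),
  year = rep(1800:1802, times = 2),
  life_expectancy_years = rep(c(28.2, 35.4), each = 3)
)

# Filter: keep one country's series
albania <- subset(life, country == "Albania")

# Summarise: mean of the indicator per country
means <- aggregate(life_expectancy_years ~ country, data = life, FUN = mean)
means
```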

### `tidy_bunch()`: a whole folder at once

Downloaded ten indicators? No problem. Point `tidy_bunch()` at the folder:
```{r}
dir_path <- system.file("extdata", package = "tidygapminder")

# Returns a named list of tibbles, one per file
result <- tidy_bunch(dir_path)
names(result)
```

Want everything in one data frame joined by `country` and `year`?
```{r}
tidy_bunch(dir_path, combine = TRUE)
```
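
If you are wondering what "joined by `country` and `year`" looks like, here is a base R sketch of the same idea using `merge()`. It is an approximation under stated assumptions, not the package's internals: the second indicator and its values are placeholders, and using a full join (`all = TRUE`) is a guess about how unmatched rows are handled.

```r
# First indicator: values from the excerpt above
life <- data.frame(
  country = c("Afghanistan", "Albania"),
  year = c(1800L, 1800L),
  life_expectancy_years = c(28.2, 35.4)
)

# Hypothetical second indicator; values are placeholders, not real data
other <- data.frame(
  country = c("Afghanistan", "Albania"),
  year = c(1800L, 1800L),
  other_indicator = c(1, 2)
)

# A named list of data frames, similar in shape to a list of tidied files
tidy_list <- list(life_expectancy_years = life, other_indicator = other)

# Fold the list into one data frame, joining on country and year each time
combined <- Reduce(
  function(x, y) merge(x, y, by = c("country", "year"), all = TRUE),
  tidy_list
)
combined
```

The result has one row per country-year pair and one column per indicator.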

## Why tidygapminder?

- Zero friction: no arguments to learn beyond a file path
- Handles the quirks: the indicator name in cell A1, non-numeric year columns, and mixed file formats are all taken care of
- Lightweight: only two dependencies (`readxl` and `tibble`)
- Informative errors: tells you exactly what went wrong and where

## Getting help

- Read the vignette: `vignette("tidygapminder")`
- Browse the documentation: https://ebedthan.github.io/tidygapminder/
- Found a bug? Open an issue
