tidygapminder.Rmd
This package aims to make it really easy to tidy data retrieved from Gapminder. To begin, load the package:
library(tidygapminder)
Once the package is loaded, you are in possession of two superpowers (functions): tidy_indice and tidy_bunch.
The tidy_indice function tidies a single data sheet downloaded from Gapminder. The data sheet can be in either csv or xlsx format, as offered on the Gapminder site. tidy_indice takes the path to the file as its argument and returns the data as a tidy data frame.
filepath <- system.file("extdata", "life_expectancy_years.csv", package = "tidygapminder")
tidy_indice(filepath)
#> # A tibble: 40,953 x 3
#> country year life_expectancy_years
#> <chr> <dbl> <dbl>
#> 1 Afghanistan 1800 28.2
#> 2 Afghanistan 1801 28.2
#> 3 Afghanistan 1802 28.2
#> 4 Afghanistan 1803 28.2
#> 5 Afghanistan 1804 28.2
#> 6 Afghanistan 1805 28.2
#> 7 Afghanistan 1806 28.1
#> 8 Afghanistan 1807 28.1
#> 9 Afghanistan 1808 28.1
#> 10 Afghanistan 1809 28.1
#> # … with 40,943 more rows
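To make "tidy" concrete, here is a minimal base R sketch of the kind of wide-to-long reshaping that the output above implies. It is not the package's implementation: the toy data frame, its values, and the assumption that the downloaded sheet has one row per country and one column per year are all illustrative.

```r
# Toy "wide" sheet (assumed layout): one row per country, one column per year.
# The values are made up for illustration.
wide <- data.frame(
  country = c("Afghanistan", "Albania"),
  `1800` = c(28.2, 35.4),
  `1801` = c(28.2, 35.4),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Every column except `country` is treated as a year.
year_cols <- setdiff(names(wide), "country")

# Stack the year columns into a single value column,
# repeating country and year so each row is one observation.
long <- data.frame(
  country = rep(wide$country, times = length(year_cols)),
  year = rep(as.numeric(year_cols), each = nrow(wide)),
  life_expectancy_years = unlist(wide[year_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Sort by country, then year, to match the layout shown above.
long <- long[order(long$country, long$year), ]
rownames(long) <- NULL

long
```

The result has one row per country and year, with three columns (country, year, and the indicator), which is the same shape as the tibble returned by tidy_indice.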
tidy_bunch uses tidy_indice to tidy a whole directory of data sheets. It can also merge all the resulting data frames into one big data frame when merge is set to TRUE:
dir_path <- system.file("extdata", "gapminder", package = "tidygapminder")
tidy_bunch(dir_path, merge = TRUE)
#> We take in only csv, xls or xlsx files
#> # A tibble: 55,462 x 4
#> country year aid_received_per_person_… income_per_person_gdppercapita_pp…
#> <chr> <dbl> <dbl> <dbl>
#> 1 Afghanist… 1960 1.91 NA
#> 2 Afghanist… 1961 3.78 NA
#> 3 Afghanist… 1962 1.81 NA
#> 4 Afghanist… 1963 3.85 NA
#> 5 Afghanist… 1964 4.74 NA
#> 6 Afghanist… 1965 5.43 NA
#> 7 Afghanist… 1966 4.96 NA
#> 8 Afghanist… 1967 3.91 NA
#> 9 Afghanist… 1968 2.76 NA
#> 10 Afghanist… 1969 2.51 NA
#> # … with 55,452 more rows
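The NA values in the merged output suggest that rows are kept even when an indicator has no value for a given country and year. A rough base R sketch of such a merge follows. It assumes a full join on country and year, which this vignette does not confirm, and the column names indicator_a and indicator_b and all values are hypothetical.

```r
# Two toy tidy data frames, as tidy_indice might return them.
# Names and values are hypothetical.
indicator_a <- data.frame(
  country = c("Afghanistan", "Afghanistan"),
  year = c(1960, 1961),
  indicator_a = c(1.5, 2.5),
  stringsAsFactors = FALSE
)
indicator_b <- data.frame(
  country = c("Afghanistan", "Afghanistan"),
  year = c(1961, 1962),
  indicator_b = c(100, 110),
  stringsAsFactors = FALSE
)

# Full join (all = TRUE) on country and year, folded over the list,
# so the same code works for any number of data frames.
merged <- Reduce(
  function(x, y) merge(x, y, by = c("country", "year"), all = TRUE),
  list(indicator_a, indicator_b)
)

merged
```

With a full join, a year that appears in only one sheet still produces a row, and the missing indicator is filled with NA.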
Enjoy!!!