Data @ Reed

Useful R Packages

Packages are bundles of specialized code that you can add in to go beyond the basic R functions. When you install a package you are adding in extra coding options that can help you analyze or visualize your data more easily. Anyone can write packages and they can be general or very specific, so depending on your task, you may find someone has written a package to make it easier for you.

The terms "package" and "library" are often used interchangeably. In R, you run install.packages() the first time you need a package, then run library() to load it. install.packages() is like buying a book (you only need to do it once) and library() is like getting it off the shelf (you need to do it every time you want to use that book, which means once per R session).
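As a rough sketch of that workflow, the snippet below installs a package only if it is missing and then loads it. The choice of janitor and the CRAN mirror URL are just examples; any CRAN package name works the same way.

```r
# Install once: this only runs if the package is not already installed.
# Assumes a CRAN mirror is reachable; "janitor" is just an example package.
if (!requireNamespace("janitor", quietly = TRUE)) {
  install.packages("janitor", repos = "https://cloud.r-project.org")
}

# Load every session: makes the package's functions available by name.
library(janitor)

# You can also call a single function with pkg::fun() without loading
# the whole package.
clean <- janitor::make_clean_names(c("First Name", "Total $"))
clean
```

The pkg::fun() form is handy when you only need one function or when two loaded packages share a function name.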

 

The one package to rule them all

The first step in all of your scripts will likely be this line of code:

library(tidyverse)

tidyverse is a meta-package that loads many other packages in a single step. When you run the line above, it loads the following packages for you automatically:

  • ggplot2
  • dplyr
  • tidyr
  • readr
  • purrr
  • tibble
  • stringr
  • forcats
  • lubridate (added to the core set in tidyverse 2.0.0)
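To give a feel for how these packages work together, here is a minimal sketch that uses tibble, dplyr, and ggplot2 after a single library() call. It uses mtcars, a small dataset that ships with R; the variable names are only illustrative.

```r
library(tidyverse)

# Summarise average fuel economy by number of cylinders.
mpg_by_cyl <- mtcars %>%
  as_tibble(rownames = "model") %>%      # tibble: tidy data frame
  group_by(cyl) %>%                      # dplyr: group rows
  summarise(mean_mpg = mean(mpg), n = n())

# ggplot2: draw the summary as a bar chart.
p <- ggplot(mpg_by_cyl, aes(x = factor(cyl), y = mean_mpg)) +
  geom_col() +
  labs(x = "Cylinders", y = "Mean miles per gallon")
p
```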

Below are categories of other useful packages. If you are trying to load data from an online database (e.g., the US Census), be sure to check out the Direct Data Access packages. There may be a package that will load your data for you without the need to download it from the website.

Loading Data

Package Name | What It Does
readr | for loading .csv, .txt, and other delimited text files
readxl | for loading .xlsx and .xls Excel files
haven | for loading Stata, SAS, and SPSS files
jsonlite | for importing JSON objects and converting them to R data types
googlesheets4 | for loading data from Google Sheets
rvest | for web scraping
duckdb | for querying datasets too large to load comfortably into R's memory
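As a small illustration of two of these loaders, the sketch below reads a CSV with readr and parses a JSON string with jsonlite. The data is made up and written to a temporary file so the example runs on its own; in practice you would point read_csv() at your own file path.

```r
library(readr)
library(jsonlite)

# Write a tiny CSV to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "Ana,90", "Ben,85"), csv_path)

# read_csv() returns a tibble and guesses the column types.
scores <- read_csv(csv_path, show_col_types = FALSE)

# fromJSON() accepts a JSON string, a file path, or a URL.
people <- fromJSON('[{"name":"Ana","score":90},{"name":"Ben","score":85}]')

# Other loaders follow the same pattern of "function(path)", for example
# readxl::read_excel() for Excel files and haven::read_dta() for Stata files.
```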

Formatting Data

Package Name | What It Does
dplyr | the most commonly used tools for data manipulation
tidyr | tools for pivoting tables from wide to long format and vice versa
janitor | for cleaning up and standardizing column names
stringr | helpful functions for manipulating strings
scales | for overriding the default formatting of numbers, plot axes, legends, and more
lubridate | a must-have package for working with dates and times
data.table | fast functions for analyzing large data sets
broom | for turning model output into tidy data frames
purrr | tools for working with functions and vectors; helpful for converting lists of lists to data frames
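Here is a minimal sketch of a typical clean-up pipeline combining tidyr, dplyr, and lubridate. The table and its column names are invented for illustration.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# A made-up "wide" table: one column per survey date.
wide <- tibble(
  site = c("A", "B"),
  `2024-01-15` = c(3, 5),
  `2024-02-15` = c(4, 6)
)

long <- wide %>%
  # tidyr: one row per site/date combination.
  pivot_longer(cols = -site, names_to = "date", values_to = "count") %>%
  # lubridate: parse the text into real dates, then pull out the month.
  mutate(date = ymd(date), month_num = month(date)) %>%
  # dplyr: keep only rows with a count above 3.
  filter(count > 3)

long
```

Going from wide to long first usually makes the later steps simpler, because dplyr and ggplot2 expect one observation per row.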

Creating Nice Plots & Tables

Package Name | What It Does
ggplot2 | the go-to package for making your graphs look nice
gt | stands for "great tables" and follows through on its promise
gtsummary | works with gt to display publication-ready summaries of regressions and more
viridis | perceptually uniform, colorblind-friendly color palettes
RColorBrewer | ColorBrewer palettes for sequential, diverging, and qualitative data
ggpubr | customization for ggplot2 that helps make publication-ready plots
patchwork | works well with ggplot2 to help align multiple plots or tables in one figure or page
gridExtra | helps align multiple plots or tables in one figure or page
wesanderson | color palettes drawn from Wes Anderson movies
plotly | for making your graphs interactive; works well with the shiny package
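The sketch below shows one way a few of these fit together: two ggplot2 plots, one using a viridis color scale, placed side by side with patchwork. It uses the built-in mtcars data, and the labels are only illustrative.

```r
library(ggplot2)
library(patchwork)

# Scatter plot colored on a continuous viridis scale.
p1 <- ggplot(mtcars, aes(wt, mpg, color = hp)) +
  geom_point() +
  scale_color_viridis_c() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", color = "Horsepower")

# Box plot of fuel economy by cylinder count.
p2 <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_boxplot() +
  labs(x = "Cylinders", y = "Miles per gallon")

# patchwork: "+" places the two plots next to each other in one figure.
combined <- p1 + p2
combined
```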

Useful Stats Packages

Package Name | What It Does
stats | the core statistical functions that ship with R and load automatically
lme4 | for fitting linear and generalized linear mixed-effects models
lmerTest | statistical tests for linear mixed-effects models fit with lme4
MASS | functions and datasets from "Modern Applied Statistics with S", including robust and negative binomial regression
Hmisc | a lot of miscellaneous additional functions for statistical analysis
FactoMineR | for multivariate exploratory data analysis
outliers | many specific tests for detecting outliers
vegan | for ordination analyses and diversity stats, particularly good for ecology
car | extra tools for regression analysis
cluster | tools for performing cluster analysis
forcats | tools for working with categorical variables (factors)
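As one example from this list, here is a minimal sketch of a mixed-effects model fit with lme4. It uses sleepstudy, an example dataset included with lme4; the model formula is a common textbook choice, not a recommendation for your own data.

```r
library(lme4)

# Reaction time as a function of days of sleep deprivation,
# with a random intercept for each subject.
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Fixed-effect estimates (intercept and slope for Days).
fixed <- fixef(fit)
fixed

# Full model summary, including random-effect variances.
summary(fit)
```

If you load lmerTest before fitting, the same summary() call also reports p-values for the fixed effects.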

Direct Data Access

Package Name | Database Accessed
tidycensus | US Census
rnoaa* | National Oceanic and Atmospheric Administration
COVID19 | COVID-19 data, updated daily
wbstats | World Bank data
tidyquant | Stock market data
crimedata | Crime Open Database
eurostat | Eurostat Open Data
WDI | World Bank World Development Indicators
imf.data | International Monetary Fund
fredr | Federal Reserve Economic Data (FRED)
googleanalyticsR | Google Analytics

* The rnoaa maintainers are working on a replacement, but the package is still usable.