Using the inference function

Dependencies for inference() function

Before you can run the inference() function, check that the ggplot2, gridExtra, and broom packages are installed. If you installed the tidyverse package, then you should already have ggplot2 and broom, but you may not have gridExtra. To see which of these packages are available, copy and paste the following code snippet into the RStudio Console:

# Names of all installed packages (named character vector)
pkgs <- installed.packages()[, "Package"]
# Keep only the packages relevant to inference()
pkgs[names(pkgs) %in% c("tidyverse", "ggplot2", "gridExtra", "broom")]

If you have the packages, your output should look something like this:

      broom     ggplot2   gridExtra   tidyverse
    "broom"   "ggplot2" "gridExtra" "tidyverse"

It’s okay if tidyverse isn’t there but broom and ggplot2 are. If you are missing any of the required packages, then you need to install them. For example, to install gridExtra, you would run:

install.packages("gridExtra")
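If more than one package might be missing, the check and the install can be combined. The following is a minimal sketch, not part of the course materials: the helper name missing_packages() is hypothetical, and the interactive() guard is an assumption added so that nothing gets installed while an R Markdown file is being knitted.

```r
# Hypothetical helper: return the required packages that are not installed.
# rownames(installed.packages()) holds the names of all installed packages.
missing_packages <- function(required,
                             installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

# The three packages that inference() depends on, per the text above.
to_install <- missing_packages(c("ggplot2", "gridExtra", "broom"))

if (length(to_install) == 0) {
  message("All required packages are already installed.")
} else if (interactive()) {
  # Only install from an interactive session such as the RStudio Console.
  install.packages(to_install)
} else {
  message("Missing packages: ", paste(to_install, collapse = ", "))
}
```

Running this in the Console installs only what is missing, so it is safe to run more than once.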

Once you have the necessary packages, move on to the next section.

Loading the inference() function

The inference() function is stored in an RData file, inference.RData, on the course website.

You can attempt to load it directly into your R session by running the following in your Console or R Markdown file:

load(url("http://fall17.cds101.com/R/inference.RData"))

If that does not work, then you will need to download the file yourself from the URL shown above. After downloading it, use your operating system’s file manager to find the downloaded file and move it to your working directory, such as your current GitHub repository folder. To double-check that you moved it to the correct place, go back to RStudio, open your Console, and run:

list.files()

If you don’t see "inference.RData" as one of the listed files, then run getwd() in the Console to check your current working directory. If this doesn’t match the folder where you copied inference.RData, then you need to move the file to the directory printed by getwd().

Once you have inference.RData in the correct place, run the following to make inference() available in R:

load("inference.RData")
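The two loading routes described above (directly from the URL, then from a local copy) can be wrapped into one step. This is only a sketch under stated assumptions: load_inference() is a hypothetical helper name, not something provided by the course, and it assumes the RData file contains an object named inference.

```r
# Hypothetical helper: try the course URL first, then fall back to a local
# copy of inference.RData in the current working directory.
load_inference <- function(remote = "http://fall17.cds101.com/R/inference.RData",
                           local = "inference.RData",
                           envir = globalenv()) {
  # Attempt the remote load; any error (no network, bad URL) yields FALSE.
  loaded <- tryCatch({
    load(url(remote), envir = envir)
    TRUE
  }, error = function(e) FALSE)

  if (!loaded) {
    if (!file.exists(local)) {
      stop("Could not load from the URL, and '", local,
           "' was not found in ", getwd())
    }
    load(local, envir = envir)
  }

  # Report whether an object named inference is now available.
  invisible(exists("inference", envir = envir, inherits = FALSE))
}
```

Calling load_inference() with no arguments mirrors the manual steps: it stops with a message naming the working directory if neither source is available, which is the same situation that list.files() and getwd() help diagnose.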

inference() documentation

Description

Run hypothesis tests on and compute confidence intervals for a response variable using either CLT-based or simulation-based methods.
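To make the distinction between CLT-based and simulation-based methods concrete, here is a base R sketch that computes a 95% confidence interval for a mean both ways. It does not use inference() and is not a description of its internals; the data set (mtcars), the number of resamples, and the exact form of the standard error interval are assumptions chosen for illustration.

```r
set.seed(101)        # make the resampling reproducible
x <- mtcars$mpg      # example numerical response variable

# CLT-based (theoretical) interval for the mean, using the t distribution.
clt_ci <- t.test(x, conf.level = 0.95)$conf.int

# Simulation-based: resample the data with replacement many times and
# recompute the mean each time to build a bootstrap distribution.
boot_means <- replicate(5000, mean(sample(x, replace = TRUE)))

# Percentile method: take the middle 95% of the bootstrap distribution.
perc_ci <- quantile(boot_means, c(0.025, 0.975))

# Standard error method (one common form): point estimate plus or minus
# a critical value times the bootstrap standard error.
se_ci <- mean(x) + c(-1, 1) * qt(0.975, df = length(x) - 1) * sd(boot_means)

print(clt_ci)
print(perc_ci)
print(se_ci)
```

For a roughly symmetric sample like this one, the three intervals should come out close to one another; they tend to differ more when the data are skewed or the sample is small.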

Arguments

Input            Default                            Description
y                                                   response variable, can be numerical or categorical
x                NULL                               explanatory variable, categorical (optional)
data                                                name of data.frame that y and x are in
type             c("ci", "ht")                      ci (confidence interval) or ht (hypothesis test)
statistic        c("mean", "median", "proportion")  parameter to estimate: mean, median, or proportion
success          NULL                               which level of the categorical variable to call “success”, i.e. do inference on
order            NULL                               when x is given, order of levels of x in which to subtract parameters
method           c("theoretical", "simulation")     CLT based (theoretical) or simulation based (randomization/bootstrap)
null             NULL                               null value for a hypothesis test
alternative      c("less", "greater", "twosided")   direction of the alternative hypothesis
sig_level        0.05                               significance level, value between 0 and 1 (used only for ANOVA to determine if posttests are necessary)
conf_level       0.95                               confidence level, value between 0 and 1
boot_method      c("perc", "se")                    bootstrap method; percentile (perc) or standard error (se)
nsim             15000                              number of simulations
seed             NULL                               seed to be set, default is NULL
verbose          TRUE                               whether output should be verbose or not
show_var_types   verbose                            print variable types, set to verbose by default
show_summ_stats  verbose                            print summary stats, set to verbose by default
show_eda_plot    verbose                            print EDA plot, set to verbose by default
show_inf_plot    verbose                            print inference plot, set to verbose by default
show_res         verbose                            print results, set to verbose by default
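As an illustration of how these arguments fit together, the sketch below sets up one confidence interval call and one hypothesis test call. It rests on assumptions that are not stated in the table: that y and x are given as unquoted column names of data, and that order takes a character vector of the levels of x. The example data (mtcars with a derived transmission factor) is also only for illustration.

```r
# Example data: built-in mtcars, with transmission type as a categorical
# explanatory variable (am is coded 0 = automatic, 1 = manual).
cars <- mtcars
cars$transmission <- factor(cars$am, levels = c(0, 1),
                            labels = c("automatic", "manual"))

# Run the calls only if inference() has already been loaded.
if (exists("inference", mode = "function")) {
  # 95% confidence interval for the difference in mean mpg
  # (manual minus automatic), using the CLT-based method.
  inference(y = mpg, x = transmission, data = cars,
            type = "ci", statistic = "mean", method = "theoretical",
            order = c("manual", "automatic"), conf_level = 0.95)

  # Two-sided hypothesis test of no difference in mean mpg,
  # using the simulation-based method with a fixed seed.
  inference(y = mpg, x = transmission, data = cars,
            type = "ht", statistic = "mean", method = "simulation",
            order = c("manual", "automatic"), null = 0,
            alternative = "twosided", nsim = 15000, seed = 101)
}
```

Setting seed makes the simulation-based result reproducible between runs, and setting verbose = FALSE would suppress the output whose display defaults follow verbose.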