`inference()` function

Before you can run the `inference()` function, you need to check that you have the `ggplot2`, `gridExtra`, and `broom` packages installed. If you installed the `tidyverse` package, then you should already have `ggplot2` and `broom`, but you may not have `gridExtra` installed. To check which packages you have available, copy and paste the following code snippet into your Console in RStudio:
installed.packages()[, "Package"][
names(installed.packages()[, "Package"]) %in% c("tidyverse", "ggplot2",
"gridExtra", "broom")]
If you have the packages, your output should look something like this:
broom ggplot2 gridExtra tidyverse
"broom" "ggplot2" "gridExtra" "tidyverse"
It’s okay if `tidyverse` isn’t there but `broom` and `ggplot2` are. If you are missing any of the required packages, then you need to install them. For example, to install `gridExtra`, you would run:
install.packages("gridExtra")
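If more than one package is missing, the check and the install step can be combined. The following is a minimal sketch, not part of the course materials: `find_missing()` is a hypothetical helper name, and the snippet only prints the install command instead of running it.

```r
# Packages that inference() depends on, as listed in this section.
required <- c("ggplot2", "gridExtra", "broom")

# Hypothetical helper (not part of the course code): returns the required
# packages that are absent from the set of installed packages.
find_missing <- function(required, installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

missing_pkgs <- find_missing(required)

if (length(missing_pkgs) == 0) {
  message("All required packages are installed.")
} else {
  # Print the command rather than running it, so nothing is installed silently.
  message("Run: install.packages(c(",
          paste0('"', missing_pkgs, '"', collapse = ", "), "))")
}
```

Printing the command keeps the decision to install with you; replace the `message()` call with `install.packages(missing_pkgs)` if you would rather install right away.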
Once you have the necessary packages, move on to the next section.
`inference()` function

The `inference()` function is stored in an RData file on the course website: http://fall17.cds101.com/R/inference.RData

You can attempt to load it directly into your R session by running the following in your Console or R Markdown file:
load(url("http://fall17.cds101.com/R/inference.RData"))
If that does not work, then you will need to download the file by clicking the above link. After downloading it, use your operating system’s file manager to find the downloaded file and move it to your working directory, such as your current GitHub repository folder. To double-check that you moved it to the correct place, go back to RStudio, open your Console, and run:
list.files()
If you don’t see `"inference.RData"` as one of the listed files, then run `getwd()` in Console to check your current working directory. If this doesn’t match where you copied `inference.RData`, then you need to move it to the directory printed out by `getwd()`.

Once you have `inference.RData` in the correct place, run the following to make `inference()` available in R.
load("inference.RData")
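The existence check and the load can also be done in one step. This is a small sketch under stated assumptions: `load_if_present()` is a hypothetical helper, not something provided by the course, and it relies only on base R's `file.exists()` and `load()`.

```r
# Hypothetical helper (not part of the course code): load an .RData file only
# if it exists, and report which objects it defined.
load_if_present <- function(path, envir = globalenv()) {
  if (!file.exists(path)) {
    message("Cannot find ", path, " (working directory: ", getwd(), ")")
    return(invisible(character(0)))
  }
  # load() returns the names of the objects it restored.
  loaded <- load(path, envir = envir)
  message("Loaded: ", paste(loaded, collapse = ", "))
  invisible(loaded)
}

loaded <- load_if_present("inference.RData")

# After a successful load, a function named inference should exist.
message("inference() available: ", exists("inference", mode = "function"))
```

If the message reports that the file cannot be found, compare the working directory it prints with the folder where you saved `inference.RData`.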
`inference()` documentation

Run hypothesis tests on and compute confidence intervals for a response variable using either CLT-based or simulation-based methods.
Input | Default | Description |
---|---|---|
y | | response variable, can be numerical or categorical |
x | NULL | explanatory variable, categorical (optional) |
data | | name of data.frame that y and x are in |
type | c("ci", "ht") | ci (confidence interval) or ht (hypothesis test) |
statistic | c("mean", "median", "proportion") | parameter to estimate: mean, median, or proportion |
success | NULL | which level of the categorical variable to call "success", i.e. do inference on |
order | NULL | when x is given, order of levels of x in which to subtract parameters |
method | c("theoretical", "simulation") | CLT based (theoretical) or simulation based (randomization/bootstrap) |
null | NULL | null value for a hypothesis test |
alternative | c("less", "greater", "twosided") | direction of the alternative hypothesis |
sig_level | 0.05 | significance level, value between 0 and 1 (used only for ANOVA to determine if posttests are necessary) |
conf_level | 0.95 | confidence level, value between 0 and 1 |
boot_method | c("perc", "se") | bootstrap method; percentile (perc) or standard error (se) |
nsim | 15000 | number of simulations |
seed | NULL | seed to be set, default is NULL |
verbose | TRUE | whether output should be verbose or not |
show_var_types | verbose | print variable types, set to verbose by default |
show_summ_stats | verbose | print summary stats, set to verbose by default |
show_eda_plot | verbose | print EDA plot, set to verbose by default |
show_inf_plot | verbose | print inference plot, set to verbose by default |
show_res | verbose | print results, set to verbose by default |
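
As a rough illustration of how these arguments combine, the sketch below requests a 95% CLT-based confidence interval for a single mean. It assumes that `inference()` has already been loaded and that `y` can be passed as a bare column name looked up in `data`; the table does not confirm the second point, so treat it as an assumption. The built-in `mtcars` dataset is used only as stand-in data.

```r
# Stand-in data: the built-in mtcars data frame (not from the course materials).
data(mtcars)

if (exists("inference", mode = "function")) {
  # Assumption: y is given as a bare column name found in `data`.
  inference(
    y = mpg,
    data = mtcars,
    type = "ci",               # confidence interval rather than hypothesis test
    statistic = "mean",
    method = "theoretical",    # CLT-based
    conf_level = 0.95
  )
} else {
  message("inference() is not loaded; run load(\"inference.RData\") first.")
}
```

For a hypothesis test, the table suggests switching to `type = "ht"` and supplying `null` and `alternative` as well.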