R/inference.R
rater.Rd
This function allows the user to fit statistical models of noisy categorical rating, based on the Dawid-Skene model, using Bayesian inference. A variety of data formats and models are supported. Inference is done using Stan, allowing models to be fit efficiently, using both optimisation and Markov Chain Monte Carlo (MCMC).
rater(
data,
model,
method = "mcmc",
data_format = "long",
long_data_colnames = c(item = "item", rater = "rater", rating = "rating"),
inits = NULL,
verbose = TRUE,
...
)
A 2D data object: data.frame, matrix, tibble etc. with data in either long or grouped format.
The model to fit to the data: either a rater_model object or a character string giving the name of the model. If a character string is used, the prior parameters will be set to their default values.
A length 1 character vector, either "mcmc" or "optim". This is the fitting method used by Stan. Defaults to "mcmc".
A length 1 character vector, one of "long", "wide" or "grouped". The format of the passed data. Defaults to "long". See vignette("data-formats") for details.
A 3-element named character vector that specifies
the names of the three required columns in the long data format. The vector
must have the required names:
* item: the name of the column containing the item indexes,
* rater: the name of the column containing the rater indexes,
* rating: the name of the column containing the ratings.
By default, the names of the columns are the same as the names of the vector: "item", "rater", and "rating" respectively. This argument is ignored when the data_format argument is either "wide" or "grouped".
The initialization points of the fitting algorithm.
Should rater() produce information about the progress of the chains while using the MCMC algorithm? Defaults to TRUE.
Extra parameters which are passed to the Stan fitting interface.
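The long_data_colnames argument described above maps the three fixed names (item, rater, rating) to whatever the columns are actually called in the data. The sketch below illustrates that mapping with base R only. The data frame and its column names (patient, doctor, severity) are invented for illustration, and the final rater() call is an assumed usage based on the signature above; it is left commented out.

```r
# Hypothetical long-format ratings: one row per (item, rater, rating) triple.
# The data and the column names are made up for this illustration.
ratings <- data.frame(
  patient  = c(1, 1, 2, 2, 3, 3),
  doctor   = c(1, 2, 1, 2, 1, 2),
  severity = c(1, 1, 2, 3, 1, 2)
)

# The names of the vector are fixed ("item", "rater", "rating");
# the values are the actual column names in the data.
colnames_map <- c(item = "patient", rater = "doctor", rating = "severity")

# Sanity checks: the required names are present and the columns exist.
stopifnot(
  setequal(names(colnames_map), c("item", "rater", "rating")),
  all(colnames_map %in% names(ratings))
)

# The same data with the default column names, built with base R.
standardised <- setNames(ratings[, unname(colnames_map)], names(colnames_map))

# Assumed call, following the usage shown above (not run here):
# fit <- rater(ratings, "dawid_skene", long_data_colnames = colnames_map)
```

Either route should describe the same data: pass the original data frame together with colnames_map, or pass standardised and rely on the default column names.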
An object of class rater_fit containing the fitted parameters.
The default MCMC algorithm used by Stan is the No-U-Turn Sampler (NUTS) and the default optimisation method is L-BFGS. For MCMC, 4 chains are run by default, each with 2000 iterations in total.
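As a rough worked example of what these defaults imply, the arithmetic below counts the posterior draws retained after warmup. The warmup length of 1000 is read off the example log on this page (iterations 1 to 1000 are labelled Warmup); it is not stated as a documented default here. The commented call at the end assumes that ... is forwarded to a Stan interface accepting chains and iter, as rstan::sampling() does.

```r
# Defaults described above: 4 chains, 2000 iterations in total per chain.
chains <- 4
iter <- 2000

# Assumption taken from the example log: the first 1000 iterations
# of each chain are warmup and are not kept as posterior draws.
warmup <- 1000

draws_per_chain <- iter - warmup
total_draws <- chains * draws_per_chain
total_draws
#> [1] 4000

# Assumed usage (not run here): a shorter run passed through `...`.
# fit <- rater(anesthesia, "dawid_skene", chains = 2, iter = 1000)
```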
# \donttest{
# Fit a model using MCMC (the default).
mcmc_fit <- rater(anesthesia, "dawid_skene")
#>
#> SAMPLING FOR MODEL 'dawid_skene' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 0.000233 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.33 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 2.698 seconds (Warm-up)
#> Chain 1: 2.915 seconds (Sampling)
#> Chain 1: 5.613 seconds (Total)
#> Chain 1:
#>
#> SAMPLING FOR MODEL 'dawid_skene' NOW (CHAIN 2).
#> Chain 2:
#> Chain 2: Gradient evaluation took 0.000177 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 1.77 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2:
#> Chain 2:
#> Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 2:
#> Chain 2: Elapsed Time: 2.75 seconds (Warm-up)
#> Chain 2: 2.866 seconds (Sampling)
#> Chain 2: 5.616 seconds (Total)
#> Chain 2:
#>
#> SAMPLING FOR MODEL 'dawid_skene' NOW (CHAIN 3).
#> Chain 3:
#> Chain 3: Gradient evaluation took 0.000287 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 2.87 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3:
#> Chain 3:
#> Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 3: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 3: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 3: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 3: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 3:
#> Chain 3: Elapsed Time: 2.679 seconds (Warm-up)
#> Chain 3: 2.847 seconds (Sampling)
#> Chain 3: 5.526 seconds (Total)
#> Chain 3:
#>
#> SAMPLING FOR MODEL 'dawid_skene' NOW (CHAIN 4).
#> Chain 4:
#> Chain 4: Gradient evaluation took 0.000182 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 1.82 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4:
#> Chain 4:
#> Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 4: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 4: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 4: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 4: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 4:
#> Chain 4: Elapsed Time: 2.526 seconds (Warm-up)
#> Chain 4: 2.68 seconds (Sampling)
#> Chain 4: 5.206 seconds (Total)
#> Chain 4:
# Fit a model using optimisation.
optim_fit <- rater(anesthesia, dawid_skene(), method = "optim")
# Fit a model by passing grouped data.
grouped_fit <- rater(caries, dawid_skene(), data_format = "grouped")
#>
#> SAMPLING FOR MODEL 'grouped_data' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 6.9e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.69 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 0.497 seconds (Warm-up)
#> Chain 1: 0.436 seconds (Sampling)
#> Chain 1: 0.933 seconds (Total)
#> Chain 1:
#>
#> SAMPLING FOR MODEL 'grouped_data' NOW (CHAIN 2).
#> Chain 2:
#> Chain 2: Gradient evaluation took 6.5e-05 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.65 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2:
#> Chain 2:
#> Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 2:
#> Chain 2: Elapsed Time: 0.484 seconds (Warm-up)
#> Chain 2: 0.429 seconds (Sampling)
#> Chain 2: 0.913 seconds (Total)
#> Chain 2:
#>
#> SAMPLING FOR MODEL 'grouped_data' NOW (CHAIN 3).
#> Chain 3:
#> Chain 3: Gradient evaluation took 6.5e-05 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.65 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3:
#> Chain 3:
#> Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 3: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 3: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 3: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 3: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 3:
#> Chain 3: Elapsed Time: 0.49 seconds (Warm-up)
#> Chain 3: 0.436 seconds (Sampling)
#> Chain 3: 0.926 seconds (Total)
#> Chain 3:
#>
#> SAMPLING FOR MODEL 'grouped_data' NOW (CHAIN 4).
#> Chain 4:
#> Chain 4: Gradient evaluation took 9.8e-05 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.98 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4:
#> Chain 4:
#> Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 4: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 4: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 4: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 4: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 4:
#> Chain 4: Elapsed Time: 0.498 seconds (Warm-up)
#> Chain 4: 0.437 seconds (Sampling)
#> Chain 4: 0.935 seconds (Total)
#> Chain 4:
# }