This functions allows the user to fit statistical models of noisy categorical rating, based on the Dawid-Skene model, using Bayesian inference. A variety of data formats and models are supported. Inference is done using Stan, allowing models to be fit efficiently, using both optimisation and Markov Chain Monte Carlo (MCMC).

rater(
  data,
  model,
  method = "mcmc",
  data_format = "long",
  long_data_colnames = c(item = "item", rater = "rater", rating = "rating"),
  inits = NULL,
  verbose = TRUE,
  ...
)

Arguments

data

A 2D data object: data.frame, matrix, tibble etc. with data in either long or grouped format.

model

Model to fit to data - must be rater_model or a character string - the name of the model. If the character string is used, the prior parameters will be set to their default values.

method

A length 1 character vector, either "mcmc" or "optim". This will be fitting method used by Stan. By default "mcmc"

data_format

A length 1 character vector, "long", "wide" and "grouped". The format that the passed data is in. Defaults to "long". See vignette("data-formats) for details.

long_data_colnames

A 3-element named character vector that specifies the names of the three required columns in the long data format. The vector must have the required names: * item: the name of the column containing the item indexes, * rater: the name of the column containing the rater indexes, * rating: the name of the column containing the ratings. By default, the names of the columns are the same as the names of the vector: "item", "rater", and "rating" respectively. This argument is ignored when the data_format argument is either "wide" or "grouped".

inits

The initialization points of the fitting algorithm

verbose

Should rater() produce information about the progress of the chains while using the MCMC algorithm. Defaults to TRUE

...

Extra parameters which are passed to the Stan fitting interface.

Value

An object of class rater_fit containing the fitted parameters.

Details

The default MCMC algorithm used by Stan is No U Turn Sampling (NUTS) and the default optimisation method is LGFGS. For MCMC 4 chains are run be default with 2000 iterations in total each.

Examples

# \donttest{

# Fit a model using MCMC (the default).
mcmc_fit <- rater(anesthesia, "dawid_skene")
#> 
#> SAMPLING FOR MODEL 'dawid_skene' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 9.5e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.95 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 1.267 seconds (Warm-up)
#> Chain 1:                1.349 seconds (Sampling)
#> Chain 1:                2.616 seconds (Total)
#> Chain 1: 
#> 
#> SAMPLING FOR MODEL 'dawid_skene' NOW (CHAIN 2).
#> Chain 2: 
#> Chain 2: Gradient evaluation took 8.6e-05 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.86 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2: 
#> Chain 2: 
#> Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 2: 
#> Chain 2:  Elapsed Time: 1.302 seconds (Warm-up)
#> Chain 2:                1.335 seconds (Sampling)
#> Chain 2:                2.637 seconds (Total)
#> Chain 2: 
#> 
#> SAMPLING FOR MODEL 'dawid_skene' NOW (CHAIN 3).
#> Chain 3: 
#> Chain 3: Gradient evaluation took 8.5e-05 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.85 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3: 
#> Chain 3: 
#> Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 3: 
#> Chain 3:  Elapsed Time: 1.226 seconds (Warm-up)
#> Chain 3:                1.367 seconds (Sampling)
#> Chain 3:                2.593 seconds (Total)
#> Chain 3: 
#> 
#> SAMPLING FOR MODEL 'dawid_skene' NOW (CHAIN 4).
#> Chain 4: 
#> Chain 4: Gradient evaluation took 8.8e-05 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.88 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4: 
#> Chain 4: 
#> Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 4: 
#> Chain 4:  Elapsed Time: 1.239 seconds (Warm-up)
#> Chain 4:                1.186 seconds (Sampling)
#> Chain 4:                2.425 seconds (Total)
#> Chain 4: 

# Fit a model using optimisation.
optim_fit <- rater(anesthesia, dawid_skene(), method = "optim")

# Fit a model using passing data grouped data.
grouped_fit <- rater(caries, dawid_skene(), data_format = "grouped")
#> 
#> SAMPLING FOR MODEL 'grouped_data' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 3.1e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.31 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0.258 seconds (Warm-up)
#> Chain 1:                0.228 seconds (Sampling)
#> Chain 1:                0.486 seconds (Total)
#> Chain 1: 
#> 
#> SAMPLING FOR MODEL 'grouped_data' NOW (CHAIN 2).
#> Chain 2: 
#> Chain 2: Gradient evaluation took 2.9e-05 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.29 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2: 
#> Chain 2: 
#> Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 2: 
#> Chain 2:  Elapsed Time: 0.253 seconds (Warm-up)
#> Chain 2:                0.231 seconds (Sampling)
#> Chain 2:                0.484 seconds (Total)
#> Chain 2: 
#> 
#> SAMPLING FOR MODEL 'grouped_data' NOW (CHAIN 3).
#> Chain 3: 
#> Chain 3: Gradient evaluation took 3e-05 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.3 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3: 
#> Chain 3: 
#> Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 3: 
#> Chain 3:  Elapsed Time: 0.252 seconds (Warm-up)
#> Chain 3:                0.231 seconds (Sampling)
#> Chain 3:                0.483 seconds (Total)
#> Chain 3: 
#> 
#> SAMPLING FOR MODEL 'grouped_data' NOW (CHAIN 4).
#> Chain 4: 
#> Chain 4: Gradient evaluation took 2.9e-05 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.29 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4: 
#> Chain 4: 
#> Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 4: 
#> Chain 4:  Elapsed Time: 0.249 seconds (Warm-up)
#> Chain 4:                0.224 seconds (Sampling)
#> Chain 4:                0.473 seconds (Total)
#> Chain 4: 

# }