Evaluate ROC Curves with Confidence Intervals and Calculate AUC for TockyKmeansRF — TockyRFClusterOptimization • TockyRandomForest

This function applies the TockyKmeansRF model across a specified range of cluster numbers, repeating the analysis for each cluster number. For each cluster number, it calculates the ROC curve with its confidence interval and computes the AUC across the repeated runs.
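The idea of summarising AUC over repeated runs can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper `auc_rank`, the simulated scores, and the percentile interval are all assumptions made for illustration, and the package may define its confidence intervals differently.

```r
# Rank-based (Mann-Whitney) AUC: the probability that a randomly chosen
# positive scores higher than a randomly chosen negative (ties count 1/2).
# `auc_rank` is a hypothetical helper, not part of TockyRandomForest.
auc_rank <- function(scores, labels) {
  pos <- labels == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  r <- rank(scores)  # average ranks handle ties
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

set.seed(1)
k <- 50                             # number of repeats (illustrative value)
labels <- rep(c(0, 1), each = 30)   # 30 controls, 30 experimental samples

# Simulated repeats: each run yields slightly different scores, standing in
# for the run-to-run variation of a stochastic clustering/classifier pipeline.
auc_runs <- vapply(seq_len(k), function(i) {
  scores <- rnorm(length(labels), mean = labels)  # positives shifted by 1
  auc_rank(scores, labels)
}, numeric(1))

# Mean AUC with a percentile interval across repeats
# (one of several possible interval definitions).
auc_summary <- c(mean = mean(auc_runs),
                 quantile(auc_runs, c(0.025, 0.975)))
print(auc_summary)
```

With a single repeat there is no spread to summarise, which is why a larger `k` is needed for an interval of this kind to be informative.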

Usage

TockyRFClusterOptimization(
  trainData,
  testData,
  num_cluster_vec = 4:9,
  k = 1,
  ctrl_group = NULL,
  expr_group = NULL,
  iter.max = 10,
  nstart = 1,
  mtry = NULL,
  ntree = 100
)

Arguments

trainData: Training dataset as a TockyPrepData object.
testData: Test dataset as a TockyPrepData object.
num_cluster_vec: A numeric vector of cluster numbers to evaluate.
k: Integer, the number of repeated runs used to estimate confidence intervals. Default is 1.
ctrl_group: The name of the control group within `sampledef`.
expr_group: The name of the experimental group within `sampledef`.
iter.max: The maximum number of iterations allowed. Passed to kmeans.
nstart: The number of random sets to be used in each clustering. Passed to kmeans.
mtry: The number of variables randomly sampled as candidates at each split when building a tree within the Random Forest. Passed to randomForest.
ntree: The number of trees to grow in the Random Forest. Default is 100. Passed to randomForest.
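As a rough illustration of what `num_cluster_vec`, `iter.max` and `nstart` control, the sketch below calls `stats::kmeans` directly on synthetic data for each cluster number. The data matrix `x` is invented for the example, and how TockyRFClusterOptimization prepares its input for kmeans is not shown here.

```r
set.seed(42)
# Synthetic two-dimensional data standing in for real measurements
# (an assumption for illustration only).
x <- rbind(matrix(rnorm(200, mean = 0), ncol = 2),
           matrix(rnorm(200, mean = 4), ncol = 2))

num_cluster_vec <- 4:9

# One k-means fit per candidate cluster number, using the same
# iter.max and nstart values as the function defaults.
fits <- lapply(num_cluster_vec, function(centers) {
  stats::kmeans(x, centers = centers, iter.max = 10, nstart = 1)
})
names(fits) <- num_cluster_vec

# Total within-cluster sum of squares for each cluster number.
wss <- vapply(fits, function(f) f$tot.withinss, numeric(1))
print(wss)
```

Raising `nstart` makes each fit less sensitive to the random initial centres at the cost of run time, and raising `iter.max` helps when kmeans warns that it did not converge.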

Value

A list containing three elements: 'roc_results', a list of ROC curve objects for each cluster number; 'auc_values', a list of AUC values with their respective confidence intervals for each cluster number; and 'roc_plots', a list of ggplot objects, each depicting the ROC curve with its confidence interval for one of the specified cluster numbers.

Examples

if (FALSE) { # \dontrun{
result <- TockyRFClusterOptimization(trainData, testData, num_cluster_vec = 4:9, k = 50)
} # }