
This function combines k-means clustering with Random Forest classification to analyze flow cytometric Fluorescent Timer data. It applies k-means clustering to the training and test datasets (handled as data frames) to define clusters, matches these clusters across the two datasets, and then uses a Random Forest to predict outcomes from the relative proportion of cells in each cluster.
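The clustering, matching, and proportion steps described above can be sketched in base R. This is a minimal illustration on simulated data, not the internals of TockyKmeansRF: the feature names `feat1` and `feat2`, the helpers `simulate_cells`, `nearest_centroid`, and `cluster_proportions`, and the matching rule (assigning each test cell to the nearest training centroid) are all assumptions made for the example.

```r
# Illustrative sketch only: simulated data and hypothetical helper names,
# not the implementation used by TockyKmeansRF.
set.seed(1)

# Simulate per-cell data: two numeric features per cell, tagged by sample.
simulate_cells <- function(sample_ids, n_per_sample = 200) {
  do.call(rbind, lapply(sample_ids, function(id) {
    data.frame(
      sample = id,
      feat1 = rnorm(n_per_sample),
      feat2 = rnorm(n_per_sample)
    )
  }))
}
train_cells <- simulate_cells(paste0("train", 1:6))
test_cells  <- simulate_cells(paste0("test", 1:3))
features    <- c("feat1", "feat2")
num_cluster <- 4

# 1. Cluster the training cells with k-means.
km <- kmeans(train_cells[, features], centers = num_cluster)
train_cells$cluster <- km$cluster

# 2. Match clusters across datasets.
#    Assumption: each test cell takes the label of its nearest training centroid.
nearest_centroid <- function(x, centers) {
  x <- as.matrix(x)
  # Squared Euclidean distance from every cell to every centroid (cells x clusters).
  d <- sapply(seq_len(nrow(centers)), function(k) {
    rowSums(sweep(x, 2, centers[k, ])^2)
  })
  max.col(-d, ties.method = "first")
}
test_cells$cluster <- nearest_centroid(test_cells[, features], km$centers)

# 3. Per-sample proportion of cells in each cluster: one row per sample,
#    one column per cluster. A table like this is what a classifier would use.
cluster_proportions <- function(cells, k) {
  counts <- table(cells$sample, factor(cells$cluster, levels = seq_len(k)))
  as.data.frame.matrix(prop.table(counts, margin = 1))
}
train_prop <- cluster_proportions(train_cells, num_cluster)
test_prop  <- cluster_proportions(test_cells, num_cluster)
```

Each row of `train_prop` and `test_prop` sums to 1, so every sample is summarized by `num_cluster` features regardless of how many cells it contains.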

Usage

TockyKmeansRF(
  trainData,
  testData,
  num_cluster = 4,
  verbose = TRUE,
  iter.max = 10,
  nstart = 1,
  mtry = NULL,
  ntree = 100
)

Arguments

trainData

Training dataset as a TockyPrepData.

testData

Test dataset as a TockyPrepData.

num_cluster

The number of clusters (metaclusters) to generate via k-means.

verbose

Logical indicating whether to print progress messages and outputs. Default is TRUE.

iter.max

The maximum number of iterations allowed. Passed to kmeans.

nstart

The number of random sets of initial centers to try in each clustering. Passed to kmeans.
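To show what these two arguments control, the sketch below calls `stats::kmeans` directly on simulated data (the matrix `x` is made up for the example). With `nstart` greater than 1, kmeans tries several random sets of initial centers and keeps the solution with the lowest total within-cluster sum of squares, which tends to make the result less sensitive to a poor starting point.

```r
# Illustrative sketch: simulated data, calling stats::kmeans directly.
set.seed(42)
x <- rbind(
  matrix(rnorm(200, mean = 0), ncol = 2),
  matrix(rnorm(200, mean = 3), ncol = 2),
  matrix(rnorm(200, mean = 6), ncol = 2),
  matrix(rnorm(200, mean = 9), ncol = 2)
)

# One random start, at most 10 iterations (the defaults shown in Usage).
set.seed(1)
fit_single <- kmeans(x, centers = 4, iter.max = 10, nstart = 1)

# 25 random starts with the same iteration cap; the best solution is kept.
set.seed(1)
fit_multi <- kmeans(x, centers = 4, iter.max = 10, nstart = 25)

# Compare the total within-cluster sum of squares of the two fits.
c(single = fit_single$tot.withinss, multi = fit_multi$tot.withinss)
```

If kmeans warns that it did not converge, raising `iter.max` is the usual remedy.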

mtry

The number of variables randomly sampled as candidates at each split when building a tree within the Random Forest. To be passed to randomForest.

ntree

The number of trees to grow in the Random Forest. Default is 100. To be passed to randomForest.
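The sketch below shows how `mtry` and `ntree` could be passed to `randomForest::randomForest` when the predictors are per-sample cluster proportions. The data, the `group` labels, and the fallback used when `mtry` is NULL (the randomForest classification default, `floor(sqrt(p))` for `p` predictors) are assumptions for illustration; this page does not state how TockyKmeansRF resolves a NULL `mtry`.

```r
# Illustrative sketch: simulated cluster proportions and made-up class labels.
set.seed(7)
n_samples <- 40
raw   <- matrix(runif(n_samples * 4), ncol = 4)
props <- as.data.frame(raw / rowSums(raw))   # each row sums to 1
names(props) <- paste0("cluster", 1:4)
group <- factor(ifelse(props$cluster1 > 0.25, "A", "B"))

# Assumption: a NULL mtry falls back to randomForest's classification default.
mtry <- NULL
mtry_used <- if (is.null(mtry)) floor(sqrt(ncol(props))) else mtry

# Fit only if the randomForest package is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf_fit <- randomForest::randomForest(
    x = props, y = group,
    mtry = mtry_used,   # variables sampled as candidates at each split
    ntree = 100         # number of trees grown
  )
  print(rf_fit)         # includes the out-of-bag error estimate
}
```

With only `num_cluster` predictors, `mtry` can take just a few distinct values, so tuning it is inexpensive.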

Value

A list containing key Tocky data and the Random Forest model and its performance data.

Examples

if (FALSE) { # \dontrun{
result <- TockyKmeansRF(trainData, testData, num_cluster = 4, verbose = TRUE)
} # }