This function integrates k-means clustering and Random Forest classification to analyze flow cytometric Fluorescent Timer data. It applies k-means clustering to the training and test datasets (each handled as a data frame) to create clusters, matches these clusters across the two datasets, and then uses Random Forest to predict outcomes from the relative proportion of cells in each cluster.
Usage
TockyKmeansRF(
  trainData,
  testData,
  num_cluster = 4,
  verbose = TRUE,
  iter.max = 10,
  nstart = 1,
  mtry = NULL,
  ntree = 100
)
Arguments
- trainData
Training dataset as a TockyPrepData.
- testData
Test dataset as a TockyPrepData.
- num_cluster
The number of clusters (metaclusters) to generate via k-means.
- verbose
Logical indicating whether to print progress messages and outputs. Default is TRUE.
- iter.max
The maximum number of iterations allowed. To be passed to kmeans.
- nstart
The number of random starting sets to try in each clustering. To be passed to kmeans.
- mtry
The number of variables randomly sampled as candidates at each split when building a tree within the Random Forest. To be passed to randomForest.
- ntree
The number of trees to grow in the Random Forest. Default is 100. To be passed to randomForest.
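The workflow described above (cluster, match, then classify on per-sample cluster proportions) can be illustrated with a minimal sketch on synthetic data. This is not the implementation of TockyKmeansRF: the data generator, the column names `f1` and `f2`, the helpers `make_cells`, `match_clusters`, and `cluster_props`, and the nearest-centroid matching rule are all assumptions made for illustration. Only `kmeans` (stats) and `randomForest` are real functions, and the Random Forest step runs only if that package is installed.

```r
set.seed(1)

# Hypothetical stand-in for Timer data: two features per cell plus a sample ID.
# Group "B" samples carry a larger fraction of cells in the "high" population.
make_cells <- function(sample_ids, groups, n = 200) {
  do.call(rbind, lapply(seq_along(sample_ids), function(i) {
    n_high <- if (groups[i] == "B") round(0.7 * n) else round(0.3 * n)
    data.frame(
      sample = sample_ids[i],
      f1 = c(rnorm(n - n_high, mean = 0), rnorm(n_high, mean = 4)),
      f2 = c(rnorm(n - n_high, mean = 0), rnorm(n_high, mean = 4))
    )
  }))
}

train_groups <- rep(c("A", "B"), each = 5)
test_groups <- rep(c("A", "B"), each = 3)
train <- make_cells(paste0("train", 1:10), train_groups)
test <- make_cells(paste0("test", 1:6), test_groups)

feats <- c("f1", "f2")
num_cluster <- 4

# Step 1: k-means on each dataset; iter.max and nstart are forwarded to kmeans
# (values here are larger than the documented defaults to avoid non-convergence).
km_train <- kmeans(train[, feats], centers = num_cluster, iter.max = 50, nstart = 5)
km_test <- kmeans(test[, feats], centers = num_cluster, iter.max = 50, nstart = 5)
train$cluster <- km_train$cluster

# Step 2 (assumed rule): map each test cluster to the nearest training centroid.
match_clusters <- function(test_centers, train_centers) {
  apply(test_centers, 1, function(ctr) {
    which.min(colSums((t(train_centers) - ctr)^2))
  })
}
cluster_map <- match_clusters(km_test$centers, km_train$centers)
test$cluster <- unname(cluster_map[km_test$cluster])

# Step 3: per-sample proportion of cells in each cluster (rows sum to 1).
cluster_props <- function(cells, k) {
  samples <- factor(cells$sample, levels = unique(cells$sample))
  tab <- table(samples, factor(cells$cluster, levels = seq_len(k)))
  props <- as.data.frame.matrix(prop.table(tab, margin = 1))
  names(props) <- paste0("cluster", seq_len(k))
  props
}
train_props <- cluster_props(train, num_cluster)
test_props <- cluster_props(test, num_cluster)

# Step 4: Random Forest on the proportions; mtry is left at the package default.
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf <- randomForest::randomForest(
    x = train_props, y = factor(train_groups), ntree = 100
  )
  pred <- predict(rf, newdata = test_props)
  print(table(predicted = pred, actual = test_groups))
}
```

Because several test clusters can map to the same training centroid under this rule, some proportion columns may be zero for the test samples; a one-to-one assignment would be an alternative design.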