
Getting Started with GatingTree

This vignette demonstrates how to analyze cytometry data using the GatingTree package. We will walk through the entire workflow, including data loading, preprocessing, creating a FlowObject, data transformation, defining positive/negative thresholds interactively, performing gating tree analysis, and visualizing the results.

Example 1: GatingTree Analysis of Cytometry Data Using R Objects as Input Data

1. Loading Libraries and Data

First, load the necessary libraries and download the test data using HDCytoData.

The test data are a hybrid mass cytometry dataset constructed by spiking a small percentage of acute myeloid leukemia (AML) cells into healthy bone marrow cells (Weber et al., 2019).
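A minimal sketch of the loading step, assuming the 5% spike-in variant of the AML-sim dataset (`Weber_AML_sim_main_5pc_SE`). HDCytoData also provides other spike-in levels, so the choice of accessor here is an assumption. The object name `d_SE` matches the code in the next section, and the guard only makes the snippet fail gracefully when the packages are not installed.

```r
# Sketch only: the choice of the 5% spike-in accessor is an assumption.
pkgs <- c("GatingTree", "HDCytoData")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
  library(GatingTree)
  library(HDCytoData)
  # Downloads (and caches) the dataset as a SummarizedExperiment
  d_SE <- Weber_AML_sim_main_5pc_SE()
} else {
  message("Install GatingTree and HDCytoData to run this step.")
}
```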

2. Preprocessing Data

We convert the raw cytometry data into a format suitable for GatingTree analysis: we extract the experiment metadata, keep only the relevant markers, and attach the sample identifier to each cell.

# Extract experiment information and channel names
experiment_info <- d_SE@metadata$experiment_info
channel_name <- colnames(d_SE)

# Prepare sample definitions
sampledef <- experiment_info[, c("sample_id", "group_id")]
colnames(sampledef) <- c('file','group')
# Keep the cell-type markers plus the DNA1 channel
marker_info <- as.data.frame(d_SE@colData)
logic <- marker_info$marker_class == 'type' | marker_info$marker_name == 'DNA1'
marker_info <- as.data.frame(marker_info[logic,])
# Extract expression data and adjust column names
exprs <- assay(d_SE)
annotationdf <- as.data.frame(rowData(d_SE))
logic <- colnames(exprs) %in% marker_info$channel_name
data <- exprs[, logic]
colnames(data) <- marker_info$marker_name
colnames(data) <- gsub("-", "", colnames(data))
data <- cbind(data, data.frame(file = annotationdf$sample_id))
data <- as.data.frame(data)
# Define variables excluding 'DNA1' and 'file'
cnlogic <- colnames(data) %in% c("DNA1", "file")
variables <- colnames(data)[!cnlogic]
# Remove the CBF samples so that only healthy and CN samples remain
logic <- grepl(pattern = 'CBF', data$file)
Data <- data[!logic,]
# Apply the same filter to the sample definitions (grouping)
sampledef <- sampledef[!grepl(pattern = 'CBF', sampledef$group),]
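The sample filter at the end of this step can be checked on a toy data frame. The sketch below uses made-up file names and values (hypothetical, not taken from the dataset) to show that `grepl` with the pattern 'CBF' drops the matching rows from both the expression table and the sample definitions, so the two stay in sync.

```r
# Toy data: file names and values are hypothetical
toy <- data.frame(
  CD34 = c(1.2, 0.3, 2.5, 0.1),
  file = c("healthy_H1", "CN_H1", "CBF_H1", "CBF_H2"),
  stringsAsFactors = FALSE
)
toy_def <- data.frame(
  file  = c("healthy_H1", "CN_H1", "CBF_H1", "CBF_H2"),
  group = c("healthy", "CN", "CBF", "CBF"),
  stringsAsFactors = FALSE
)

# Drop every row whose file (or group) label contains 'CBF'
toy_kept <- toy[!grepl("CBF", toy$file), ]
toy_def_kept <- toy_def[!grepl("CBF", toy_def$group), ]

# Every remaining file should still have a sample definition
all(toy_kept$file %in% toy_def_kept$file)
```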

3. Creating a FlowObject and Applying Data Transformation

Create a FlowObject using the prepared data and sample definitions.

# Create FlowObject
x <- CreateFlowObject(Data = Data, sampledef = sampledef, experiment_name = 'AML_sim')

We can display the sample grouping using the showSampleDef function:

##          file   group
## 1  healthy_H1 healthy
## 2  healthy_H2 healthy
## 3  healthy_H3 healthy
## 4  healthy_H4 healthy
## 5  healthy_H5 healthy
## 6       CN_H1      CN
## 7       CN_H2      CN
## 8       CN_H3      CN
## 9       CN_H4      CN
## 10      CN_H5      CN

Next, apply data transformation. A moderated log transformation using the LogData function is recommended to normalize the data:

x <- LogData(x, variables = variables)
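For intuition only, the sketch below uses the arcsinh transform with a cofactor of 5, a common choice for mass cytometry data. It behaves like a logarithm for large values while staying defined at zero and for negative values. This is a stand-in technique chosen for illustration, not a description of what LogData computes; the function name `moderated_transform` is hypothetical.

```r
# Stand-in for a moderated log: arcsinh with cofactor 5 (assumption, not LogData)
moderated_transform <- function(v, cofactor = 5) asinh(v / cofactor)

raw <- c(-1, 0, 5, 50, 500, 5000)
transformed <- moderated_transform(raw)

# Zero maps to zero, and the ordering of values is preserved
transformed[2] == 0
all(diff(transformed) > 0)

# For large values the result tracks log(2 * v / cofactor)
abs(transformed[6] - log(2 * raw[6] / 5))
```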

4. Determining Positive/Negative Threshold for Markers

Use DefineNegatives to set the negative/positive threshold for each variable. The function opens an interactive session in which you choose the thresholds.

Alternatively, you can import predefined thresholds using import_negative_gate_def:

file_path <- system.file("extdata", "negative_gate_def_AML.csv", package = "GatingTree")
negative_gate_def <- read.csv(file_path)
x <- import_negative_gate_def(x, negative_gate_def)
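Conceptually, a negative gate is a per-marker cutoff: cells above it are called positive and the rest negative. The sketch below applies hypothetical cutoffs to toy values. The column names `variable` and `negative.gate` are illustrative assumptions and may not match the CSV shipped with the package.

```r
# Hypothetical threshold table; column names are assumptions
gate_def <- data.frame(
  variable      = c("CD34.logdata", "CD7.logdata"),
  negative.gate = c(1.5, 2.0),
  stringsAsFactors = FALSE
)

# Toy transformed expression values for four cells
cells <- data.frame(
  CD34.logdata = c(0.2, 1.8, 3.1, 1.5),
  CD7.logdata  = c(2.5, 0.4, 2.0, 2.1)
)

# A cell is positive for a marker when its value exceeds the cutoff
is_positive <- mapply(
  function(v, cutoff) cells[[v]] > cutoff,
  gate_def$variable, gate_def$negative.gate
)
is_positive
```

Values exactly equal to the cutoff are treated as negative in this sketch; whether the package uses a strict or non-strict comparison is not stated here.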

After defining the negative thresholds, inspect the results by visualizing them using PlotDefineNegatives.

To produce density plots (histograms):

x <- PlotDefineNegatives(x, y_axis_var = 'Density', panel = 4)

The red vertical line indicates the threshold value.

For 2D plots, choose a variable for the y-axis:

x <- PlotDefineNegatives(x, y_axis_var = "CD3.logdata", panel = 4)

5. Performing GatingTree Analysis and Visualization

With the data prepared and thresholds defined, perform the GatingTree analysis. Use the createGatingTreeObject function to conduct pathfinding analysis in multidimensional marker space and construct a GatingTree.

x <- createGatingTreeObject(x, maxDepth = 5, min_cell_num=0, expr_group = 'CN', ctrl_group = 'healthy', verbose = FALSE)

Visualize the GatingTree output:

x <- GatingTreeToDF(x)
node <- x@Gating$GatingTreeObject
datatree <- convertToDataTree(node)
graph <- convert_to_diagrammer(datatree, size_factor = 1, all_labels = FALSE)

library(DiagrammeR)
render_graph(graph, width = 600, height = 600)
[Figure: unpruned GatingTree rendered by render_graph. The graph contains 239 nodes: a Root node with 32 first-level marker states (for example CD19+, CD19-, CD34+, CD34-), most of which have their own second-level marker states (for example CD19+ leads to CD3+ and CD64+).]

If necessary, prune the GatingTree to focus on the most informative nodes:

x <- PruneGatingTree(x, max_entropy = 0.5, min_enrichment=0.5)
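The effect of the two thresholds can be illustrated with a few second-level nodes. The sketch below assumes that a node passes when its entropy is at most max_entropy and its enrichment is at least min_enrichment. This reading is an assumption about PruneGatingTree, not its documented behavior. The first three rows use values reported for the pruned tree in this vignette; the last row, including the marker name `CDx`, is hypothetical.

```r
# Second-level nodes under CD34+; the last row is hypothetical
nodes <- data.frame(
  path       = c("CD34+/CD47+", "CD34+/CD7+", "CD34+/CD44+", "CD34+/CDx+"),
  enrichment = c(1.30, 1.93, 1.04, 0.20),
  entropy    = c(0.00, 0.00, 0.39, 0.90),
  stringsAsFactors = FALSE
)

# Assumed pruning rule: low entropy and sufficient enrichment
keep <- nodes$entropy <= 0.5 & nodes$enrichment >= 0.5
nodes[keep, ]
```

Under this assumed rule, a parent such as CD34+ (entropy 0.72) would not pass on its own, which suggests that parents are retained when at least one descendant passes.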

Visualize the pruned GatingTree:

pruned_node <- x@Gating$PrunedGatingTreeObject
datatree2 <- convertToDataTree(pruned_node)
graph <- convert_to_diagrammer(datatree2, size_factor=1)
render_graph(graph, width = 600, height = 600)
[Figure: pruned GatingTree rendered by render_graph. Node labels recovered from the graph:]

Root
  CD34+ (Enrichment: 0.62, Entropy: 0.72, Avg. %: 10.4%)
    CD47+ (Enrichment: 1.30, Entropy: 0.00, Avg. %: 5.4%)
    CD7+ (Enrichment: 1.93, Entropy: 0.00, Avg. %: 3.8%)
    CD44+ (Enrichment: 1.04, Entropy: 0.39, Avg. %: 6.5%)
    CD38+ (Enrichment: 0.97, Entropy: 0.00, Avg. %: 6.7%)
    HLADR- (Enrichment: 1.05, Entropy: 0.00, Avg. %: 6.6%)
    CD41- (Enrichment: 0.69, Entropy: 0.39, Avg. %: 9.3%)
  CD123+ (Enrichment: 0.15, Entropy: 1.00, Avg. %: 8.4%)
    CD47+ (Enrichment: 0.59, Entropy: 0.39, Avg. %: 2.9%)
    CD7+ (Enrichment: 1.54, Entropy: 0.00, Avg. %: 1.2%)
  CD117+ (Enrichment: 0.27, Entropy: 0.97, Avg. %: 12.2%)
    CD34+ (Enrichment: 1.18, Entropy: 0.00, Avg. %: 3.6%)
    CD123+ (Enrichment: 1.15, Entropy: 0.00, Avg. %: 0.7%)
    CD38+ (Enrichment: 1.05, Entropy: 0.00, Avg. %: 3.8%)
    CD3- (Enrichment: 0.76, Entropy: 0.39, Avg. %: 4.7%)

6. Delta Enrichment Analysis

Finally, use the PlotDeltaEnrichment function to assess how adding each marker state changes the enrichment score.

##   Kruskal-Wallis rank sum test
## 
## data: x and group
## Kruskal-Wallis chi-squared = 181.6968, df = 31, p-value = 0
## 
## 
## alpha = 0.05
## Reject Ho if p <= alpha/2
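The output above reports a Kruskal-Wallis rank sum test across groups. As a self-contained illustration of the test itself, the sketch below runs base R's `kruskal.test` on hypothetical, fully separated groups. The group labels and values are made up and unrelated to the AML data, and the sketch does not reproduce the package's post hoc comparisons.

```r
# Hypothetical delta-enrichment values for three marker states
delta <- c(1:5, 6:10, 11:15)
state <- factor(rep(c("A+", "B+", "C+"), each = 5))

res <- kruskal.test(delta ~ state)
res$statistic  # Kruskal-Wallis chi-squared
res$parameter  # degrees of freedom = number of groups - 1
res$p.value
```

With three groups the test has 2 degrees of freedom, just as the 31 degrees of freedom in the output above correspond to 32 groups.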

The PlotDeltaEnrichmentPrunedTree function performs the same analysis on the pruned GatingTree, which helps clarify the impact of the most important markers.

##   Kruskal-Wallis rank sum test
## 
## data: x and group
## Kruskal-Wallis chi-squared = 9.625, df = 9, p-value = 0.38
## 
## 
## alpha = 0.05
## Reject Ho if p <= alpha/2