--- title: "cytofkit: Quick Start" output: BiocStyle::html_document: toc: true --- # Install and Load the Package To install **cytofkit** package, start R and run the following codes on R console: ```{r, eval=FALSE} source("https://bioconductor.org/biocLite.R") biocLite("cytofkit") ``` `Notes`: **cytofkit** GUI is dependent on XQuartz windowing system (X Windows) on Mac (OS X > 10.7). Install XQuartz from http://xquartz.macosforge.org. * Download the disk image (dmg) file for [XQuartz](https://dl.bintray.com/xquartz/downloads/XQuartz-2.7.9.dmg). * Double-clicking on the file to start the installation, you can do it safely by clicking through all the defaults. * After the installation, you'll have to log out and back on to your Mac OS X account. Load the Package: ```{r, message=FALSE} library("cytofkit") ``` Read the package description: ```{r, eval=FALSE} ?"cytofkit-package" ``` # Options for Using **cytofkit** Package **cytofkit** provides three ways to employ the workforce of this package: ## Run with GUI The easiest way to use **cytofkit** package is through the GUI. The GUI provides all main options of **cytofkit** on a visual interface. To launch the GUI, load the package and type the following command: ```{r, eval=FALSE} cytofkit_GUI() ``` The interface will appear like below, you can click the information button **!** to check the explanation for each entry and customize your own analysis. ![cytofkit GUI](cytofkit_GUI.png) Start your analysis as simply as following: - Choose the input fcs files from the directory where you store FCS data; - Select the markers from the auto-generated marker list; - Choose the directory where to save your output; - Give a project name as a prefix for the names of result files; - Select a data merging method if you have multiple FCS files; - Select your clustering method(s), visualization method(s), progression estimation method(s) Then submit it, that's all. Depends on the size of your data, it will take some time to run the analysis. Once done, a window will pop up, showing you the path where the results have been stored, and asking you if open the shiny web APP. If YES, the shiny APP will be deployed locally and opened in your default web browser. Among the saved results, a special R data object with suffix of **_.RData_** is for loading the results into the shiny APP. Choose the **_.RData_** file on the shiny APP then submit it, your journey of exploring the results starts. ## Run with the Core Function **Cytofkit** provides a core function `cytofkit()` to drive the analysis pipeline of mass cytometry data. Users only need to define several key parameters to start their analysis automatically. One simple example of running cytofkit using the core function is like this: ```{r, eval=FALSE} set.seed(100) dir <- system.file('extdata',package='cytofkit') file <- list.files(dir ,pattern='.fcs$', full=TRUE) parameters <- list.files(dir, pattern='.txt$', full=TRUE) res <- cytofkit(fcsFiles = file, markers = parameters, projectName = 'cytofkit_test', transformMethod = "cytofAsinh", mergeMethod = "ceil", fixedNum = 500, ## set at 500 for faster run dimReductionMethod = "tsne", clusterMethods = c("Rphenograph", "ClusterX"), ## accept multiple methods visualizationMethods = c("tsne", "pca"), ## accept multiple methods progressionMethod = "isomap", clusterSampleSize = 500, resultDir = getwd(), saveResults = TRUE, saveObject = TRUE) ``` You can customize the parameters for your own need, run `?cytofkit` to get information of all the parameters for `cytofkit`. As running with GUI, once the analysis is done, the results will be saved under `resultDir` automatically. ## Run with Commands (Step-by-Step) You can make use of the functions exported from cytofkit to make your analysis more flexible and fit your own need. Here we use a sample data for demo: ### Pre-processing ```{r} ## Loading the FCS data: dir <- system.file('extdata',package='cytofkit') file <- list.files(dir ,pattern='.fcs$', full=TRUE) paraFile <- list.files(dir, pattern='.txt$', full=TRUE) parameters <- as.character(read.table(paraFile, header = TRUE)[,1]) ## File name file ## parameters parameters ``` ```{r} ## Extract the expression matrix with transformation data_transformed <- cytof_exprsExtract(fcsFile = file, comp = FALSE, transformMethod = "cytofAsinh") ## If analysing flow cytometry data, you can set comp to TRUE or ## provide a transformation matrix to apply compensation ## If you have multiple FCS files, expression can be extracted and combined combined_data_transformed <- cytof_exprsMerge(fcsFiles = file, comp=FALSE, transformMethod = "cytofAsinh", mergeMethod = "all") ## change mergeMethod to apply different combination strategy ## Take a look at the extracted expression matrix head(data_transformed[ ,1:3]) ``` ### Cell Subset Detection ```{r, message=FALSE, } ## use clustering algorithm to detect cell subsets ## to speed up our test here, we only use 100 cells data_transformed_1k <- data_transformed[1:100, ] ## run PhenoGraph cluster_PhenoGraph <- cytof_cluster(xdata = data_transformed_1k, method = "Rphenograph") ## run ClusterX data_transformed_1k_tsne <- cytof_dimReduction(data=data_transformed_1k, method = "tsne") cluster_ClusterX <- cytof_cluster(ydata = data_transformed_1k_tsne, method="ClusterX") ``` ```{r, eval=FALSE} ## run DensVM (takes long time, we skip here) cluster_DensVM <- cytof_cluster(xdata = data_transformed_1k, ydata = data_transformed_1k_tsne, method = "DensVM") ``` ```{r, message=FALSE} ## run FlowSOM with cluster number 15 cluster_FlowSOM <- cytof_cluster(xdata = data_transformed_1k, method = "FlowSOM", FlowSOM_k = 12) ## combine data data_1k_all <- cbind(data_transformed_1k, data_transformed_1k_tsne, PhenoGraph = cluster_PhenoGraph, ClusterX=cluster_ClusterX, FlowSOM=cluster_FlowSOM) data_1k_all <- as.data.frame(data_1k_all) ``` ### Cell Subset Visualization and Interpretation ```{r, message=FALSE} ## PhenoGraph plot on tsne cytof_clusterPlot(data=data_1k_all, xlab="tsne_1", ylab="tsne_2", cluster="PhenoGraph", sampleLabel = FALSE) ## PhenoGraph cluster heatmap PhenoGraph_cluster_median <- aggregate(. ~ PhenoGraph, data = data_1k_all, median) cytof_heatmap(PhenoGraph_cluster_median[, 2:37], baseName = "PhenoGraph Cluster Median") ``` ```{r} ## ClusterX plot on tsne cytof_clusterPlot(data=data_1k_all, xlab="tsne_1", ylab="tsne_2", cluster="ClusterX", sampleLabel = FALSE) ## ClusterX cluster heatmap ClusterX_cluster_median <- aggregate(. ~ ClusterX, data = data_1k_all, median) cytof_heatmap(ClusterX_cluster_median[, 2:37], baseName = "ClusterX Cluster Median") ``` ```{r} ## FlowSOM plot on tsne cytof_clusterPlot(data=data_1k_all, xlab="tsne_1", ylab="tsne_2", cluster="FlowSOM", sampleLabel = FALSE) ## FlowSOM cluster heatmap FlowSOM_cluster_median <- aggregate(. ~ FlowSOM, data = data_1k_all, median) cytof_heatmap(FlowSOM_cluster_median[, 2:37], baseName = "FlowSOM Cluster Median") ``` ### Inference of Subset Progression ```{r, message=FALSE} ## Inference of PhenoGraph cluster relatedness PhenoGraph_progression <- cytof_progression(data = data_transformed_1k, cluster = cluster_PhenoGraph, method="isomap", clusterSampleSize = 50, sampleSeed = 5) p_d <- data.frame(PhenoGraph_progression$sampleData, PhenoGraph_progression$progressionData, cluster = PhenoGraph_progression$sampleCluster, check.names = FALSE) ## cluster relatedness plot cytof_clusterPlot(data=p_d, xlab="isomap_1", ylab="isomap_2", cluster="cluster", sampleLabel = FALSE) ## marker expression profile markers <- c("(Sm150)Di", "(Yb173)Di") cytof_colorPlot(data=p_d, xlab="isomap_1", ylab="isomap_2", zlab = markers[1], limits = range(p_d[,1:52])) cytof_colorPlot(data=p_d, xlab="isomap_1", ylab="isomap_2", zlab = markers[2], limits = range(p_d[,1:52])) cytof_progressionPlot(data=p_d, markers=markers, orderCol="isomap_1", clusterCol = "cluster") ``` ```{r, message=FALSE} ## Inference of ClusterX cluster relatedness ClusterX_progression <- cytof_progression(data = data_transformed_1k, cluster = cluster_ClusterX, method="isomap", clusterSampleSize = 30, sampleSeed = 3) c_d <- data.frame(ClusterX_progression$sampleData, ClusterX_progression$progressionData, cluster=ClusterX_progression$sampleCluster, check.names = FALSE) ## cluster relatedness plot cytof_clusterPlot(data=c_d, xlab="isomap_1", ylab="isomap_2", cluster="cluster", sampleLabel = FALSE) ## marker expression profile markers <- c("(Sm150)Di", "(Yb173)Di") cytof_colorPlot(data=c_d, xlab="isomap_1", ylab="isomap_2", zlab = markers[1], limits = range(c_d[,1:52])) cytof_colorPlot(data=c_d, xlab="isomap_1", ylab="isomap_2", zlab = markers[2], limits = range(c_d[,1:52])) cytof_progressionPlot(data=c_d, markers, orderCol="isomap_1", clusterCol = "cluster") ``` ### Post-processing ```{r, eval=FALSE} ## save analysis results to FCS file cytof_addToFCS(data_1k_all, rawFCSdir=dir, analyzedFCSdir="analysed_FCS", transformed_cols = c("tsne_1", "tsne_2"), cluster_cols = c("PhenoGraph", "ClusterX", "FlowSOM")) ``` In addition, expression values for cells in a specified cluster can be extracted using `cytof_clusterMtrx()` ```{r} ## See documentation, this function uses the output of main cytofkit cuntion as its input ?cytof_clusterMtrx ``` ### Get package update news ```{r} cytofkitNews() ``` # Session Information ```{r} sessionInfo() ```