What is Monocle 3?
Monocle 3 is a scRNA-Seq data analysis toolkit developed by the Trapnell lab. It is written in R, and the Monocle project introduced the notion of pseudotime. Monocle 3 can help perform three major types of scRNA-Seq data analysis:
Clustering & classification of cells by putative cell types
Assignment of pseudotime to cells and, in turn, construction of single-cell trajectories to examine developmental processes.
Differential expression analysis to reveal differentially expressed genes between pseudotime ranges or clusters.
This blog post highlights general trajectory inference using Monocle 3 within OmicsBox.
Every cell emerges from a pre-existing cell by undergoing pre-determined transcriptional events. The developmental history of a differentiated cell, traced back to its progenitor, forms a cell lineage. Clustering assigns each cell to a discrete, isolated state, which does not reflect how cells behave in their native context: a cell may sit anywhere along a lineage, including at the decision points where lineages diverge. A scRNA-Seq experiment can capture up to millions of such cells in the middle of a transcriptional change, so treating them only as members of isolated clusters gives a limited view. Identifying these dynamic cell states is therefore vital for understanding cell fate decisions.
Trajectory Inference Tools
Computational tools developed for this kind of analysis are known as trajectory inference methods. They help characterize cell fate decisions by ordering cells along a dynamic transcriptional measure called pseudotime. Pseudotime reflects how far a particular cell is from its progenitors/precursors in transcriptional space. Aligning cells along the pseudotime continuum produces a cell trajectory that is comparable to a cell lineage [Figure 1].
Pseudotime is a latent (unobserved) dimension that measures a cell’s progress through the transition. It is related to, but not necessarily the same as, the experimental capture time; instead, it serves as a proxy for actual time. Mathematically, it is the geodesic distance of each cell from a single progenitor/root node or from a set of them.
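To make the geodesic-distance idea concrete, here is a minimal Python sketch. It is not Monocle 3 code (Monocle 3 is an R package); it only illustrates the concept under simplifying assumptions. The graph, the node names, and the function `pseudotime_from_roots` are hypothetical, and edge lengths stand in for distances between neighbouring points of a trajectory graph.

```python
import heapq


def pseudotime_from_roots(graph, roots):
    """Shortest-path (geodesic) distance from the nearest root to every node.

    graph: dict mapping node -> list of (neighbour, edge_length) pairs.
    roots: iterable of root node names; each root gets pseudotime 0.
    Nodes that cannot be reached from any root keep an infinite value.
    """
    dist = {node: float("inf") for node in graph}
    heap = []
    for root in roots:
        dist[root] = 0.0
        heapq.heappush(heap, (0.0, root))
    # Dijkstra's algorithm started from all roots at once.
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale queue entry, a shorter path was already found
        for neighbour, length in graph[node]:
            candidate = d + length
            if candidate < dist[neighbour]:
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


# Hypothetical toy trajectory: one root that branches into two fates.
toy_graph = {
    "root": [("mid", 1.0)],
    "mid": [("root", 1.0), ("fate_a", 2.0), ("fate_b", 0.5)],
    "fate_a": [("mid", 2.0)],
    "fate_b": [("mid", 0.5)],
}
toy_pseudotime = pseudotime_from_roots(toy_graph, ["root"])
```

In this toy graph the two branch tips end up with different pseudotime values even though they share the same parent, because pseudotime measures distance along the graph rather than the number of steps.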
Monocle 3 needs a root node, or starting point, which serves as the reference for trajectory construction. This reference is usually provided in one of the following ways:
Progenitor Cells: If users know which cell types are present in the dataset, Monocle 3 can use that information to look for the principal point in the trajectory graph that has the highest number of progenitor cells as its neighbors. That point is assigned pseudotime 0, and the geodesic distance of every other cell is calculated from it.
Using Experimental Time: The datasets best suited to trajectory analysis are those that include a laboratory capture time. Trajectory inference can be supplemented with the experimental time to guide Monocle 3 toward the starting points. This information is supplied in the experimental design file, sometimes called a cell-metadata descriptor file in the single-cell field. Ideally, the user should select the batches collected/sequenced/cultured/extracted at the beginning of the experiment (according to the experimental design).
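Both ways of choosing a root reduce to the same counting step, which can be sketched in Python as follows. This is an illustration only, not the Monocle 3 or OmicsBox implementation: the function `pick_root_node`, the node names, and the labels are hypothetical, and the sketch assumes each cell has already been mapped to its nearest principal point of the trajectory graph.

```python
from collections import Counter


def pick_root_node(nearest_node, labels, root_label):
    """Return the principal point closest to the most cells with root_label.

    nearest_node: for each cell, the name of its nearest principal point.
    labels: for each cell, a cell type or a capture time point.
    root_label: the label that marks progenitor (or earliest) cells.
    """
    counts = Counter(
        node for node, label in zip(nearest_node, labels) if label == root_label
    )
    if not counts:
        raise ValueError(f"no cells labelled {root_label!r}")
    return counts.most_common(1)[0][0]


# Hypothetical toy data: six cells and three principal points.
nearest_node = ["Y_1", "Y_1", "Y_2", "Y_3", "Y_1", "Y_2"]
cell_type = ["HSC", "HSC", "HSC", "Erythroid", "Myeloid", "Myeloid"]
capture_time = ["24h", "24h", "0h", "48h", "24h", "0h"]

# Root chosen from known progenitor cells.
root_by_cell_type = pick_root_node(nearest_node, cell_type, "HSC")
# Root chosen from the earliest experimental time point.
root_by_time = pick_root_node(nearest_node, capture_time, "0h")
```

The two strategies can disagree, as they do in this toy data, which is one reason to check the chosen root against what is known about the biology.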
A trajectory is, mathematically, a graph superimposed on a UMAP embedding. Each point in the UMAP corresponds to a cell analyzed in the experiment, and every cell belongs to a cluster and to a partition (a supercluster). Monocle 3 can draw a trajectory among the clusters within a partition or across clusters of different partitions. This choice can drastically change the inference, but it gives the user more control over it. Depending on the biological system under study, the trajectory can be branched (e.g., HSC differentiation), bifurcated (one parent, two daughter cells), cyclic, or linear (one parent, one daughter).
Trajectory inference aims to determine the pattern of a dynamic process experienced by cells and then arrange the cells according to their progression through that process. Once pseudotime is assigned, differential expression analysis can be performed to reveal how gene expression changes throughout the biological process. This approach also makes it possible to evaluate the genes expressed at branching points.
The user should know the dataset well, because any dataset can be forced through trajectory inference. We therefore recommend that users first judge whether the dataset is likely to contain intermediate cells undergoing a biological process. For example, drawing a trajectory for a group of cells belonging to a developmental process will provide more meaningful results than doing so for cells from different tissues.
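As a rough illustration of what "expression changing along pseudotime" means, the Python sketch below ranks genes by their Spearman correlation with pseudotime. This is a deliberately simple stand-in and not the statistical test that Monocle 3 or OmicsBox applies. The expression values, the gene names, and the helper functions are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation; returns 0.0 if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: five cells ordered by pseudotime, three genes.
pseudotime = [0.0, 1.0, 2.0, 3.0, 4.0]
expression = {
    "gene_up": [1, 3, 4, 8, 9],
    "gene_down": [9, 7, 5, 2, 0],
    "gene_flat": [5, 5, 5, 5, 5],
}
trend = {gene: spearman(pseudotime, counts) for gene, counts in expression.items()}
```

A correlation like this only detects monotonic trends. A gene that peaks in the middle of the trajectory or at a branching point would be missed, which is why dedicated trajectory-aware tests are preferable in practice.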
Additional insights with OmicsBox
Comprehensive and streamlined workflow
Monocle 3 in OmicsBox offers an end-to-end workflow that takes the raw count table and the experimental design file as input [Figure 2]. It produces tabulated, well-formatted output that can be used for further downstream analysis. Results are divided into tables and interactive visualizations, which makes curation and evaluation easy for the end user. Statistical plots, such as distribution plots, can be produced from the side action panel with a few clicks.
Interactive data visualization
One of the significant improvements offered by OmicsBox is the interactive visualization of the assigned pseudotime [Figure 3]. The dynamics of the trajectory can be explored with the interactive image editor tools offered by OmicsBox, which not only provide better insight into the progression pattern but also give greater control over the trajectory plot.
Integration with Differential expression wizard
Once trajectory inference has completed successfully, Monocle 3 in OmicsBox offers direct integration with the differential expression wizard. The raw counts can then be used directly to perform differential expression analysis using the pseudotime ranges or the cluster labels [Figure 4].
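The idea of grouping cells by pseudotime range before a differential expression comparison can be sketched in Python as below. This is not how OmicsBox implements the step. The function `label_pseudotime_ranges`, the cell names, and the cut points are hypothetical and chosen only for illustration.

```python
import bisect


def label_pseudotime_ranges(pseudotime, boundaries):
    """Assign each cell to a pseudotime range.

    pseudotime: dict mapping cell name -> pseudotime value.
    boundaries: ascending cut points; n cut points define n + 1 ranges.
    A value equal to a cut point falls into the range that starts there.
    """
    labels = {}
    for cell, value in pseudotime.items():
        index = bisect.bisect_right(boundaries, value)
        labels[cell] = f"range_{index + 1}"
    return labels


# Hypothetical toy data: four cells split at pseudotime 5 and 10.
cell_pseudotime = {"cell_1": 0.0, "cell_2": 4.9, "cell_3": 5.0, "cell_4": 12.3}
groups = label_pseudotime_ranges(cell_pseudotime, [5.0, 10.0])
```

The resulting labels play the same role as cluster labels: they define the groups of cells whose raw counts are compared against each other.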
Qiu, Xiaojie, et al. “Reversed Graph Embedding Resolves Complex Single-Cell Trajectories.” Nature Methods, vol. 14, no. 10, 21 Aug. 2017, pp. 979–982, 10.1038/nmeth.4402.
Qiu, Xiaojie, et al. “Single-Cell mRNA Quantification and Differential Analysis with Census.” Nature Methods, vol. 14, no. 3, 23 Jan. 2017, pp. 309–315, 10.1038/nmeth.4150.
Trapnell, Cole, et al. “Pseudo-Temporal Ordering of Individual Cells Reveals Dynamics and Regulators of Cell Fate Decisions.” Nature Biotechnology, vol. 32, no. 4, 1 Apr. 2014, pp. 381–386, 10.1038/nbt.2859.
Reid, John E., and Lorenz Wernisch. “Pseudotime Estimation: Deconfounding Single Cell Time Series.” Bioinformatics, vol. 32, no. 19, 17 June 2016, pp. 2973–2980, 10.1093/bioinformatics/btw372.