CytoAnalyst Help

Getting Started

This tutorial provides a quick start for exploring the features of CytoAnalyst using our demo dataset. For more details on each feature and element of the app, see the corresponding documentation. For a more in-depth look at the analysis process, check out our Case Studies.

Workflow

The diagram below shows the steps you will perform in this tutorial:

Create a new study

Upload data

Basic data visualization

Perform clustering

Perform sub-clustering

Run differential expression analysis

Compare between clusters

Compare between conditions

Browse and visualize DE results

Perform cell type enrichment analysis

Annotate clusters

Create a New Study

To create a new study, navigate to the Study Management page and fill in the required fields in the Create Study form.

Create Study Form

The Name and Description fields are required. You can modify these details later if needed. Click Create Study to create a new study.

A new study will be created and added to the study table. You can use this table to manage your studies, including viewing, editing, and sharing them.

Study Table

Upload Data

Use Demo Data

In the study table, hover over the View action of the study you just created. A pop-up will appear with options for navigating the study. Click the Data Management option to go to the data management page.

Data Upload Page

On the data management page, you can upload your data files. In this tutorial, we will use the demo dataset provided.

Click the Use Demo Data button on the Data Upload form to load the demo dataset and settings. We recommend leaving these settings unchanged for this tutorial.

The demo dataset is a single-cell RNA-seq PBMC dataset from Guo et al. with 14,783 cells and 17,374 genes. After selecting the demo dataset, the following parameters will be set:

Data Parameters
  • File Type: AnnData (.h5ad). Indicates the file format of the dataset.

  • File: Guo.h5ad. The name of the dataset file.

  • Assay: Default. The type of assay used in the dataset. In this case, it is the default assay.

  • Feature ID Column: feature_name. The column in the dataset that contains the feature IDs. In this case, it is the gene names.

  • Keep Embeddings: True. Indicates whether to keep the precomputed embeddings in the dataset.

  • Embeddings: umap - pca and tsne - pca. The precomputed visualizations and embeddings in the dataset to be kept.

  • Keep Metadata in h5ad file: True. Indicates whether to keep the metadata in the dataset file.

  • Extra Metadata File: Empty. An additional metadata file to be uploaded. In this case, no extra metadata file is uploaded.

  • Has Multiple Samples: True. Indicates whether the dataset has multiple samples.

  • Sample ID Is In: Metadata. Indicates where the sample IDs are located. In this case, they are in the metadata.

  • Sample ID Column: sample. The column in the dataset that contains the sample IDs.

Finally, click the Submit button to start the data preprocessing. A job will be created in the background to process the data. Once the job is completed, the right panel will display the options for data filtering. Visit Study Logs to learn more about monitoring analysis jobs and system status.

Data Filtering

Data Filtering

In the data filtering panel, you can filter cells based on different criteria, including the number of UMI counts, the number of genes expressed, and the percentage of mitochondrial genes.

Check out Data Management for more details on data filtering.
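For readers who want to see what these three criteria measure, here is a minimal Python sketch on a made-up count matrix. It is not CytoAnalyst's implementation: the function names, thresholds, and toy counts are all hypothetical, and it assumes human gene symbols, where mitochondrial genes start with "MT-".

```python
# Toy count matrix: one dict per cell mapping gene symbol -> UMI count (made-up values).
cells = {
    "cell_1": {"CD3D": 5, "LYZ": 0, "MT-CO1": 1, "MS4A1": 2},
    "cell_2": {"CD3D": 0, "LYZ": 1, "MT-CO1": 9, "MS4A1": 0},
    "cell_3": {"CD3D": 1, "LYZ": 0, "MT-CO1": 0, "MS4A1": 0},
}


def qc_metrics(counts):
    """Compute the three per-cell quantities used for filtering."""
    n_umi = sum(counts.values())
    n_genes = sum(1 for c in counts.values() if c > 0)
    # Assumption: mitochondrial genes are recognized by the "MT-" prefix.
    mito = sum(c for gene, c in counts.items() if gene.startswith("MT-"))
    pct_mito = 100.0 * mito / n_umi if n_umi else 0.0
    return {"n_umi": n_umi, "n_genes": n_genes, "pct_mito": pct_mito}


def passes(metrics, min_umi, min_genes, max_pct_mito):
    """Return True if a cell satisfies all three thresholds."""
    return (
        metrics["n_umi"] >= min_umi
        and metrics["n_genes"] >= min_genes
        and metrics["pct_mito"] <= max_pct_mito
    )


metrics = {cell_id: qc_metrics(counts) for cell_id, counts in cells.items()}
# Hypothetical thresholds chosen only to fit the toy data.
kept = [
    cell_id
    for cell_id, m in metrics.items()
    if passes(m, min_umi=2, min_genes=2, max_pct_mito=20.0)
]
print(kept)
```

In this toy example, cell_2 is dropped for its high mitochondrial fraction and cell_3 for its low UMI count.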

In this tutorial, we will not apply any data filtering as the demo dataset is already preprocessed.

Click Save data to open the dialog for saving the data.

Save Data

This dialog allows you to choose which samples and embeddings to save. In this tutorial, we will save all samples and embeddings.

Click Save data to save the data. The newly saved data will be added to the data table.

Sample Table

Basic Data Visualization

Once the data is saved, you can start exploring the data and performing analyses on the Analysis page. To access the Analysis page, scroll to the top of the page and click the Analysis button.

The Analysis page provides a comprehensive view of the data and analysis tools. The basic layout of the Analysis page is shown below:

Analysis Page
  • Top Toolbar: Contains dropdown menus for selecting embedding, data normalization, plot type, blending mode, and color map.

  • Left Sidebar: Contains the label selection panel for selecting labels to visualize.

  • Bottom Drawer: Contains all analysis tools.

Each of these panels can be expanded or collapsed by clicking its collapse icon.

In this tutorial, we will start with basic data visualization using metadata and gene sets. Visit Data Visualization for more details on data visualization.

Visualize one label at a time

To visualize one label at a time, set the Blending mode to Replace on the top toolbar. Next, click the coloring icon to select the label you want to visualize.

Visualize One Label

Visualize multiple labels

To visualize multiple labels, set the Blending mode to Separate on the top toolbar. Next, click the coloring icon to select the labels you want to visualize.

Visualize Multiple Labels

Update color mapping and other settings

Click the coloring icon on the top toolbar to update the color mapping and other settings.

Update Color Mapping

Here you will be able to:

  • Update the color mapping

  • Change the plot title

  • Arrange the plot by simply dragging it up or down

  • Change how many plots are displayed in a row

  • Enable and disable tooltips

  • Synchronize the zoom and pan across all plots

  • And more

Visualize genes/features

To visualize a gene or feature, follow these steps:

  • Switch to the Feature tab in the left sidebar

  • Enter the gene name in the search box

  • Click on the gene name to visualize it

Visualize a Gene

Visualize a set of genes/features

To visualize a set of genes or features, we first need to create a gene set collection. Visit Gene Set Collection for more details on creating gene set collections. In this tutorial, we will use an example gene set collection.

Follow these steps to create a gene set collection:

  • Click on the Gene Set Collection tab on the Bottom Drawer

  • Click on New Collection to create a new gene set collection

  • Select Input Type as Text Input

  • Enter a name and the following gene sets in the text box:

B BANK1 CD74 CD79A CD79B HLA-DQA1 IGHM MEF2C MS4A1 RALGPS2 TNFRSF13C
CD4 T CD3D CD3G CD4 IL7R LDHB LTB MAL TMSB10 TPT1 TRAC
CD8 T CD3D CD3E CD3G CD8A CD8B CTSW HCST LINC02446 TMSB10 TRAC
DC CCDC88A CD74 CST3 HLA-DMA HLA-DPA1 HLA-DPB1 HLA-DQA1 HLA-DQB1 HLA-DRA HLA-DRB1
Mono AIF1 CTSS FCN1 LYZ MNDA NEAT1 PSAP S100A9 SERPINA1 TYROBP
NK CD247 CST7 FCER1G GNLY GZMB KLRD1 KLRF1 NKG7 PRF1 TYROBP
other T CD3D CST7 GZMA GZMK KLRB1 KLRG1 LYAR NKG7 TRDC TRGC2
  • Click Save to save the gene set collection

Create Gene Set Collection

The newly created gene set collection will be added to the Feature tab in the left sidebar. Click on the gene set collection to visualize the gene sets as a whole.

Visualize Gene Set Collection
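To give a sense of what visualizing a gene set "as a whole" can mean, the sketch below stores two of the gene sets from this tutorial in a Python dictionary and scores each cell by the mean expression of the set's genes. This is only one simple scoring choice, used here as an assumption; the tutorial does not state how CytoAnalyst combines genes. The function name and the expression values are made up.

```python
# Two gene sets from the example collection above (name -> list of genes).
gene_sets = {
    "B": ["BANK1", "CD74", "CD79A", "CD79B", "HLA-DQA1",
          "IGHM", "MEF2C", "MS4A1", "RALGPS2", "TNFRSF13C"],
    "NK": ["CD247", "CST7", "FCER1G", "GNLY", "GZMB",
           "KLRD1", "KLRF1", "NKG7", "PRF1", "TYROBP"],
}

# Made-up normalized expression for two cells; only a few genes are measured.
expression = {
    "cell_a": {"CD79A": 3.0, "MS4A1": 2.0, "NKG7": 0.0, "GNLY": 0.0},
    "cell_b": {"CD79A": 0.0, "MS4A1": 0.0, "NKG7": 4.0, "GNLY": 2.0},
}


def gene_set_score(cell_expression, genes):
    """Mean expression over the genes of the set that are present in the cell's data."""
    values = [cell_expression[g] for g in genes if g in cell_expression]
    return sum(values) / len(values) if values else 0.0


scores = {
    cell_id: {name: gene_set_score(expr, genes) for name, genes in gene_sets.items()}
    for cell_id, expr in expression.items()
}
print(scores)
```

A cell with high values for the B set and low values for the NK set, such as cell_a here, would stand out when the B gene set is used for coloring.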

Filter data for individual plots

To filter data for individual plots, simply click on the plot legend to filter the data points for categorical labels, or drag the range slider to filter the data points for continuous labels.

Filter data for all plots

To filter data for all plots, expand the label in the left sidebar and uncheck the data points you want to filter out. For continuous labels, provide the range in the input box or select from the histogram.

Filter Data

Select data points

To select data points:

  • Click on the Select Tool icon on the top toolbar. A selection tool with options will appear.

  • Click and drag on the plot to select data points.

Select Data Points

You can select from multiple plots and the selection will be applied to all plots.

The selection tool is useful when you need to filter cells for cell annotation, create a new embedding, perform differential expression analysis, and more.

Perform Clustering

To perform clustering, click on the Clustering tab on the Bottom Drawer, and then click on the New Clustering button.

Here, we will perform clustering using the Louvain algorithm on the pre-computed embedding with the resolution set to 1.0.

New Clustering

Click Create to start the clustering analysis. Once the job is completed, the clustering results will be added to the Clustering Table and ready for visualization in the Left Sidebar.

Clustering Visualization

To learn more about clustering, visit Clustering Analysis.
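The Louvain algorithm searches for a partition that maximizes modularity, and the resolution parameter controls how strongly large communities are penalized. The sketch below does not run Louvain itself and is not CytoAnalyst's code. It only evaluates the resolution-scaled modularity of two candidate partitions on a made-up six-node graph, to illustrate why a higher resolution tends to produce more, smaller clusters.

```python
from collections import defaultdict


def modularity(edges, community, resolution=1.0):
    """Modularity of a partition of an undirected, unweighted graph.

    Q = sum over communities c of [ L_c / m - resolution * (d_c / (2 m))^2 ],
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edges)
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - resolution * (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}   # one cluster per triangle
merged = {node: "a" for node in range(6)}                  # everything in one cluster

for resolution in (1.0, 0.2):
    print(resolution, modularity(edges, split, resolution), modularity(edges, merged, resolution))
```

At resolution 1.0 the two-cluster partition scores higher, while at resolution 0.2 the single merged cluster does, so an optimizer would prefer fewer clusters at the lower setting.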

Perform Sub-Clustering

CytoAnalyst allows you to perform sub-clustering on any cluster in the clustering results. You can perform sub-clustering on multiple clusters at once.

To perform sub-clustering, you need to:

  1. Create a new embedding for each cluster you want to sub-cluster.

  2. Perform clustering on the new embeddings.

It is not recommended to perform sub-clustering on the global embedding as it may not capture the local structure of the clusters. To learn more about embedding, visit Embedding.

For demonstration purposes, we will perform sub-clustering on one of the clusters. Here we will sub-cluster Cluster 4 using the KMeans algorithm with the number of clusters set to 2.

Cluster 4

Follow these steps to create a new embedding for Cluster 4:

  • First, open the Embedding tab in the Bottom Drawer.

  • Click on the New Embedding button and enter a name for the embedding.

  • Next, filter the data points for Cluster 4 using Clustering Filters in the Cell Filtering panel.

  • Leave the default settings for the embedding method and parameters.

  • Finally, click Create to create the new embedding.

New Embedding for Cluster 4

CytoAnalyst will start a job to compute the new embedding. Once the job is completed, the new embedding will be added to the Embedding Table. This embedding can also be selected for visualization in the Top Toolbar.

Next, follow these steps to perform sub-clustering on Cluster 4:

  • Click on the Clustering tab in the Bottom Drawer.

  • Click on the New Clustering button.

  • Select the new embedding for Cluster 4 in the Embedding dropdown.

  • Enter a name for the clustering analysis.

  • Choose the KMeans algorithm and set the number of clusters to 2.

  • Click Create to start the sub-clustering analysis.

Sub-Clustering on Cluster 4

Once the job is completed, the sub-clustering results will be added to the Clustering Table, and ready for visualization in the Left Sidebar.
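The two-step idea (restrict to one cluster, then cluster again) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not CytoAnalyst's implementation: the 2D coordinates are made up and stand in for the new embedding of Cluster 4, the k-means routine is a basic Lloyd's algorithm with a deterministic farthest-point start, and the "4.0"/"4.1" naming of sub-clusters is hypothetical.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=50):
    """Basic Lloyd's algorithm; returns one cluster index per point."""
    # Deterministic start: first point, then the point farthest from chosen centers.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


# (cell id, parent cluster, made-up 2D coordinates)
cells = [
    ("c1", "4", (0.0, 0.1)),
    ("c2", "4", (0.2, 0.0)),
    ("c3", "4", (5.0, 5.1)),
    ("c4", "4", (5.2, 4.9)),
    ("c5", "7", (9.0, 0.0)),
]

# Step 1: keep only the cells of Cluster 4.
subset = [(cell_id, xy) for cell_id, cluster, xy in cells if cluster == "4"]
# Step 2: cluster the subset into 2 groups.
labels = kmeans([xy for _, xy in subset], k=2)
sub_labels = {cell_id: f"4.{lab}" for (cell_id, _), lab in zip(subset, labels)}
print(sub_labels)
```

Cells outside Cluster 4, such as c5, receive no sub-cluster label.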

Visualize Clustering Results

You can visualize the clustering results as any other label in the Left Sidebar. In this example, we will learn how to aggregate the clustering and sub-clustering results in a single plot.

First, change the Blending mode to Separate in the Top Toolbar so that each result is shown in its own plot. Next, select the clustering and sub-clustering results in the Left Sidebar to visualize them.

Visualize Clustering Results

This visualization shows the clustering and sub-clustering results side by side. In CytoAnalyst, you can aggregate multiple labels in a single plot by using the Blending mode.

Follow these steps to visualize the clustering and sub-clustering results in a single plot:

  • Click the coloring icon on the top toolbar to open the visualization settings.

  • On the table shown in the settings, navigate to the Blending column.

  • Select Aggregate as the blending mode for the sub-clustering result.

Aggregate Clustering Results

The clustering and sub-clustering results will be aggregated in a single plot.

Aggregated Clustering Plot
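One way to think about the aggregated plot is as a single combined label per cell: cells that have a sub-cluster label show it, and all other cells keep their original cluster label. The snippet below sketches that reading with made-up cell IDs and labels. It is an assumption about the result, not a description of how the Aggregate mode is implemented.

```python
# Made-up labels: every cell has a cluster, only Cluster 4 cells have a sub-cluster.
clustering = {"c1": "4", "c2": "4", "c3": "4", "c4": "7"}
sub_clustering = {"c1": "4.0", "c2": "4.1", "c3": "4.1"}

# Prefer the sub-cluster label where it exists, otherwise fall back to the cluster.
aggregated = {
    cell_id: sub_clustering.get(cell_id, cluster)
    for cell_id, cluster in clustering.items()
}
print(aggregated)
```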

Please visit Data Visualization for more details on visualizing data in CytoAnalyst.

Perform Cell Type Enrichment Analysis

CytoAnalyst provides multiple tools for enriching cells to help with cell annotation. You can enrich cells using genes from gene set collections, differentially expressed genes, or using pretrained models to calculate cell type scores.

In this example, we will score cells for selected cell types using a pretrained model provided in CytoAnalyst.

To perform cell enrichment analysis using a pretrained model, follow these steps:

  • Click on the Enrichment tab in the Bottom Drawer.

  • Click on the New Enrichment button.

  • Choose the Using Cell Type Scoring button.

  • Fill in the required fields, such as the name of the enrichment analysis, organism, and cell types.

  • Click Submit to start the enrichment analysis.

Cell Type Enrichment Analysis

Once the job is completed, the cell type enrichment results will be added to the Enrichment Table and ready for visualization in the Left Sidebar.

Cell Type Enrichment Results

Annotate Clusters

Create Annotation Label

In CytoAnalyst, metadata, clusters, and other analysis results are immutable and cannot be directly modified. Instead, you need to create an Annotation Label and make changes to this label.

To create a new Annotation Label, follow these steps:

  • Click on the Cell Annotation tab in the Bottom Drawer.

  • Click on the New Annotation button.

  • Enter a name and default value for the annotation label.

Create Annotation Label

CytoAnalyst allows you to copy values from existing labels to the new annotation label, as well as create annotations by aggregating enrichment results. Check out Cell Annotation for more details. In this tutorial, we will create a new annotation label with all values set to unassigned, and then manually annotate cells.

Once the annotation label is created, you can start visualizing the annotation label in the Left Sidebar.

Visualize Annotation Label

Annotate Cells

To annotate cells, click on the Edit Annotation button in the Cell Annotation panel. In the annotation editor, you can:

  • View the annotation label and values

  • Add other labels (metadata, clustering, etc.) to the annotation editor for reference and filtering

  • Edit the annotation values for selected cells

Edit Annotation

In this tutorial, we will only annotate a few clusters for demonstration purposes. First, we will arrange the visualization to show the annotation label, the clustering results, and some of the enrichment results side by side.

Arrange Visualization

Here we visualize 4 labels side by side:

  • Cell type: The newly created annotation label

  • Louvain: The clustering results using the Louvain algorithm

  • plasmablast: The enrichment results for plasmablasts

  • B cells: The enrichment results for B cells

From the enrichment scores for plasmablasts and B cells, we can infer that:

  • Clusters 12 and 14 are likely to be plasmablasts

  • Cluster 4 is likely to be B cells

For clusters 12 and 14, we will use the filtering tool to select the cells and annotate them as plasmablasts.

Filter Cells

For cluster 4, we will select cells from the visualization and annotate them as B cells.

  • First, click on the Select cells from plot button in the Cell Annotation panel.

  • Next, click and drag on the plot to select the cells.

  • Finally, select cells in the table and annotate them as B cells.

Select Cells

To annotate other clusters and sub-clusters, repeat the same process.
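The relationship between an immutable clustering result and an editable annotation label can be sketched as follows. The data structures and the annotate helper are hypothetical and the cell IDs are made up; only the cluster numbers and cell type names come from this tutorial.

```python
from types import MappingProxyType

# The clustering result is read-only, like analysis results in the app.
louvain = MappingProxyType({"c1": "4", "c2": "12", "c3": "14", "c4": "7"})

# A new annotation label starts with the default value for every cell.
cell_type = {cell_id: "unassigned" for cell_id in louvain}


def annotate(annotation, clustering, clusters, value):
    """Set `value` on every cell whose cluster is in `clusters`."""
    for cell_id, cluster in clustering.items():
        if cluster in clusters:
            annotation[cell_id] = value


annotate(cell_type, louvain, {"12", "14"}, "plasmablasts")
annotate(cell_type, louvain, {"4"}, "B cells")
print(cell_type)
```

The clustering itself is never changed; only the annotation label is, and cells in clusters that were not annotated keep the default value.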

Run Differential Expression Analysis

To run differential expression (DE) analysis, click on the Differential Expression tab in the Bottom Drawer. CytoAnalyst allows you to create DE analysis between any subset of cells. You can also quickly generate DE results between clusters, conditions, or any other labels. To learn more about DE analysis, visit Differential Expression Analysis.

Compare Between Clusters

To compare gene expression between clusters, follow these steps:

  1. Click on the New Differential Analysis button.

  2. Choose the By Cluster option.

  3. Enter a name for the DE analysis. Here {cluster} will be replaced by the cluster name.

  4. Select the comparison mode as With others.

  5. Choose the clustering results.

  6. Choose the clusters you want to compare with the others.

  7. Click Submit to start the DE analysis.

DE Analysis by Cluster

In this example, we are creating 18 DE analyses, one for each cluster against all other clusters.
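The By Cluster option with the With others mode can be pictured as generating one analysis per cluster, with the {cluster} placeholder filled in for each name. The sketch below mimics this with Python's str.format on made-up data. The name template, the function, and the dictionary layout are hypothetical and only illustrate how the groups are formed.

```python
# Made-up cluster assignment for four cells.
cell_cluster = {"c1": "0", "c2": "0", "c3": "1", "c4": "2"}

# Hypothetical name template; {cluster} is replaced by each cluster name.
name_template = "DE {cluster} vs others"


def one_vs_rest(cell_cluster, template):
    """Build one analysis definition per cluster: that cluster against all other cells."""
    analyses = []
    for cluster in sorted(set(cell_cluster.values())):
        analyses.append({
            "name": template.format(cluster=cluster),
            "group_1": [c for c, cl in cell_cluster.items() if cl == cluster],
            "group_2": [c for c, cl in cell_cluster.items() if cl != cluster],
        })
    return analyses


analyses = one_vs_rest(cell_cluster, name_template)
for analysis in analyses:
    print(analysis["name"], analysis["group_1"], analysis["group_2"])
```

With three clusters this yields three analyses; a clustering with 18 clusters would yield 18 in the same way.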

Compare Between Conditions

In the following example, we will compare gene expression between two conditions: Remission and Severe in the disease stage metadata. However, instead of comparing two big groups of cells, we will compare within the new annotation label we created.

To compare gene expression between conditions, follow these steps:

  1. Click on the By Annotation option.

  2. Enter a name for the DE analysis. Here {annotation} will be replaced by the annotation label name.

  3. Select Within as the comparison mode.

  4. Choose the annotation.

  5. Choose the annotation value you want to compare.

  6. Select disease stage as the metadata field for filtering cells in Group 1.

  7. Select remission as the value for Group 1.

  8. Select disease stage as the metadata field for filtering cells in Group 2.

  9. Select severe as the value for Group 2.

  10. Click Submit to start the DE analysis.

DE Analysis by Condition

In this example, we are creating 3 DE analyses, one each for the plasmablasts, B cells, and unassigned cells.
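The Within mode can be read as: for each annotation value, split its cells into Group 1 and Group 2 by the chosen metadata field, then compare the two groups. The sketch below shows that grouping for a single hypothetical gene and computes a log2 fold change of mean expression with a pseudocount. The expression values are made up, and the fold change is only an illustrative statistic; the tutorial does not state which test CytoAnalyst applies.

```python
import math

# Made-up cells: annotation value, disease stage, and expression of one hypothetical gene.
cells = [
    {"annotation": "B cells", "disease stage": "remission", "expr": 1.0},
    {"annotation": "B cells", "disease stage": "remission", "expr": 1.0},
    {"annotation": "B cells", "disease stage": "severe", "expr": 3.0},
    {"annotation": "B cells", "disease stage": "severe", "expr": 3.0},
    {"annotation": "plasmablasts", "disease stage": "remission", "expr": 7.0},
    {"annotation": "plasmablasts", "disease stage": "severe", "expr": 3.0},
]


def log2_fold_change(group_1, group_2, pseudocount=1.0):
    """log2 of (mean of group 1 + pseudocount) over (mean of group 2 + pseudocount)."""
    mean_1 = sum(group_1) / len(group_1)
    mean_2 = sum(group_2) / len(group_2)
    return math.log2((mean_1 + pseudocount) / (mean_2 + pseudocount))


results = {}
for value in sorted({c["annotation"] for c in cells}):
    # Group 1: remission cells of this annotation value; Group 2: severe cells.
    group_1 = [c["expr"] for c in cells
               if c["annotation"] == value and c["disease stage"] == "remission"]
    group_2 = [c["expr"] for c in cells
               if c["annotation"] == value and c["disease stage"] == "severe"]
    if group_1 and group_2:
        results[value] = log2_fold_change(group_1, group_2)
print(results)
```

A negative value means the gene is lower in Group 1 (remission) than in Group 2 (severe) within that annotation value.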

Browse and Visualize DE Results

Once the DE analysis is completed, the results will be added to the Differential Expression Table in the Existing Results tab.

CytoAnalyst allows you to view and visualize multiple DE results at once. To visualize the DE results, select the results you want to visualize in the table, then click the View selected button.

DE Results Table

The DE results will be displayed in a separate window.

DE Results Plot

In this window, you can:

  • View all the statistics for the DE results and the volcano plots

  • Select and add genes to the gene set collection based on the DE results

Last modified: 08 July 2025