Gene Set Collection
Navigation

To access the Gene Set Collection panel, click on the Genes Collection tab on the Data and Analysis Panel.
What is a Gene Set Collection
In CytoAnalyst, a Gene Set is a collection of unique genes. A Gene Set Collection is a collection of Gene Sets.
You can create multiple Gene Set Collections for different purposes, such as different levels of cell types or different species. Gene Set Collections are used to visualize gene expression for the whole set or to quickly select genes for visualization. They are also used for Cell Type Annotation and Gene Set Enrichment Analysis.
You can also perform Differential Expression Analysis, extract Differentially Expressed Genes (DEGs), and add them to a Gene Set Collection. See Differential Expression Analysis for more details.
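The relationship between the two terms can be pictured as a small data model. The sketch below is illustrative only: the class and field names (`GeneSet`, `GeneSetCollection`, `add_genes`) are assumptions made for this example and do not describe how CytoAnalyst stores collections internally.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GeneSet:
    """A named collection of unique genes (hypothetical model)."""
    name: str
    description: str = ""
    genes: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Keep the first occurrence of each gene and drop repeats.
        self.genes = list(dict.fromkeys(self.genes))

    def add_genes(self, new_genes: list[str]) -> None:
        # Adding genes (for example, DEGs) never introduces duplicates.
        self.genes = list(dict.fromkeys(self.genes + list(new_genes)))


@dataclass
class GeneSetCollection:
    """A named collection of gene sets (hypothetical model)."""
    name: str
    description: str = ""
    gene_sets: list[GeneSet] = field(default_factory=list)


t_cells = GeneSet("T cells", "Example markers", ["CD3D", "CD3E", "CD3D"])
t_cells.add_genes(["CD2", "CD3E"])
collection = GeneSetCollection("Human cell types", gene_sets=[t_cells])
print(t_cells.genes)  # ['CD3D', 'CD3E', 'CD2']
```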
New Gene Set Collection
To create a new Gene Set Collection, click on the New Collection/New Set button.

CytoAnalyst supports multiple ways to create a new Gene Set Collection:
Upload a file: Upload a file containing gene sets.
Text Input: Manually enter gene sets in the text box.
Manual Input: Manually enter gene sets in the form.
From database: Select gene sets from the database.
Select the appropriate option from the top of the form to create a new Gene Set Collection.
Upload a file

CytoAnalyst supports the following file formats for gene set upload:
.txt or .tsv file containing gene sets.
.gmt file containing gene sets.
A .txt or .tsv file is a gene sets file without gene set descriptions, and a .gmt file is a gene sets file with gene set descriptions.
The .txt or .tsv file should have the following format:
Each row represents a gene set, where the first column is the gene set name and the following columns are the genes in the set. Columns should be separated by a tab or a comma (,).
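As a rough illustration of the row layout described above, the following sketch parses text in which each row is a gene set name followed by its genes, separated by tabs or commas. The function name `parse_gene_set_lines` is hypothetical, and the handling of blank rows and repeated genes is an assumption; the guide does not say how CytoAnalyst treats those cases.

```python
from __future__ import annotations

import re


def parse_gene_set_lines(text: str) -> dict[str, list[str]]:
    """Parse rows of 'name<sep>gene1<sep>gene2...' where <sep> is a tab or a comma."""
    gene_sets: dict[str, list[str]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: blank rows are ignored
        fields = [f.strip() for f in re.split(r"[\t,]", line)]
        name = fields[0]
        genes = [g for g in fields[1:] if g]
        # A gene set holds unique genes, so repeats are dropped here.
        gene_sets[name] = list(dict.fromkeys(genes))
    return gene_sets


# One tab-separated row and one comma-separated row.
example = "T cells\tCD3D\tCD3E\tCD2\nB cells,CD79A,MS4A1,CD79A\n"
parsed = parse_gene_set_lines(example)
print(parsed)
```

Checking a file this way before uploading can catch rows that have a name but no genes.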
The .gmt file should have the following format:
Each row represents a gene set, where the first column is the gene set name, the second column is the gene set description, and the following columns are the genes in the set. Each column must be separated by a tab.
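For readers preparing an upload, here is a minimal sketch that renders gene sets as rows in the layout just described: name, description, then genes, all tab-separated. The helper name `to_gmt` and the example sets are assumptions for illustration, not part of CytoAnalyst.

```python
from __future__ import annotations


def to_gmt(gene_sets: dict[str, tuple[str, list[str]]]) -> str:
    """Render {name: (description, genes)} as tab-separated .gmt rows."""
    rows = []
    for name, (description, genes) in gene_sets.items():
        unique_genes = list(dict.fromkeys(genes))  # keep each gene once
        rows.append("\t".join([name, description, *unique_genes]))
    return "\n".join(rows) + "\n"


gmt_text = to_gmt({
    "T cells": ("T cell markers", ["CD3D", "CD3E", "CD2"]),
    "B cells": ("B cell markers", ["CD79A", "MS4A1"]),
})
print(gmt_text, end="")
```

Saving `gmt_text` to a file with a .gmt extension gives something that matches the stated layout. Names and descriptions must not contain tabs, since the tab is the column separator.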
Text Input

The text input box allows you to enter gene sets manually. The gene sets should be in the same format as the .gmt file, except that the genes can be separated by a comma (,).
Manual Input

The manual input form allows you to enter gene sets manually.
Click on Add Gene Set to add a new gene set.
Click on Remove to remove a gene set.
For each gene set, enter the gene set name, the gene set description, and the genes in the set. Genes should be separated by a comma (,).
From database

CytoAnalyst provides a database of gene sets that you can select from.
Click on Open gene set selection dialog to select gene sets from the database.

In this dialog:
Left panel: Lists all available gene sets in the database. You can search for gene sets by typing in the search box.
Right panel: Shows the selected gene sets.
To prevent mistakes in selecting gene sets, you can add at most 20 gene sets from the database to the collection at a time. To add more than 20 gene sets, repeat the process.
You can transform the genes in the gene sets to uppercase or lowercase by selecting the appropriate option from the dropdown menu.
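The two rules above, the limit of 20 gene sets per pass and the case transformation, can be sketched as follows. The function names, the `"upper"`/`"lower"` mode strings, and the removal of names that become identical after transformation are all assumptions for illustration; the guide does not specify the dropdown labels or how CytoAnalyst handles such collisions.

```python
from __future__ import annotations


def selection_rounds(names: list[str], limit: int = 20) -> list[list[str]]:
    """Split the wanted gene sets into passes of at most `limit` each."""
    return [names[i:i + limit] for i in range(0, len(names), limit)]


def transform_case(genes: list[str], mode: str) -> list[str]:
    """Convert gene names to upper or lower case, keeping each name once."""
    if mode == "upper":
        converted = [g.upper() for g in genes]
    elif mode == "lower":
        converted = [g.lower() for g in genes]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumption: names that collide after conversion are merged.
    return list(dict.fromkeys(converted))


wanted = [f"gene_set_{i}" for i in range(1, 46)]  # 45 hypothetical gene sets
rounds = selection_rounds(wanted)
print([len(r) for r in rounds])  # [20, 20, 5]
print(transform_case(["Cd3d", "CD3D", "Cd79a"], "upper"))  # ['CD3D', 'CD79A']
```

In this example, 45 gene sets require three passes through the selection dialog.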
Preview

After entering the gene sets, click on the Preview button to preview the gene sets.
Finally, click on the Save button to save the gene sets. The saved gene sets will be displayed in the Existing Collections tab. You will still be able to edit the gene sets in the collection once they are saved.
Add to an existing Gene Set Collection

You can add gene sets to an existing Gene Set Collection by clicking on the Add to Existing button. The form will ask you to select the collection you want to add the gene sets to.
Existing Gene Set Collection
In the Existing Collections tab, you can view all the existing Gene Set Collections.
Check the following image for what can be done with an existing Gene Set Collection.

Click + or - to expand or collapse the gene sets in the collection. Once expanded, you can see the details as follows:
A form to update collection name and description.
A form to add new gene sets to the collection.
A table to view the gene sets in the collection. This table:
Shows the gene set name, description, and number of genes in the set.
Allows you to:
add or remove genes from the gene set.
remove the gene set from the collection.
change the order of the gene sets in the collection.
rename and change the description of the gene set.
For each gene set, you have the option to use an LLM (Large Language Model) to infer the cell type for the gene set. This is useful for annotating gene sets obtained from differential expression analysis that compares cells among clusters. Click Infer Cell Type on the gene set to use the LLM to infer the cell type.