Gene Set Collection
Navigation

To access the Gene Set Collection panel, click on the Genes Collection tab on the Data and Analysis Panel.
What is a Gene Set Collection
In CytoAnalyst, a Gene Set is a collection of unique genes. A Gene Set Collection is a collection of Gene Sets.
You can create multiple Gene Set Collections for different purposes, such as different levels of cell types or different species. Gene Set Collections are used to visualize gene expression for a whole set or to quickly select genes for visualization. They are also used for Cell Type Annotation and Gene Set Enrichment Analysis.
You can also perform Differential Expression Analysis, extract Differentially Expressed Genes (DEGs), and add them to a Gene Set Collection. See Differential Expression Analysis for more details.
New Gene Set Collection
To create a new Gene Set Collection, click on the New Collection/New Set button.

CytoAnalyst supports multiple ways to create a new Gene Set Collection:
Upload a file: Upload a file containing gene sets.
Text Input: Manually enter gene sets in the text box.
Manual Input: Manually enter gene sets in the form.
From database: Select gene sets from the database.
Select the appropriate option from the top of the form to create a new Gene Set Collection.
Upload a file

CytoAnalyst supports the following file formats for gene set upload:
.txt or .tsv file containing gene sets.
.gmt file containing gene sets.
A .txt or .tsv file holds gene sets without descriptions, whereas a .gmt file holds gene sets with descriptions.
The .txt or .tsv file should have the following format:
Each row represents a gene set, where the first column is the gene set name and the following columns are the genes in the set. Columns should be separated by a tab or a comma (,).
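The row layout for .txt and .tsv files can be illustrated with a small parser. This is only a sketch of the format as described here, not CytoAnalyst's own loader: the function name parse_gene_set_rows, the example gene sets, and the handling of blank lines and duplicate genes are all assumptions made for illustration.

```python
import re


def parse_gene_set_rows(text):
    """Parse rows of the form: name <sep> gene1 <sep> gene2 ...

    <sep> is a tab or a comma. Returns a dict mapping each gene set
    name to its list of genes. (Hypothetical helper, for illustration.)
    """
    gene_sets = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        fields = [field.strip() for field in re.split(r"[\t,]", line)]
        name, genes = fields[0], fields[1:]
        # A gene set holds unique genes: drop empty fields and repeated
        # genes while keeping the original order.
        gene_sets[name] = list(dict.fromkeys(g for g in genes if g))
    return gene_sets


# One tab-separated row and one comma-separated row (made-up example data).
example = "T cells\tCD3D\tCD3E\tCD2\nB cells,CD79A,MS4A1,CD79A\n"
parsed = parse_gene_set_rows(example)
print(parsed)
```

Because a comma acts as a column separator in this format, a gene set name that itself contains a comma would be split into several columns, so such names are best avoided in these files.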
The .gmt file should have the following format:
Each row represents a gene set, where the first column is the gene set name, the second column is the gene set description, and the following columns are the genes in the set. Columns must be separated by a tab.
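To make the .gmt layout concrete, the sketch below reads tab-separated rows into name, description, and gene fields. It follows the format described above, but it is an illustrative example rather than CytoAnalyst's actual parser: the function name parse_gmt, the sample rows, and the choice to reject rows with fewer than three columns are assumptions.

```python
def parse_gmt(text):
    """Parse .gmt-style text: name <TAB> description <TAB> gene1 <TAB> gene2 ...

    Returns a list of dicts with "name", "description", and "genes" keys.
    (Hypothetical helper, for illustration.)
    """
    gene_sets = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        fields = line.split("\t")
        if len(fields) < 3:
            # assumption: a row needs a name, a description, and >= 1 gene
            raise ValueError(
                f"line {line_no}: expected name, description, and at least one gene"
            )
        name, description, genes = fields[0], fields[1], fields[2:]
        gene_sets.append(
            {
                "name": name.strip(),
                "description": description.strip(),
                "genes": [g.strip() for g in genes if g.strip()],
            }
        )
    return gene_sets


# Two made-up rows in .gmt layout.
sample = (
    "T cells\tT lymphocyte markers\tCD3D\tCD3E\n"
    "B cells\tB lymphocyte markers\tCD79A\tMS4A1\n"
)
for gene_set in parse_gmt(sample):
    print(gene_set["name"], gene_set["description"], gene_set["genes"])
```

Running a check like this on a file before uploading it is a quick way to catch rows where spaces were used instead of tabs, since those rows end up with too few columns.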
Text Input

The text input box allows you to enter gene sets manually. The gene sets should be in the same format as the .gmt file, with the exception that the genes can also be separated by a comma (,).
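If your gene sets already live in a script, text for this box can be generated rather than typed. The sketch below is one possible way to do that; it assumes that the name and the description stay tab-separated, as in a .gmt row, and that only the genes are joined with commas. The function name to_text_input and the example gene sets are hypothetical.

```python
def to_text_input(gene_sets):
    """Build text-box input from (name, description, genes) tuples.

    Assumption: name and description are tab-separated as in a .gmt row,
    and the genes are joined with commas.
    """
    lines = []
    for name, description, genes in gene_sets:
        lines.append(f"{name}\t{description}\t{','.join(genes)}")
    return "\n".join(lines)


# Made-up gene sets for illustration.
text = to_text_input(
    [
        ("T cells", "T lymphocyte markers", ["CD3D", "CD3E"]),
        ("B cells", "B lymphocyte markers", ["CD79A", "MS4A1"]),
    ]
)
print(text)
```

Each output line holds one gene set, so the result can be pasted into the text box as-is.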
Manual Input

The manual input form allows you to enter gene sets manually.
Click on Add Gene Set to add a new gene set.
Click on Remove to remove a gene set.
For each gene set, enter the gene set name, the gene set description, and the genes in the set. Genes should be separated by a comma (,).
From database

CytoAnalyst provides a database of gene sets that you can select from.
Click on Open gene set selection dialog to select gene sets from the database.

In this dialog:
Left panel: Lists all available gene sets in the database. You can search for gene sets by typing in the search box.
Right panel: Shows the selected gene sets.
To prevent mistakes when selecting gene sets, you can add at most 20 gene sets from the database to the collection at a time. To add more than 20 gene sets, repeat the process.
You can transform the genes in the gene sets to uppercase or lowercase by selecting the appropriate option from the dropdown menu.
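Case conversion matters because gene symbols are matched as text: human symbols are conventionally written in uppercase (for example CD3D), while mouse symbols are usually capitalized (Cd3d). The sketch below shows what an uppercase or lowercase transform does to a set of gene sets. The function name transform_case and the mode values "upper" and "lower" are assumptions for illustration and do not reflect CytoAnalyst's internal option names.

```python
def transform_case(gene_sets, mode):
    """Return a copy of gene_sets with every gene symbol converted.

    gene_sets maps a gene set name to a list of gene symbols.
    mode is "upper" or "lower" (hypothetical option names).
    """
    if mode == "upper":
        convert = str.upper
    elif mode == "lower":
        convert = str.lower
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Gene set names are left untouched; only the gene symbols change.
    return {name: [convert(g) for g in genes] for name, genes in gene_sets.items()}


# Made-up mouse-style symbols converted to uppercase.
mouse_sets = {"T cells": ["Cd3d", "Cd3e"], "B cells": ["Cd79a", "Ms4a1"]}
upper_sets = transform_case(mouse_sets, "upper")
print(upper_sets)
```

Note that changing case only rewrites the text of each symbol. It does not map genes between species, so a converted symbol is useful only if the same gene name exists in your dataset.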
Preview

After entering the gene sets, click on the Preview button to preview the gene sets.
Finally, click on the Save button to save the gene sets.
The saved gene sets will be displayed in the Existing Collections tab.
You will still be able to edit the gene sets in the collection once they are saved.
Add to an existing Gene Set Collection

You can add gene sets to an existing Gene Set Collection by clicking on the Add to Existing button.
The form will ask you to select the collection you want to add the gene sets to.
Existing Gene Set Collection
In the Existing Collections tab, you can view all the existing Gene Set Collections.
Check the following image for what can be done with an existing Gene Set Collection.

Click + or - to expand or collapse the gene sets in the collection. Once expanded, you can see the details as follows:
A form to update collection name and description.
A form to add new gene sets to the collection.
A table to view the gene sets in the collection. This table:
Shows the gene set name, description, and number of genes in the set.
Allows you to:
add genes to or remove genes from the gene set.
remove the gene set from the collection.
change the order of the gene sets in the collection.
rename the gene set and change its description.
For each gene set, you have the option to use an LLM (Large Language Model) to infer the cell type that the gene set represents. This is useful for annotating gene sets obtained from differential expression analysis that compares cells among clusters. Click Infer Cell Type on the gene set to run the inference.