Gene set analysis

Adjustment method for multiple testing

Select the method to adjust the gene set p-values for multiple testing.
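
As an illustration of what such an adjustment does, here is a minimal sketch of the Benjamini-Hochberg procedure. The methods offered by the tool are not listed in this help text, so the choice of Benjamini-Hochberg, the function name, and the gene set names and p-values are all assumptions made for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical gene set p-values, for illustration only.
gene_set_p = {"set1": 0.01, "set2": 0.04, "set3": 0.03, "set4": 0.20}
adjusted = dict(zip(gene_set_p, benjamini_hochberg(list(gene_set_p.values()))))
```

Each adjusted p-value is at least as large as the raw one, so fewer gene sets pass a fixed significance cutoff after adjustment.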

(Back to General Instructions)

Input data

To load a set of non-normalized CEL files as raw data, compress them in a .zip archive whose name contains no spaces. It is also possible to load already normalized data from a text file, in which the first column should contain the probeset IDs, which must correspond to the ones used in the annotation.

(Back to General Instructions)

Contact

Enter your e-mail address to receive the results by e-mail. This field is optional.

(Back to General Instructions)

Select comparison

Click 'Upload' to see the conditions that are assigned to each microarray. You are required to define one condition that is to be compared.

(Back to General Instructions)

GS lower limit

A vector of length two, giving the minimum and maximum gene set size (number of member genes) to be kept for the analysis.

(Back to General Instructions)

Gene level statistics

The gene-level statistics are represented by a one-column data frame (or a named vector) of numeric values (usually p-values or t-values from a differential expression analysis) with gene IDs as row names. You don't need to upload this table if you have already uploaded the microarray data and the setup file. In that case, the gene-level statistics file (p-values or t-values) is generated automatically based on the selection.

(Back to General Instructions)

Gene to geneset

The gene set collection should describe the grouping of genes into gene sets. A two-column mapping of all gene-to-gene-set associations is a simple way to load a custom gene set collection. Note that the gene names in the gene set collection have to match the gene names used for the gene-level statistics.
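
A minimal sketch of how such a two-column mapping could be read and checked against the gene-level statistics. It assumes a tab-delimited file without a header, with the gene in the first column and the gene set in the second; see the example file for the layout the tool actually expects. The function name and all gene and gene set names are hypothetical.

```python
import csv
import io
from collections import defaultdict


def read_gene_sets(handle, delimiter="\t"):
    """Group (gene, gene set) rows into {gene set: set of member genes}."""
    gene_sets = defaultdict(set)
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        gene, gene_set = row[0].strip(), row[1].strip()
        gene_sets[gene_set].add(gene)
    return dict(gene_sets)


# Hypothetical file content: one gene-to-gene-set association per line.
mapping = io.StringIO("geneA\tset1\ngeneB\tset1\ngeneB\tset2\ngeneX\tset2\n")
gene_sets = read_gene_sets(mapping)

# Hypothetical gene-level p-values; gene names must match the mapping.
gene_stats = {"geneA": 0.001, "geneB": 0.20, "geneC": 0.03}
all_mapped = {g for members in gene_sets.values() for g in members}
unmatched = all_mapped - set(gene_stats)
```

Here `unmatched` contains `geneX`, a gene that appears in the collection but has no gene-level statistic. A large unmatched set usually points to mismatched ID types.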

(Back to General Instructions)

Load data

To load a set of CEL files as raw data, compress them in a .zip archive whose name contains no spaces. It is also possible to load already normalized data from a text file, in which the first column should contain the probeset IDs and subsequent columns contain the normalized data for each sample. The first line is a header with the sample names (also used in the setup file). Avoid using sample names starting with numbers. See the example files under General Instructions.

(Back to General Instructions)

Number of permutations for P-value calculation

The number of permutations to use for gene sampling. This number is also used during an internal permutation step in the reporter features method.
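
The idea of gene sampling can be sketched as follows. This is a simplified illustration, not the tool's implementation: it assumes the gene set statistic is the mean of the gene-level statistics, that larger values are more extreme, and that one is added to numerator and denominator (a common convention that keeps the p-value above zero). All names and values are hypothetical.

```python
import random


def gene_sampling_p_value(gene_stats, gene_set, n_perm=1000, seed=0):
    """Estimate a gene set p-value by drawing random gene sets of equal size."""
    rng = random.Random(seed)
    members = [g for g in gene_set if g in gene_stats]
    observed = sum(gene_stats[g] for g in members) / len(members)
    all_values = list(gene_stats.values())
    hits = 0
    for _ in range(n_perm):
        # Draw as many genes as the gene set has, without replacement.
        sample = rng.sample(all_values, len(members))
        if sum(sample) / len(sample) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical gene-level statistics: 100 genes with scores 0..99.
gene_stats = {f"gene{i}": float(i) for i in range(100)}
top_set = [f"gene{i}" for i in range(95, 100)]
bottom_set = [f"gene{i}" for i in range(5)]
p_top = gene_sampling_p_value(gene_stats, top_set)
p_bottom = gene_sampling_p_value(gene_stats, bottom_set)
```

With this convention the smallest reachable p-value is 1 / (n_perm + 1), so more permutations give finer resolution at the cost of run time.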

(Back to General Instructions)

Input Setup

The setup file should describe the experimental setup assigning each sample to a specific condition. The format of the file should be as follows: The first column should contain the names of the CEL files (or the sample names used in the header of the normalized data file) and additional columns should assign attributes in some category to each array. See the example files under General instructions.
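
A minimal sketch of a consistency check between the two files, assuming both are tab-delimited and that the setup file has a header row naming the categories (an assumption; compare with the example files). The function names, sample names, and categories are hypothetical.

```python
import csv
import io


def read_setup(handle, delimiter="\t"):
    """Return {sample name: {category: attribute}} from a setup table."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    categories = header[1:]
    return {row[0]: dict(zip(categories, row[1:])) for row in reader if row}


def check_samples(data_header, setup):
    """List problems that break the link between data and setup files."""
    samples = data_header[1:]  # the first column holds the probeset IDs
    problems = [f"sample name starts with a number: {s}"
                for s in samples if s[:1].isdigit()]
    problems += [f"sample missing from setup file: {s}"
                 for s in samples if s not in setup]
    return problems


# Hypothetical setup file and normalized-data header.
setup_text = "sample\ttreatment\nctrl_1\tcontrol\nctrl_2\tcontrol\ndrug_1\ttreated\n"
setup = read_setup(io.StringIO(setup_text))
data_header = ["probeset", "ctrl_1", "ctrl_2", "drug_1", "2nd_drug"]
problems = check_samples(data_header, setup)
```

In this example `2nd_drug` is reported twice: once for starting with a number and once for being absent from the setup file.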

(Back to General Instructions)

P-value calculation method

Select the method to calculate the gene set p-values from the gene set statistics. Gene sampling works for all methods. Using a theoretical null distribution is only possible for Fisher's combined probability test, Stouffer's method, and the reporter features method.
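
To show what a theoretical null distribution means here, the sketch below gives the textbook forms of Fisher's combined probability test and Stouffer's method applied to the p-values of the genes in one gene set. The tool's own definitions may differ in detail (for example in how direction is handled), so treat this as an illustration. The function names are hypothetical, and every p-value must lie strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def fisher_combined_p(p_values):
    """Fisher: X = -2 * sum(ln p) follows chi-square with 2k degrees of freedom."""
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # X / 2
    # Closed-form upper tail of a chi-square with an even number (2k) of
    # degrees of freedom: exp(-X/2) * sum_{j<k} (X/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total


def stouffer_combined_p(p_values):
    """Stouffer: Z = sum(z_i) / sqrt(k) is standard normal under the null."""
    norm = NormalDist()
    z = sum(norm.inv_cdf(1.0 - p) for p in p_values) / math.sqrt(len(p_values))
    return 1.0 - norm.cdf(z)


# Hypothetical gene-level p-values for one gene set.
gene_p = [0.5, 0.5]
p_fisher = fisher_combined_p(gene_p)
p_stouffer = stouffer_combined_p(gene_p)
```

Because the null distribution is known in closed form, no permutations are needed, which is why this option is restricted to methods of this kind.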

(Back to General Instructions)

Extract filter

A list of Affymetrix attributes is retrieved for the selected organism. Select the attribute that should be used to generate the gene-to-gene-set mapping.

(Back to General Instructions)

Gene set analysis

Gene set analysis uses gene-level statistics (e.g. from microarray or RNA-seq analysis) together with a gene set collection (e.g. GO terms) in order to identify gene sets that are significantly enriched in high-scoring genes (generally differentially expressed genes). Here, different methods for calculating gene set statistics can be chosen.

(Back to General Instructions)

Gene set collection

Either a custom gene set collection can be loaded from a file or Gene Ontology Terms can be used as gene sets.

(Back to General Instructions)

Gene statistics file

This input is a tab-delimited text file of gene statistics. After upload, you are asked to select the columns that hold the p-value, t-value, and fold change. To base the gene set analysis on p-values, select the p-value and fold-change columns and set the t-value to 'NONE'; to base it on t-values, select the t-value column and set the p-value to 'NONE'. This file can also be generated from a microarray differential expression analysis. To download an example file, click General Instructions.
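
A minimal sketch of the column selection, assuming a tab-delimited file with a header row. The column names (`gene`, `pval`, `tval`, `logFC`), the function name, and the values are hypothetical and chosen only for the example.

```python
import csv
import io


def read_gene_statistics(handle, id_col, stat_col, fc_col=None):
    """Return ({gene: statistic}, {gene: fold change}) from a tab-delimited table."""
    stats, fold_changes = {}, {}
    for row in csv.DictReader(handle, delimiter="\t"):
        stats[row[id_col]] = float(row[stat_col])
        if fc_col is not None:
            fold_changes[row[id_col]] = float(row[fc_col])
    return stats, fold_changes


# Hypothetical file content with p-value, t-value, and fold-change columns.
table = io.StringIO(
    "gene\tpval\ttval\tlogFC\n"
    "geneA\t0.001\t5.2\t1.8\n"
    "geneB\t0.40\t-0.8\t-0.2\n"
)
# Analysis based on p-values: use the p-value and fold-change columns
# and leave the t-value column unused (the 'NONE' choice in the form).
p_values, fold_changes = read_gene_statistics(table, "gene", "pval", fc_col="logFC")
```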

(Back to General Instructions)

General Instructions

Click "[?]" to display instructions here, or navigate through the links below.

Tool description
Input instructions
Output description

Example files to download

Normalized data
Raw data
Setup file
Gene statistics file
Gene to geneset file
Click Load Example microarray data and gene to geneset file
OR
Click Load Genestatistics and gene to geneset file and hit Run

Input Instructions

Two inputs are required. First, gene-level statistics provide one value per gene (e.g. p-value or t-value). The fold change can additionally be provided to give information about the direction of change. Raw or pre-normalized microarray data can also be used as input, in which case the tool calculates the gene-level statistics. Second, a gene set collection is needed that assigns genes to one or several functional groups. The gene set collection can be loaded from a file or generated from Gene Ontology (GO) terms.

(Back to General Instructions)

Select input type

Either raw microarray CEL files or pre-normalized microarray data can be loaded and processed in order to generate the gene-level statistics that are required as input to the gene set analysis. Gene-level statistics generated in other ways or from other platforms (e.g. RNA-seq) can also be loaded from a text file.

(Back to General Instructions)

Gene set size

The gene sets are defined only by the genes for which data exist. The size limits make it possible to discard gene sets that are too small or too large from the analysis.
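
The two steps described above can be sketched as follows: each gene set is first trimmed to the genes that have data, and the size limits are then applied to the trimmed sets. Treating both limits as inclusive is an assumption, and the function name and gene names are hypothetical.

```python
def filter_gene_sets(gene_sets, genes_with_data, lower, upper):
    """Trim gene sets to genes with data, then keep sizes within [lower, upper]."""
    available = set(genes_with_data)
    kept = {}
    for name, members in gene_sets.items():
        present = set(members) & available
        if lower <= len(present) <= upper:
            kept[name] = present
    return kept


# Hypothetical gene sets; g9 has no gene-level statistic.
gene_sets = {
    "small": {"g1"},
    "medium": {"g1", "g2", "g9"},
    "large": {"g1", "g2", "g3", "g4"},
}
genes_with_data = ["g1", "g2", "g3", "g4"]
kept = filter_gene_sets(gene_sets, genes_with_data, lower=2, upper=3)
```

Only `medium` survives, and it is counted as having two members because `g9` has no data. The size that matters is therefore the size after trimming, not the size in the collection file.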

(Back to General Instructions)

Select organism

Select the organism for which the dataset should be fetched from BioMart.

(Back to General Instructions)

Output file description

The output of this tool is a network plot showing the differentially expressed gene sets (gene functions) and their connections, along with a table in Excel format listing the number of genes in each gene set, the gene set statistics, and their p-values (unadjusted and adjusted).

(Back to General Instructions)

Gene set statistics method

Select the method to calculate the gene set statistics.

(Back to General Instructions)


Online tools powered by piano

Citations
1) Väremo L, Nielsen J, Nookaew I (2013) Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods. Nucleic Acids Res. Apr;41(8):4378-91.