SEGS


Project Name: (optional)

Annotation data:
Molecular Functions
Biological Processes
Cellular Components
KEGG Orthology
Gene interactions

Constraints:
Number of DE genes:
Minimal set size: (min=20)

Output:
Maximal p-value:
Combine p-values: Fisher GSEA PAGE
Report top most enriched gene sets.
Summarize descriptions

Upload:
input file:




Information for the required input data and produced results

Project Name
An optional string that helps you identify the analysis you are running and distinguish it from your other analyses.

Annotation data
The user can select which of the available annotation data are used to construct the new gene sets.

Constraints
The user can specify the number of important genes (for example, differentially expressed genes) used by the Fisher exact test, and the minimal size of the constructed gene sets.
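As a rough illustration of how the number of DE genes feeds into a Fisher exact test, the following sketch picks the top-scoring genes and computes a one-sided enrichment p-value from the hypergeometric tail. The function names, the choice to rank by highest weight, and the toy data are assumptions made for this example; the page does not describe how SEGS implements the test internally.

```python
from math import comb

def top_genes(scores, n_de):
    """Return the IDs of the n_de genes with the highest weights (assumed ranking)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_de])

def fisher_enrichment_p(gene_set, de_genes, all_genes):
    """One-sided Fisher exact test: P(overlap >= observed) under the hypergeometric model."""
    population = len(all_genes)            # all genes in the analysis
    in_set = len(gene_set & all_genes)     # genes belonging to the gene set
    drawn = len(de_genes)                  # number of DE genes
    overlap = len(gene_set & de_genes)     # DE genes that are in the gene set
    upper = min(in_set, drawn)
    tail = sum(
        comb(in_set, i) * comb(population - in_set, drawn - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(population, drawn)

# Toy data: gene IDs 1..10, gene 1 has the highest weight (illustrative only).
scores = {gene_id: float(11 - gene_id) for gene_id in range(1, 11)}
de_genes = top_genes(scores, n_de=3)       # {1, 2, 3}
p_value = fisher_enrichment_p({1, 2, 3, 4}, de_genes, set(scores))
print(p_value)                             # 4/120, about 0.033
```

The toy gene set is far below the minimal set size of 20 enforced by the form; it is kept small only so the arithmetic can be checked by hand.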

Output
Select the maximal p-value that controls the statistical significance of the enriched gene sets found. If you would like to find gene sets that are significantly enriched according to more than one test, you can combine the results into one aggregate p-value by setting weights on the individual test p-values. For example, to find gene sets whose average p-value on the Fisher and PAGE tests is 0.01, set the weights to Fisher 1.0, GSEA 0.0, PAGE 1.0. The weights let the user express a preference among the enrichment tests. Note that aggregated p-values are not p-values in the classical sense; they are only used to find gene sets that have small p-values on several tests.
Summarizing the descriptions is done by merging the descriptions of similar gene sets and removing obsolete general terms. Two gene sets are similar if the size of their symmetric difference is less than 5% of the size of the smaller gene set and they have the same p-value.
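The weighting can be pictured with a small sketch. It assumes the aggregate is a weighted arithmetic mean of the individual p-values, which matches the "average p-value" example in the text but is not stated as the exact formula SEGS uses; the function name and the sample p-values are hypothetical.

```python
def aggregate_p(p_values, weights):
    """Weighted arithmetic mean of per-test p-values (assumed aggregation rule)."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one test must have a positive weight")
    weighted_sum = sum(weights[test] * p_values[test] for test in weights)
    return weighted_sum / total_weight

# Hypothetical p-values of one gene set on the three tests.
p_values = {"Fisher": 0.004, "GSEA": 0.30, "PAGE": 0.016}

# Weights from the example in the text: GSEA is ignored.
weights = {"Fisher": 1.0, "GSEA": 0.0, "PAGE": 1.0}

combined = aggregate_p(p_values, weights)
print(combined)  # approximately 0.01, the mean of the Fisher and PAGE p-values
```

With a weight of 0.0 the GSEA result has no influence, so a gene set with a poor GSEA p-value can still pass the threshold.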
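The similarity rule can be written down directly. The sketch below is only a reading of the definition given here, with a hypothetical function name and made-up gene sets; how SEGS compares p-values internally (exact equality or after rounding) is not specified, so exact equality is an assumption.

```python
def are_similar(set_a, set_b, p_a, p_b, threshold=0.05):
    """True if the symmetric difference is below 5% of the smaller set and p-values match."""
    smaller_size = min(len(set_a), len(set_b))
    difference_size = len(set_a ^ set_b)   # genes in exactly one of the two sets
    return p_a == p_b and difference_size < threshold * smaller_size

# Made-up gene sets of 100 genes each.
set_a = set(range(0, 100))
set_b = set(range(2, 102))    # differs from set_a in 4 genes
set_c = set(range(6, 106))    # differs from set_a in 12 genes

print(are_similar(set_a, set_b, 0.01, 0.01))  # True: 4 < 5% of 100
print(are_similar(set_a, set_c, 0.01, 0.01))  # False: 12 >= 5
print(are_similar(set_a, set_b, 0.01, 0.02))  # False: p-values differ
```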

Upload
SEGS supports ENTREZ gene identifiers. Any gene not found in the ENTREZ database will be ignored in the analysis. The file should be a sequence of lines, each containing a pair of numbers separated by a comma:

gene_ID_1, weight_1
gene_ID_2, weight_2
...


The weight can be, for example, the gene's Student's t-test score.
Here is an example of a valid input file.
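Before uploading, it can help to check that a file follows this format. The following sketch parses the "gene_ID, weight" lines into a dictionary; the function name and the sample values are illustrative assumptions, and the checks SEGS itself performs on upload are not documented here.

```python
import io

def parse_segs_input(lines):
    """Parse 'gene_ID, weight' lines into {gene_id: weight}, skipping blank lines."""
    scores = {}
    for line_no, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue
        parts = line.split(",")
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected 'gene_ID, weight', got {line!r}")
        gene_id = int(parts[0].strip())    # ENTREZ identifiers are integers
        weight = float(parts[1].strip())   # e.g. a t-test score
        scores[gene_id] = weight
    return scores

# Illustrative content only; IDs and weights are not taken from a real experiment.
sample = io.StringIO("7157, 2.31\n672, -1.08\n\n1956, 0.47\n")
scores = parse_segs_input(sample)
print(scores)  # {7157: 2.31, 672: -1.08, 1956: 0.47}
```

For a real file, pass an open file object instead of the `io.StringIO` sample, for example `parse_segs_input(open("genes.txt"))`, where `genes.txt` is a placeholder name.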

Results
SEGS results are shown in HTML format.
SEGS provides several output files describing the input data (the input parameters, the initial number of genes, and the number of genes used in the analysis) and the analysis results: the results of each enrichment test (Fisher, GSEA, PAGE) separately, and the results combined using the provided weights.
Results are reported for both over-expressed and under-expressed genes.

Please remember that running two experiments with identical input data does not have to give exactly the same results, because the resulting p-values are calculated by permutation testing.
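The run-to-run variation can be seen in a generic permutation test. The sketch below is not the SEGS procedure: the statistic (mean weight of the gene set), the number of permutations, the function name, and the toy data are all assumptions chosen to show why repeated runs may differ slightly.

```python
import random

def permutation_p(scores, gene_set, n_perm=1000, seed=None):
    """Estimate how often a random gene set of equal size has a mean weight >= the observed one."""
    rng = random.Random(seed)              # seed=None gives a different stream on each run
    genes = list(scores)
    members = [g for g in genes if g in gene_set]
    observed = sum(scores[g] for g in members) / len(members)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, len(members))
        if sum(scores[g] for g in sample) / len(members) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Toy data: 200 genes with weights drawn once from a fixed generator.
data_rng = random.Random(0)
scores = {gene_id: data_rng.gauss(0.0, 1.0) for gene_id in range(1, 201)}
gene_set = set(range(1, 21))

p_first = permutation_p(scores, gene_set, seed=1)
p_second = permutation_p(scores, gene_set, seed=2)
p_repeat = permutation_p(scores, gene_set, seed=1)
print(p_first, p_second, p_repeat)  # p_first equals p_repeat; p_second may differ
```

Fixing the seed reproduces an estimate exactly, while different seeds give estimates that agree only up to sampling noise. Whether SEGS exposes a seed is not stated on this page.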

If you have any technical problems using SEGS, please send an email to [igor DOT trajkovski AT ijs DOT si], attaching your input file and a short explanation of the problems you experienced.
