At the end of the week I used one of the output tables produced by R for the analysis in PathVisio. It lists the Ensembl ID of every gene that was found, set against the logFC, fold change, average expression, t value (the outcome of the statistical t-test), P-value, adjusted P-value and B value. This file is the input for the expression data import.
In PathVisio I selected the human gene database HS_Derby_20130701.bridge. I also imported the WikiPathways Homo sapiens Curation-Tutorial (a GPML file).
In PathVisio I could then create a visualization. I wanted to show the up- and down-regulated gene expression and the p-value.
Finally, I ran a statistical test on all the pathways. I wanted the pathways to be ranked by the criterion ([logFC]>0.585 OR [logFC]<-0.585) AND [P-value]<0.05.
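To make the criterion concrete, here is a minimal Python sketch of the same rule applied outside PathVisio. The gene names and values are made up for illustration, and the column names `logFC` and `P.Value` are assumptions about the R output, not taken from the actual file. The last lines show what the cutoff means on the linear scale, assuming logFC is a log2 value.

```python
LOGFC_CUTOFF = 0.585  # threshold from the criterion; roughly log2(1.5)
P_CUTOFF = 0.05


def meets_criterion(log_fc, p_value):
    """Mirror ([logFC]>0.585 OR [logFC]<-0.585) AND [P-value]<0.05."""
    return (log_fc > LOGFC_CUTOFF or log_fc < -LOGFC_CUTOFF) and p_value < P_CUTOFF


# Hypothetical rows, only to show the behaviour of the rule.
rows = [
    {"gene": "GENE_A", "logFC": 1.20, "P.Value": 0.001},   # up, significant
    {"gene": "GENE_B", "logFC": -0.70, "P.Value": 0.030},  # down, significant
    {"gene": "GENE_C", "logFC": 0.30, "P.Value": 0.001},   # change too small
    {"gene": "GENE_D", "logFC": 2.00, "P.Value": 0.200},   # not significant
]

selected = [row["gene"] for row in rows if meets_criterion(row["logFC"], row["P.Value"])]
print(selected)

# Linear fold changes that correspond to the cutoff (assuming log2 values).
print(round(2 ** LOGFC_CUTOFF, 2), round(2 ** -LOGFC_CUTOFF, 2))
```

On the linear scale the cutoff corresponds to about 1.5 times up, or down to about two thirds of the original level.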
For the comparison of the treated group with the acute malaria group, the first ten pathways that meet the criterion are shown.
Pathway | positive (r) | measured (n) | total | % | Z Score | p-value (permuted) |
RB in Cancer | 9 | 92 | 104 | 9,78% | 6,02 | 0,001 |
Neurotransmitter uptake and Metabolism In Glial Cells | 1 | 2 | 13 | 50,00% | 5,28 | 0 |
Transport of Glycerol from Adipocytes to the Liver by Aquaporins | 1 | 2 | 7 | 50,00% | 5,28 | 0,003 |
Activation of Chaperone Genes by ATF6-alpha | 2 | 8 | 16 | 25,00% | 5,1 | 0,005 |
Signal amplification | 2 | 11 | 56 | 18,18% | 4,23 | 0,003 |
Thrombin signalling through proteinase activated receptors (PARs) | 2 | 13 | 53 | 15,38% | 3,82 | 0,005 |
Activation of Matrix Metalloproteinases | 2 | 15 | 66 | 13,33% | 3,48 | 0,009 |
Adipogenesis | 7 | 122 | 132 | 5,74% | 3,47 | 0,008 |
FAS pathway and Stress induction of HSP regulation | 3 | 35 | 43 | 8,57% | 3,15 | 0,022 |
miR-targeted genes in leukocytes - TarBase | 6 | 108 | 128 | 5,56% | 3,11 | 0,006 |
For the comparison of the experimentally affected group with the baseline group, the first ten pathways that meet the criterion are shown.
Pathway | positive (r) | measured (n) | total | % | Z Score | p-value (permuted) |
Type II interferon signaling (IFNG) | 5 | 35 | 38 | 14,29% | 10,4 | 0 |
RIG-I/MDA5 mediated induction of IFN-alpha/beta pathways | 4 | 48 | 181 | 8,33% | 6,88 | 0 |
Heme Biosynthesis | 1 | 8 | 28 | 12,50% | 4,31 | 0,013 |
NOD pathway | 2 | 30 | 43 | 6,67% | 4,26 | 0,004 |
Serotonin Transporter Activity | 1 | 9 | 15 | 11,11% | 4,04 | 0,027 |
Interferon alpha/beta signaling | 2 | 34 | 96 | 5,88% | 3,95 | 0,017 |
Regulation of toll-like receptor signaling pathway | 4 | 120 | 152 | 3,33% | 3,85 | 0,004 |
Apoptosis | 3 | 80 | 85 | 3,75% | 3,62 | 0,009 |
Quercetin and Nf-kB/ AP-1 induced cell apoptosis | 1 | 11 | 25 | 9,09% | 3,61 | 0,03 |
TAK1 activates NFkB by phosphorylation and activation of IKKs complex | 1 | 11 | 30 | 9,09% | 3,61 | 0,035 |
***
positive (r) -- the number of genes on the pathway that fulfill the criterion
measured (n) -- the number of genes on the pathway that have been measured in the data set
total -- the total number of genes on the pathway
% -- the percentage of measured genes that fulfill the criterion
Z Score -- the z-score as computed by a Fisher exact test on overrepresentation
p-value (permuted) -- the change
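As a small check on the legend, the sketch below recomputes the % column from r and n. It also includes one common overrepresentation z-score (the hypergeometric, MAPPFinder-style formula); whether this is exactly what PathVisio computes is an assumption on my side. The data-set-wide totals `N` (all measured genes) and `R` (all genes that meet the criterion) do not appear in the tables, so the values used for them here are hypothetical.

```python
import math


def percent_positive(r, n):
    """Percentage of measured genes on the pathway that meet the criterion."""
    return 100.0 * r / n


def z_score(r, n, R, N):
    """Overrepresentation z-score (hypergeometric form, assumed not confirmed).

    r: genes on the pathway meeting the criterion
    n: genes on the pathway measured in the data set
    R: genes in the whole data set meeting the criterion
    N: genes measured in the whole data set
    """
    expected = n * R / N
    variance = n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
    return (r - expected) / math.sqrt(variance)


# r and n for "RB in Cancer" are taken from the first table.
print(round(percent_positive(9, 92), 2))

# N and R are hypothetical placeholders, NOT values from this analysis.
N_HYPOTHETICAL, R_HYPOTHETICAL = 10000, 200
print(round(z_score(9, 92, R_HYPOTHETICAL, N_HYPOTHETICAL), 2))
# Same number of positives, but more measured genes on the pathway.
print(round(z_score(9, 150, R_HYPOTHETICAL, N_HYPOTHETICAL), 2))
```

With r fixed, a larger n gives a lower score under this formula, so the ranking depends on how many unchanged genes a pathway contains as well as on how many changed ones.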
Next week I plan to take a closer look at these pathways and to compare the two groups (differences and similarities), to try to link the results to biological explanations.
I also want to look at the genes with a high FC and a significant p-value that were not found by PathVisio.
Please do put in links to the actual pathways with the data loaded. We say "never publish a list like this" during courses on pathway analysis for a reason. I understand that in a blog post you can show the list as part of an ongoing process, but the next step really is to look at the pathways themselves. Pathway statistics often do not make a lot of sense on their own. I can give you some reasons for that. Adding a bunch of non-regulated genes to a pathway, for instance, lowers its ranking without changing the biology. Some pathways overlap heavily and may all show up for that reason, while that really points to only one sub-process. Also, some individual steps in metabolism or regulation can be carried out by different genes. Having one active and differentially regulated gene in a step where 10 could be active is not necessarily less relevant than having 1 out of 1 in the next step. Having 2 out of 10 could be less meaningful if the next step is not regulated at all. These lists are thus *just* a link to a further interpretation of the outcome.
I was also wondering why you describe the desired minimal fold change as |logFC| > 0.585. What you really mean is a real change of 50% up or down, right? Did you try lower values? In large processes many small changes can actually make sense.