Analysis of gene expression data (Nominal explanatory variables)
Shyamal D. Peddada
Biostatistics Branch, National Inst. Environmental Health Sciences (NIH)
Research Triangle Park, NC

Outline of the talk
- Two types of explanatory variables ("experimental conditions")
- Some scientific questions of interest
- A brief discussion on false discovery rate (FDR) analysis
- Some existing statistical methods for analyzing microarray data

Types of explanatory variables ("experimental conditions")
Nominal variables:
- No intrinsic order among the levels of the explanatory variable(s).
- No loss of information if the labels of the conditions are permuted.
- E.g. comparison of gene expression of samples from "normal" tissue with those from "tumor" tissue.
Ordinal/interval variables:
- Levels of the explanatory variables are ordered.
- E.g. comparison of gene expression of samples from different stages of severity of lesions, such as "normal", "hyperplasia", "adenoma" and "carcinoma" (categorically ordered).
- Time-course/dose-response experiments (numerically ordered).
Focus of this talk: Nominal explanatory variables

Types of microarray data
- Independent samples: e.g. comparison of gene expression of independent samples drawn from normal patients versus independent samples from tumor patients.
- Dependent samples: e.g. comparison of gene expression of samples drawn from normal tissues and tumor tissues from the same patient.

Possible questions of interest
- Identify significantly up- or down-regulated genes for a given "condition" relative to another "condition" (adjusted for other covariates).
- Identify genes that discriminate between various "conditions" and predict the "class/condition" of a future observation.
- Cluster genes according to patterns of expression over "conditions".
- Other questions?

Challenges
- Small sample size but a large number of genes.
- Multiple testing: since each microarray has thousands of genes/probes, several thousand hypotheses are being tested. This impacts the overall Type I error rates.
- Complex dependence structure between genes and possibly among samples. It is difficult to model and/or account for the underlying dependence structures among genes.

Multiple Testing: Type I Errors - False Discovery Rates

The Decision Table
[table not preserved in source]
The only observable values
[not preserved in source]

Strong and weak control of type I error rates
- Strong control: control the type I error rate under any combination of true and false null hypotheses.
- Weak control: control the type I error rate only when all null hypotheses are true.
- Since we do not know a priori which hypotheses are true, we will focus on strong control of the type I error rate.

Consequences of multiple testing
- Suppose we test each hypothesis at the 5% level of significance.
- Suppose n = 10 independent tests are performed. Then, when all null hypotheses are true, the probability of declaring at least 1 of the 10 tests significant is $1 - 0.95^{10} = 0.401$.
- If 50,000 independent tests are performed, as in Affymetrix microarray data, then you should expect 2500 false positives!

Types of errors in the context of multiple testing
Here V is the number of false rejections, R is the total number of rejections, and m is the number of hypotheses tested.
- Per-Family Error "Rate" (PFER): $E(V)$. Expected number of false rejections of null hypotheses.
- Per-Comparison Error Rate (PCER): $E(V)/m$. Expected proportion of false rejections among all m hypotheses.
- Family-Wise Error Rate (FWER): $P(V > 0)$. Probability of at least one false rejection among all m hypotheses.
- False Discovery Rate (FDR): Expected proportion of Type I errors among all rejected hypotheses.
- Benjamini-Hochberg (BH): Set $V/R = 0$ if $R = 0$.
- Storey: Only interested in the case $R > 0$ (positive FDR, pFDR).

Some useful inequalities
[equations not preserved in source]

Conclusion
- It is conservative to control FWER rather than FDR!
- It is conservative to control pFDR rather than FDR!

Some useful inequalities
[equations not preserved in source]
However, in most
applications such as microarrays, one expects [expression not preserved in source]. In general, there is no proof of the statement [expression not preserved in source].

q-values versus p-values
- Suppose [hypotheses not preserved in source], and suppose we are interested in a one-sided test.
- The pFDR can be rewritten as [expression not preserved in source].
- Given the observed value of the test statistic for a given data set, the q-value is the posterior Bayesian p-value.

Some popular Type I error controlling procedures
- Let $p_{(1)} \le p_{(2)} \le \dots \le p_{(m)}$ denote the ordered p-values for the m tests that are being performed.
- Let $\alpha_1 \le \alpha_2 \le \dots \le \alpha_m$ denote the ordered levels of significance used for testing the corresponding m null hypotheses, respectively.
- Step-down procedure: [definition not preserved in source]
- Step-up procedure: [definition not preserved in source]
- Single-step procedure: a stepwise procedure with the same critical constant for all m hypotheses.

Some typical stepwise procedures: FWER controlling procedures
[formulas for the critical constants not preserved in source]
- Bonferroni: a single-step procedure.
- Sidak: a single-step procedure.
- Holm: a step-down procedure.
- Hochberg: a step-up procedure.
- minP method: a resampling-based single-step procedure whose critical constant is a quantile of the distribution of the minimum p-value.

Comments on the methods
- Bonferroni: Very general, but can be too conservative for a large number of hypotheses.
- Sidak: More powerful than Bonferroni, but applicable only when the test statistics are independent or have certain types of positive dependence.
- Holm: More powerful than Bonferroni, and applicable for any type of dependence structure between test statistics.
- Hochberg: More powerful than Holm's procedure, but the test statistics should be either independent or have the MTP2 property.
- Multivariate Total Positivity of Order 2 (MTP2)
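
The arithmetic on the "Consequences of multiple testing" slide can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and it assumes, as the slide does, that the tests are independent and that every null hypothesis is true.

```python
def prob_at_least_one_rejection(n_tests: int, alpha: float) -> float:
    """P(at least one of n independent tests is significant) when all nulls are true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of false rejections when all n nulls are true."""
    return n_tests * alpha


p_any = prob_at_least_one_rejection(10, 0.05)
n_false = expected_false_positives(50_000, 0.05)

print(f"P(at least one significant among 10 tests) = {p_any:.3f}")
print(f"Expected false positives among 50,000 tests = {n_false:.0f}")
```

The first number grows quickly with the number of tests, which is why the unadjusted 5% level is not usable for a microarray with thousands of probes.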
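
The error rates defined on the "Types of errors in the context of multiple testing" slides can be estimated by simulation. The sketch below is not taken from the talk. It assumes independent tests, uniform p-values under true nulls, and a hypothetical alternative in which p-values are drawn as U^4 (skewed toward zero); every hypothesis is tested at the unadjusted level alpha. The function name and all parameter values are illustrative.

```python
import random


def simulate_error_rates(m=100, m0=80, alpha=0.05, n_rep=2000, seed=0):
    """Monte Carlo estimates of PFER, PCER, FWER, FDR and pFDR.

    m  : number of hypotheses, m0 : number of true nulls (both assumed values).
    V  : false rejections, R : all rejections, counted per replicate.
    """
    rng = random.Random(seed)
    sum_v = 0          # running total of V
    n_v_pos = 0        # replicates with V > 0
    sum_fdp = 0.0      # running total of V/R (taken as 0 when R = 0)
    n_r_pos = 0        # replicates with R > 0
    for _ in range(n_rep):
        # True nulls: p ~ Uniform(0, 1), so each is rejected with probability alpha.
        v = sum(rng.random() <= alpha for _ in range(m0))
        # False nulls: p = U**4, an assumed alternative with small p-values.
        s = sum(rng.random() ** 4 <= alpha for _ in range(m - m0))
        r = v + s
        sum_v += v
        n_v_pos += v > 0
        if r > 0:
            sum_fdp += v / r
            n_r_pos += 1
    pfer = sum_v / n_rep
    return {
        "PFER": pfer,                       # E(V)
        "PCER": pfer / m,                   # E(V)/m
        "FWER": n_v_pos / n_rep,            # P(V > 0)
        "FDR": sum_fdp / n_rep,             # E(V/R), with V/R = 0 if R = 0
        "pFDR": sum_fdp / n_r_pos if n_r_pos else float("nan"),  # E(V/R | R > 0)
    }


rates = simulate_error_rates()
for name, value in rates.items():
    print(f"{name}: {value:.4f}")
```

Because V/m ≤ 1{V > 0} ≤ V and V/R ≤ 1{V > 0} hold in every replicate, the estimates always satisfy PCER ≤ FWER ≤ PFER and FDR ≤ FWER, and FDR ≤ pFDR follows from dividing by P(R > 0). This is consistent with the slide's conclusion that controlling FWER or pFDR is conservative relative to controlling FDR.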
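
The step-down, step-up and single-step procedures named on the slides can be sketched as follows. The formulas for the critical constants did not survive in this copy of the slides, so the code uses the standard textbook forms as an assumption: Bonferroni α/m, Sidak 1 − (1 − α)^(1/m), and α/(m − i + 1) for both Holm (step-down) and Hochberg (step-up). The function names are illustrative, and the resampling-based minP method is not covered.

```python
from typing import List


def critical_constants(method: str, m: int, alpha: float) -> List[float]:
    """Critical constants alpha_1 <= ... <= alpha_m for the ordered p-values.

    Standard textbook forms, assumed here because the slide formulas are missing.
    """
    if method == "bonferroni":
        return [alpha / m] * m
    if method == "sidak":
        return [1.0 - (1.0 - alpha) ** (1.0 / m)] * m
    if method in ("holm", "hochberg"):
        # alpha / (m - i + 1) for i = 1..m (the loop index below is zero-based)
        return [alpha / (m - i) for i in range(m)]
    raise ValueError(f"unknown method: {method}")


def stepwise_reject(pvalues: List[float], alphas: List[float], direction: str) -> List[bool]:
    """Return rejection decisions in the original order of `pvalues`."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    sorted_p = [pvalues[i] for i in order]
    k = 0  # number of hypotheses rejected, counted from the smallest p-value
    if direction == "down":
        # Start at the smallest p-value; stop at the first non-rejection.
        while k < m and sorted_p[k] <= alphas[k]:
            k += 1
    elif direction == "up":
        # Start at the largest p-value; the first rejection also rejects all smaller ones.
        for i in range(m - 1, -1, -1):
            if sorted_p[i] <= alphas[i]:
                k = i + 1
                break
    else:
        raise ValueError(f"unknown direction: {direction}")
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject


def run_procedure(method: str, pvalues: List[float], alpha: float = 0.05) -> List[bool]:
    """Apply one of the four FWER procedures to a list of p-values."""
    direction = "up" if method == "hochberg" else "down"
    alphas = critical_constants(method, len(pvalues), alpha)
    return stepwise_reject(pvalues, alphas, direction)


example_a = [0.04, 0.001, 0.02]   # made-up p-values
example_b = [0.02, 0.03, 0.04]    # made-up p-values
for method in ("bonferroni", "sidak", "holm", "hochberg"):
    print(method, run_procedure(method, example_a), run_procedure(method, example_b))
```

With a constant critical value the step-down loop reduces to a single-step rule, which matches the slide's description of a single-step procedure as a stepwise procedure with the same constant for all m hypotheses. On the first made-up example Holm rejects more hypotheses than Bonferroni, and on the second Hochberg rejects all three while Holm rejects none, in line with the power ordering stated under "Comments on the methods".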