BIG Biomedicine and the Foundations of BIG Data Analysis.ppt
BIG Biomedicine and the Foundations of BIG Data Analysis
Michael W. Mahoney
ICSI and Dept of Statistics, UC Berkeley
May 2014
(For more info, see: http://www.stat.berkeley.edu/mmahoney)

Insiders vs outsiders views (1 of 2)

Ques: Genetics vs molecular biology vs biochemistry vs biophysics: What's the difference?
Answer: Not much (if you are a “methods” person*):
  they are all biology
  you get data from any of those areas, ignoring important domain details, and evaluate your method qua method
  your reviewers evaluate the methods and don't care about the science.
*E.g., one who self-identifies as doing data analysis or machine learning or statistics or theory of algorithms or artificial intelligence or ...

Insiders vs outsiders views (2 of 2)

Ques: Data analysis vs machine learning vs statistics vs theory of algorithms vs artificial intelligence (vs scientific computing vs computational mathematics vs databases ...): What's the difference?
Answer: Not much (if you are a “science” person*):
  they are all just tools
  you get a tool from any of those areas and bury details in a methods section
  your reviewers evaluate the science and don't care about the methods.
*E.g., one who self-identifies as doing genetics or molecular biology or biochemistry or biophysics or ...

BIG data? MASSIVE data?

NYT, Feb 11, 2012: “The Age of Big Data”
“What is Big Data? A meme and a marketing term, for sure, but also shorthand for advancing trends in technology that open the door to a new approach to understanding the world and making decisions.”
Why are big data big?
  Generate data at different places/times and different resolutions
  Factor of 10 more data is not just more data, but different data

Thinking about large-scale data

Data generation is the modern version of the microscope/telescope:
  See things we couldn't see before: e.g., fine-scale movement of people; fine-scale clicks and interests; fine-scale tracking of packages; fine-scale measurements of temperature, chemicals, etc.
  Those inventions ushered in new scientific eras, new understanding of the world, and new technologies to do stuff
Easy things become hard and hard things become easy:
  Easier to see the other side of the universe than the bottom of the ocean
  Means, sums, medians, correlations are easy with small data
Our ability to generate data far exceeds our ability to extract insight from data.

How do we view BIG data?

Algorithmic vs. Statistical Perspectives

Computer Scientists
  Data: are a record of everything that happened.
  Goal: process the data to find interesting patterns and associations.
  Methodology: Develop approximation algorithms under different models of data access, since the goal is typically computationally hard.
Statisticians (and Natural Scientists)
  Data: are a particular random instantiation of an underlying process describing unobserved patterns in the world.
  Goal: is to extract information about the world from noisy data.
  Methodology: Make inferences (perhaps about unseen events) by positing a model that describes the random variability of the data around the deterministic model.
Lambert (2000), Mahoney (2010)

SNPs

Single Nucleotide Polymorphisms: the most common type of genetic variation in the genome across different individuals.
They are known locations in the human genome where two alternate nucleotide bases (alleles) are observed (out of A, C, G, T).

Genotype table (rows: individuals; columns: SNPs):

AG CT GT GG CT CC CC CC CC AG AG AG AG AG AA CT AA GG GG CC GG AG CG AC CC AA CC AA GG TT AG CT CG CG CG AT CT CT AG CT AG GG GT GA AG
GG TT TT GG TT CC CC CC CC GG AA AG AG AG AA CT AA GG GG CC GG AA GG AA CC AA CC AA GG TT AA TT GG GG GG TT TT CC GG TT GG GG TT GG AA
GG TT TT GG TT CC CC CC CC GG AA AG AG AA AG CT AA GG GG CC AG AG CG AC CC AA CC AA GG TT AG CT CG CG CG AT CT CT AG CT AG GG GT GA AG
GG TT TT GG TT CC CC CC CC GG AA AG AG AG AA CC GG AA CC CC AG GG CC AC CC AA CG AA GG TT AG CT CG CG CG AT CT CT AG CT AG GT GT GA AG
GG TT TT GG TT CC CC CC CC GG AA GG GG GG AA CT AA GG GG CT GG AA CC AC CG AA CC AA GG TT GG CC CG CG CG AT CT CT AG CT AG GG TT GG AA
GG TT TT GG TT CC CC CG CC AG AG AG AG AG AA CT AA GG GG CT GG AG CC CC CG AA CC AA GT TT AG CT CG CG CG AT CT CT AG CT AG GG TT GG AA
GG TT
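
A genotype table like the one above is usually converted to numbers before any algorithmic or statistical method is applied. The sketch below shows one common way this could be done; the slides in this section do not state an encoding, so the -1/0/+1 convention, the choice of the most frequent allele as the reference, and the function names `encode_snp_column` and `encode_matrix` are all assumptions made for illustration. It also assumes every SNP is biallelic and has no missing genotypes.

```python
from collections import Counter


def encode_snp_column(genotypes):
    """Encode one SNP (one column of the table) as -1 / 0 / +1.

    Assumed convention (not taken from the slides): the most frequent
    allele in the column is the reference; a genotype is coded as
    (number of reference alleles) - 1, so two copies -> +1,
    one copy -> 0, no copies -> -1.
    """
    allele_counts = Counter("".join(genotypes))
    # Most frequent allele first; ties are broken alphabetically.
    reference = sorted(allele_counts, key=lambda a: (-allele_counts[a], a))[0]
    return [g.count(reference) - 1 for g in genotypes]


def encode_matrix(rows):
    """Encode an individuals-by-SNPs table of genotype strings."""
    columns = list(zip(*rows))                        # one tuple per SNP
    encoded_columns = [encode_snp_column(c) for c in columns]
    return [list(r) for r in zip(*encoded_columns)]   # back to one row per individual


# First five SNPs of the first three individuals in the table above.
sample = [
    ["AG", "CT", "GT", "GG", "CT"],
    ["GG", "TT", "TT", "GG", "TT"],
    ["GG", "TT", "TT", "GG", "TT"],
]
print(encode_matrix(sample))
# -> [[0, 0, 0, 1, 0], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1]]
```

Counting alleles rather than comparing strings means that "AG" and "GA" receive the same code, which matters here because the table contains both spellings.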