
    Algorithmic Analysis of Human DNA Replication Timing from .ppt


Slide 1: Algorithmic Analysis of Human DNA Replication Timing from Discrete Microarray Data
Christopher Taylor
Advisors: Gabriel Robins & Anindya Dutta

Slide 2: Thesis Statement
The DNA replication timing profile can be reconstructed efficiently and accurately from discrete time points.

Slide 3: Presentation Outline
  • Biology background
  • Microarray technology
  • Experimental data
  • Challenges
  • Algorithms
  • Research Plans
      • Replication timing
      • Origins
      • Scale up

Slide 4: Why Study DNA Replication?
  • Natural Science
      • DNA is the blueprint for organisms
      • It must be passed on (organism, cell)
  • Engineering
      • Gene therapy
      • Insertion, deletion, modification
  • Cancer is unchecked replication

Slide 5
  ... A G G T C G A C A C ...
  ... T C C A G C T G T G ...
  • Human genome: 3 billion bp
  • Replication rate: 1000 bp/min
  • Serial replication: 5.7 years
  • In practice: 6 to 10 hours (speedup of about 5000)

Slide 6: Background
  • Prokaryotes
      • E. coli: DnaA binds to oriC
  • Eukaryotes
      • ORC
      • S. cerevisiae (yeast)
          • ARS, 11 bp consensus
          • Mapping of origins
      • Human
          • No known consensus
          • Few origins characterized

Slide 7: Genome Tiling Microarrays
  [Figure: a genomic sequence tiled by probes, with the 25 bp PM probe GAGTACATAGCATACCATGACTAGA and the corresponding MM probe (labeled A)]
  • Interrogation at genomic scale
  • Large increase in data
  • Microarray data analysis
  • Array of probes tiles genome
  • Cross-hybridization
  • Repeats not tiled
  • Gaps in genome

Slide 8
  • Image analysis computes intensity of each array probe

Slide 9: The Cell Cycle
  [Figure: cell cycle diagram; labels: Start of S-phase (0 hour), S-Phase]

Slide 10: Profiling DNA Replication Timing
  • Ideal: f(chr, bp) = rtime
  • Isolate DNA replicated in discrete parts of S-phase
  • One cell is not enough
      • Synchronize S-phase entry
          • Apply drugs
          • Release together
          • Synchronization error
  • Label in two hour intervals
  • Allelic Variation
      • mf(chr, bp) = rtime1, rtime2, ...

Slide 11: Allelic Variation
  • Fluorescent in-situ Hybridization (FISH)
  • Replication timing at a given site
  [Figure: two timelines, each marked 0hr, 2hr, 4hr, 6hr, 8hr, 10hr; panels: Temporally specific replication (TS), Temporally non-specific replication (TNS)]

Slide 12: What is the Problem?
  Reconstruct a continuous replication profile
  • Temporally (time points)
  • Spatially (probes)
  from noisy data
  • Biological experiments
  • Synchronization error
  • Microarray artifacts
  efficiently
  • Genomic data (3 billion bp)

Slide 13: Initial Analysis
  • Tiling Analysis Software (TAS)
  • Wilcoxon Rank Sum test in sliding window
  • Assess enrichment of treatment over control
  • Window slides to get p-value for each probe
  • O(kn) time complexity
      • n = # probes on array
      • k = # probes in a window
      • k scales linearly with window size

Slide 14: New Analysis
  Thesis Statement (revisited): The DNA replication timing profile can be reconstructed efficiently and accurately from discrete time points.
  • Incorporate information from all time points
  • Continuous view of replication timing (TR50)
  • Address temporally non-specific replication
  • Scale up to the whole genome efficiently

Slide 15: Allelic Variation Examples
  • Temporally specific replication: signal fractions 0, 0, 1/1, 0, 0 along the time axis 0, 2, 4, 6, 8, 10; TR50 = 5
  • Temporally non-specific replication: signal fractions 1/6, 1/6, 1/3, 0, 1/3 along the same time axis; TR50 = 5
  • Challenge: From distribution of array signal, determine replication category.

Slide 16: Temporal Specificity Algorithm
  // Is there evidence that all alleles are replicating together?
  If (max sum of two adjacent time points [comparison operator missing in source] 5/6 * total sum)
      then probe is temporally specific
  // Is at least one allele replicating apart from the majority?
  Else If (max sum of two adjacent time points not including the maximum time point [comparison operator missing in source] 1/3 * total sum)
      then probe is temporally non-specific
  // Isolated signal is not strong enough to be an allele.
  Else probe is temporally specific

Slide 17: Plotting TR50
  [Plot: TR50 (hours), axis ticks 2 to 8, against chromosomal position (in millions of bp), 33 to 34]
  • Smoothed TR50 curve recovers replication pattern
  • Local minima: possible locations of replication origin

Slide 18: Segregation Algorithm
  • Sliding window passes over probes to generate intervals
  • Ratio of TSP to TNSP determines temporal specificity
  • Average TR50 determines timing category

Slide 19: Research Plan: Profile Generation
  Parameters to evaluate:
  • Segregation Algorithm: sliding window size, minimum probe density
  • Join Intervals: minimum interval size

Slide 20: Evaluation
  • Concordance of biological phenomena
      • Segregation intervals: FISH
      • STR50 local minima: other origin methods
  • Correlation with other biological data
      • Gene density: early replication
      • AT content: late replication
      • Gene expression: early replication
      • Activating acetylation/methylation: early replication
  • Performance on random data
      • Large quantity of TNS replication

Slide 21: Research Plan: Replication Origins
  • Drive DNA replication pattern
  • Smoothed TR50 local minima
      • Cleaned up with new profiles
  • Other biological assays
      • Early labeling fragments
      • Nascent strands
      • Bubble trapping
      • ORC binding

Slide 22: Approach and Evaluation
  • Correlation between methods
      • Consensus sets
  • Motif analysis
  • Positional attributes
      • Replication timing
      • Proximity to genes
  • Evaluation is difficult (few validated origins)
      • Agreement between methods
      • Testing proposed correlations
  • Paper in preparation

Slide 23: Scaling Up to Whole Genome
  • Pilot 1% to 100% of human genome
  • Algorithms developed with scalability in mind
      • Incremental update sliding windows
      • Linear time
  • Performance based evaluation
      • If 100% data available: profile multiple runs
      • Else: profile many 1% runs

Slide 24: Implementation Details
  • Java
      • Class representation of proprietary microarray files
      • Algorithms to process raw microarray data
      • Diagnostic tools
  • Perl
      • Scripts to process intermediate and final data
      • Correlations, data transformation, quality assurance
  • R statistical language
      • Smoothing, statistical plots, correlation studies
  • Shell scripts
      • Automated processing of microarray sets

Slide 25: Current/Expected Contributions
  Algorithms, Software Infrastructure, Analysis
  • Probe-by-probe TR50 analysis
  • Temporal Specificity Algorithm
      • Combinatorial analysis of allele locations
  • Segregation Algorithm
      • TNS, Early, Mid, Late replicating areas
      • Used to design validation experiments
  • Smoothed TR50 profile
      • Local minima provide candidate origin set
  • Linear algorithms enable scale up
  • Randomness testing

Slide 26: Publications
  Completed:
  • ENCODE Project Consortium. The ENCODE (ENCyclopedia Of DNA Elements) Project. Science. 2004 Oct 22; 306(5696):636-40.
  • ENCODE Project Consortium. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. In Press, to appear in June 14, 2007 issue.
  • Karnani N., Taylor C., Malhotra A., Dutta A. Pan-S replication patterns and chromosomal domains defined by genome tiling arrays of ENCODE genomic areas. Genome Research. In Press, to appear in June 2007 issue.
  • UCSC Browser Tracks: TR50, Smoothed TR50, Local Minima, Segregation
  In Progress:
  • Multi-million dollar NIH grant for scale up to full human genome
  • Paper detailing origin methods, correlations, etc.

Slide 27: Timeline
  • Spring 2007 (present to June 20):
      • Implement proposed replication profile generation algorithms
      • Generate new profiles for existing data and evaluate against FISH
      • Collect new origin sets and continue analysis for paper completion
  • Summer 2007 (June 21 to September 21):
      • Explore correlations of new profiles with other data sets
      • Submit paper to PSB 2008 based on new method and results
      • Develop random data sets to test profile generation algorithms
  • Fall 2007 (September 22 to December 21):
      • Evaluate performance for scale up to whole genome
      • Tie up loose ends and begin writing the dissertation
  • Winter 2007-2008 (December 22 to March 19):
      • Finish dissertation and schedule defense before May 2008

Slide 28: Acknowledgements
  • Advising: Anindya Dutta, Gabriel Robins
  • Biological Experiments: Neerja Karnani, Patrick Boyle, Larry Mesner, Jamie Teer, Hakkyun Kim
  • Collaborative Analysis: Ankit Malhotra
  • Discussions of Analysis: Stefan Bekiranov

Slide 29: THE END

Slide 30: Why is this work computer science?
  • Fred Brooks, The Computer Scientist as Toolsmith II: "Hitching our research to someone else's driving problems, and solving those problems on the owners' terms, leads us to richer computer science research."
  • Not an incremental improvement
      • Algorithmic techniques and analysis used to solve a problem previously addressed inadequately with a statistical approach that performed poorly
  • Collaboration outside of engineering disciplines enhances visibility, funding opportunities, and demand for CS work
  • Developed algorithms, time complexity analysis, combinatorial analysis, feedback to experimental design

Slide 31: Will this work lead to any CS publications?
  • The Nature article focused on analysis of the biological data and includes descriptions of some of my algorithms
  • The Genome Research paper and origins paper will also contain writeups of my algorithms and analysis techniques
  • The Pacific Symposium on Biocomputing focuses on algorithms and computational techniques

Slide 32: Isn't your approach too simple?
  The approach isn't simple:
  • Combinatorial analysis
  • Temporal specificity algorithm (many iterations)
  • Probewise computation to deal with binding affinity
  • Incremental updating sliding windows
  • Cross-hybridization
  • Synchronization error
  • Smoothing
  • Parameterization
  • Linear algorithms for scale up

Slide 33: Can't your algorithm be replaced by a well-known statistical method?
  • HMMs were used for segregation of intervals
      • Performed poorly in comparison to my algorithm
      • Less accurate categorization of replication intervals
      • Prone to rapid oscillation, producing tiny intervals
      • Parameterization was difficult
  • Lowess smoothing is a statistical method
      • Parameterization was not easy

Slide 34: What are the biggest challenges in this work?
  • Noise!
      • The data to analyze comes from biological experiments with several sources of noise that compound upon one another
  • Biology
      • I haven't had a course in biology since 10th grade
  • Microarrays
      • New, evolving technology we're still learning to deal with
  • Data size
      • Hundreds of GB of data to process
      • Replicates, failed experiments
      • Algorithms must be efficient

Slide 35: What kind of career are you aiming for after graduation, and why?
  • Teaching Computer Science (Small College)
      • I enjoyed learning in my undergraduate curriculum with meaningful interactions with professors
      • I taught Discrete Math at UVa in Fall 02 and Spring 03
      • Enjoyable, but 60-70 students too large
  • Post-doctoral (Biological Computing)
      • Many opportunities around the world
      • Further exploration of the field

Slide 36: How will you know when your work/thesis is done?
  • Research is never really done, but you have to declare victory at some point
  • The replication profiling algorithms I've developed already perform quite well
  • I have concrete plans to improve and finalize them
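
Worked example for the figures on slide 5 (3 billion bp, 1000 bp/min, 5.7 years, speedup of 5000). The sketch below only redoes the arithmetic; the constant and function names are illustrative and do not come from the deck.

```python
GENOME_BP = 3_000_000_000   # human genome size quoted on the slide
RATE_BP_PER_MIN = 1000      # replication rate quoted on the slide
SPEEDUP = 5000              # speedup quoted on the slide


def serial_replication_minutes(genome_bp, rate_bp_per_min):
    """Minutes needed if the genome were copied serially, end to end."""
    return genome_bp / rate_bp_per_min


minutes = serial_replication_minutes(GENOME_BP, RATE_BP_PER_MIN)
years = minutes / (60 * 24 * 365)          # minutes in a 365-day year
hours_with_speedup = minutes / SPEEDUP / 60

print(f"serial replication: {years:.1f} years")
print(f"with {SPEEDUP}x speedup: {hours_with_speedup:.0f} hours")
```

Serial copying takes 3,000,000 minutes, which is about 5.7 years. Dividing by 5000 gives 600 minutes, the upper end of the 6 to 10 hour range on the slide.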
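
Worked example for TR50 (slides 14 and 15). The deck does not spell out how TR50 is computed. The sketch below assumes it is the time at which half of a probe's total signal has accumulated, that the five signal fractions correspond to the two-hour labeling intervals between the axis ticks 0 to 10, and that signal is spread evenly within an interval. Under those assumptions both slide 15 examples give TR50 = 5. The function name `tr50` and its parameters are hypothetical.

```python
from fractions import Fraction


def tr50(signal, interval_hours=2, start_hour=0):
    """Time at which half of the total signal has accumulated.

    Assumption (not stated in the deck): signal[i] covers the interval
    starting at start_hour + i * interval_hours and accumulates linearly
    inside that interval.
    """
    total = sum(signal)
    if total <= 0:
        raise ValueError("probe has no signal")
    half = Fraction(total) / 2
    cumulative = 0
    for i, value in enumerate(signal):
        if value > 0 and cumulative + value >= half:
            left_edge = start_hour + i * interval_hours
            # Linear interpolation inside the interval that crosses 50%.
            return left_edge + interval_hours * (half - cumulative) / value
        cumulative += value


ts_example = [0, 0, 1, 0, 0]
tns_example = [Fraction(1, 6), Fraction(1, 6), Fraction(1, 3), 0, Fraction(1, 3)]

print(tr50(ts_example))
print(tr50(tns_example))
```

Both calls return 5, which shows why TR50 alone cannot separate the two categories and a separate specificity test is needed.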
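
A minimal sketch of the Temporal Specificity Algorithm on slide 16. The comparison operators were lost in the extracted text; `>=` is assumed here for both tests, since that is the reading under which the slide 15 non-specific example (1/6, 1/6, 1/3, 0, 1/3) is classified as TNS. When several time points tie for the maximum, the first one is treated as "the maximum time point", which is also an assumption. The function name `classify_probe` is hypothetical.

```python
from fractions import Fraction


def classify_probe(signal):
    """Return "TS" or "TNS" for one probe's per-interval signal."""
    total = sum(signal)
    if total <= 0:
        raise ValueError("probe has no signal")
    # Sum of every pair of adjacent time points, keyed by the left index.
    pairs = [(i, signal[i] + signal[i + 1]) for i in range(len(signal) - 1)]

    # Is there evidence that all alleles are replicating together?
    if max(s for _, s in pairs) >= Fraction(5, 6) * total:
        return "TS"

    # Is at least one allele replicating apart from the majority?
    peak = signal.index(max(signal))  # first maximum if there are ties
    away_from_peak = [s for i, s in pairs if peak not in (i, i + 1)]
    if away_from_peak and max(away_from_peak) >= Fraction(1, 3) * total:
        return "TNS"

    # Isolated signal is not strong enough to be an allele.
    return "TS"


print(classify_probe([0, 0, 1, 0, 0]))
print(classify_probe([Fraction(1, 6), Fraction(1, 6), Fraction(1, 3), 0, Fraction(1, 3)]))
```

Exact fractions are used so that a value sitting exactly on the 1/3 threshold is not misclassified by floating-point rounding.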
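
Illustration for slide 17 (local minima of the smoothed TR50 curve as candidate origin locations). The deck mentions Lowess smoothing done in R; the sketch below swaps in a plain centered moving average as a stand-in so that it stays dependency-free, and it treats probes as evenly spaced. The function names and the sample values are hypothetical.

```python
def moving_average(values, half_width):
    """Centered moving average; windows are truncated at the ends."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def local_minima(values):
    """Indices strictly lower than both neighbours."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i] < values[i - 1] and values[i] < values[i + 1]
    ]


# Made-up per-probe TR50 values (hours) in chromosomal order.
tr50_by_probe = [6, 5, 4, 5, 6, 5, 3, 5, 6]
smoothed = moving_average(tr50_by_probe, half_width=1)
candidate_origins = local_minima(smoothed)
print(candidate_origins)
```

Early-replicating probes have low TR50, so a dip in the smoothed curve marks a region that replicates before its neighbours, which is the reasoning behind reading minima as possible origins.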
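
A sketch of the sliding-window idea behind the Segregation Algorithm (slide 18) combined with the incremental window update from slide 23. The deck gives no thresholds, so `tns_fraction`, `early_before` and `late_after` are placeholders, the window is measured in probes instead of base pairs, and the share of TNS probes in the window stands in for the TSP to TNSP ratio. All names are hypothetical.

```python
from itertools import groupby


def label_window(mean_tr50, tns_share, tns_fraction, early_before, late_after):
    """Label one window; thresholds are illustrative placeholders."""
    if tns_share > tns_fraction:
        return "TNS"
    if mean_tr50 < early_before:
        return "Early"
    if mean_tr50 > late_after:
        return "Late"
    return "Mid"


def segregate(probes, window, tns_fraction=0.5, early_before=4.0, late_after=6.0):
    """probes: list of (tr50_hours, is_tns) in chromosomal order.

    Running totals are updated by adding the entering probe and removing
    the leaving one, so the pass is O(n) instead of O(kn).
    """
    if window <= 0 or window > len(probes):
        raise ValueError("window must be between 1 and len(probes)")
    labels = []
    tr50_sum = 0.0
    tns_count = 0
    for i, (tr50, is_tns) in enumerate(probes):
        tr50_sum += tr50
        tns_count += is_tns
        if i >= window:
            old_tr50, old_tns = probes[i - window]
            tr50_sum -= old_tr50
            tns_count -= old_tns
        if i >= window - 1:
            labels.append(
                label_window(tr50_sum / window, tns_count / window,
                             tns_fraction, early_before, late_after)
            )
    return labels


def join_runs(labels):
    """Collapse consecutive equal labels into (label, first, last) runs."""
    runs, start = [], 0
    for label, group in groupby(labels):
        length = len(list(group))
        runs.append((label, start, start + length - 1))
        start += length
    return runs


probes = [(2.0, False), (3.0, False), (5.0, False), (5.0, True),
          (7.0, True), (8.0, True), (8.0, False), (9.0, False)]
labels = segregate(probes, window=3)
print(join_runs(labels))
```

The per-window cost is constant regardless of window size, which is the property slide 23 relies on for whole-genome scale up. A minimum interval size, as listed under Join Intervals on slide 19, could be enforced on the output of `join_runs`.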

