A Practical Approach to Exploiting Coarse-Grained Pipeline.ppt
A Practical Approach to Exploiting Coarse-Grained Pipeline Parallelism in C Programs
William Thies, Vikram Chandrasekhar, Saman Amarasinghe
Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology
MICRO 40, December 4, 2007

Legacy Code
- 310 billion lines of legacy code in industry today
- 60-80% of typical IT budget spent re-engineering legacy code (Source: Gartner Group)
- Now code must be migrated to multicore machines
- Current best practice: manual translation

Parallelization: Man vs. Compiler
- Can we improve compilers by making them more human?

Humanizing Compilers
- Current: An Omnipotent Being (pictured: Zeus)
- New: An Expert Programmer (pictured: Richard Stallman)
- First step: change our expectations of correctness

Humanizing Compilers
- First step: change our expectations of correctness
- Second step: use compilers differently
  - Option A: Treat them like a programmer
    - Transformations distrusted, subject to test
    - Compiler must examine failures and fix them
  - Option B: Treat them like a tool
    - Make suggestions to programmer
    - Assist programmers in understanding high-level structure
- How does this change the problem?
  - Can utilize unsound but useful information
  - In this talk: utilize dynamic analysis

Dynamic Analysis for Extracting Coarse-Grained Parallelism from C
- Focus on stream programs
  - Audio, video, DSP, networking, and cryptographic processing kernels
  - Regular communication patterns
- Static analysis complex or intractable
  - Potential aliasing (pointer arithmetic, function pointers, etc.)
  - Heap manipulation (e.g., Huffman tree)
  - Circular buffers (modulo ops)
  - Correlated input parameters
- Dynamic analysis promising
  - Observe flow of data
  - Very few variations at runtime
- Opportunity for dynamic analysis
  - If flow of data is very stable, can infer it with a small sample
[Figure: stream graph with nodes AtoD, FMDemod, Scatter, LPF1, LPF2, LPF3, HPF1, HPF2, HPF3, Gather, Adder, Speaker]

Overview of Our Approach
[Figure: flowchart]
- Original Program
- Mark Potential Actor Boundaries (yields Annotated Program)
- Run Dynamic Analysis
- Satisfied with Parallelism?
  - No: mark the boundaries again
  - Yes: either communicate data by hand (Hand Parallelized Program) or communicate based on trace (Auto Parallelized Program)
- Test and refine using multiple inputs

MPEG-2 Decoder
[Figure: MPEG-2 decoder]

Stability of MPEG-2
- Inputs: Top 10 YouTube Videos

Stability of MPEG-2 (Within an Execution)
[Figure: plot; x-axis: Frame]

Stability of MPEG-2 (Across Executions)
[Table: minimum number of training iterations (frames) needed on each video in order to correctly decode the other videos.]
- Five frames of training on one video are sufficient to correctly parallelize any other video

Stability of MP3 (Across Executions)
[Table: minimum number of training iterations (frames) needed on each track in order to correctly decode the other tracks.]
- Annotations on the table: Layer 1 frames; CRC Error
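The "minimum number of training iterations" measured on the stability slides can be illustrated with a small sketch in C. This is not the authors' tool. It assumes a hypothetical encoding in which each loop iteration (frame) of a run is summarized as a bit set of producer-to-consumer edges between candidate pipeline stages, and it asks how many leading iterations of a training run must be observed before every edge used by a second run has been seen. The names edge_set, edge, learn, min_training_iterations, and MAX_STAGES are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit on candidate pipeline stages: 8 * 8 edges fit in 64 bits. */
#define MAX_STAGES 8

/* One bit per (producer, consumer) pair observed during an iteration. */
typedef uint64_t edge_set;

/* Bit for "stage `producer` wrote a value that stage `consumer` read". */
edge_set edge(unsigned producer, unsigned consumer)
{
    assert(producer < MAX_STAGES && consumer < MAX_STAGES);
    return (edge_set)1 << (producer * MAX_STAGES + consumer);
}

/* Union of all edges observed in the first n iterations of a trace. */
edge_set learn(const edge_set *trace, size_t n)
{
    edge_set seen = 0;
    for (size_t i = 0; i < n; i++)
        seen |= trace[i];
    return seen;
}

/*
 * Smallest k such that the edges learned from the first k training
 * iterations cover every edge that appears anywhere in the test trace.
 * Returns train_len + 1 if even the full training trace is not enough.
 */
size_t min_training_iterations(const edge_set *train, size_t train_len,
                               const edge_set *test, size_t test_len)
{
    edge_set needed = learn(test, test_len);
    edge_set seen = 0;
    for (size_t k = 0; k <= train_len; k++) {
        if ((needed & ~seen) == 0)
            return k;               /* first k iterations already cover the test run */
        if (k < train_len)
            seen |= train[k];       /* observe one more training iteration */
    }
    return train_len + 1;           /* test run used an edge never seen in training */
}
```

In this sketch, a result no larger than the length of the training trace means that a prefix of that run revealed every dependence the other run exercised. A result of train_len + 1 flags an input whose data flow was never observed, which is the case the approach handles by testing and refining on multiple inputs.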
Link: http://www.mydoc123.com/p-373173.html