欢迎来到麦多课文档分享! | 帮助中心 海量文档,免费浏览,给你所需,享你所想!
麦多课文档分享
全部分类
  • 标准规范>
  • 教学课件>
  • 考试资料>
  • 办公文档>
  • 学术论文>
  • 行业资料>
  • 易语言源码>
  • ImageVerifierCode 换一换
    首页 麦多课文档分享 > 资源分类 > PPT文档下载
    分享到微信 分享到微博 分享到QQ空间

    A Practical Approach toExploiting Coarse-GrainedPipeline .ppt

    • 资源ID:373173       资源大小:1.85MB        全文页数:50页
    • 资源格式: PPT        下载积分:2000积分
    快捷下载 游客一键下载
    账号登录下载
    微信登录下载
    二维码
    微信扫一扫登录
    下载资源需要2000积分(如需开发票,请勿充值!)
    邮箱/手机:
    温馨提示:
    如需开发票,请勿充值!快捷下载时,用户名和密码都是您填写的邮箱或者手机号,方便查询和重复下载(系统自动生成)。
    如需开发票,请勿充值!如填写123,账号就是123,密码也是123。
    支付方式: 支付宝扫码支付    微信扫码支付   
    验证码:   换一换

    加入VIP,交流精品资源
     
    账号:
    密码:
    验证码:   换一换
      忘记密码?
        
    友情提示
    2、PDF文件下载后,可能会被浏览器默认打开,此种情况可以点击浏览器菜单,保存网页到桌面,就可以正常下载了。
    3、本站不支持迅雷下载,请使用电脑自带的IE浏览器,或者360浏览器、谷歌浏览器下载即可。
    4、本站资源下载后的文档和图纸-无水印,预览文档经过压缩,下载后原文更清晰。
    5、试题试卷类文档,如果标题没有明确说明有答案则都视为没有答案,请知晓。

    A Practical Approach toExploiting Coarse-GrainedPipeline .ppt

    1、A Practical Approach to Exploiting Coarse-Grained Pipeline Parallelism in C Programs William Thies, Vikram Chandrasekhar, Saman Amarasinghe,Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of TechnologyMICRO 40 December 4, 2007,Legacy Code,310 billion lines of legacy c

    2、ode in industry today 60-80% of typical IT budget spent re-engineering legacy code (Source: Gartner Group) Now code must be migrated to multicore machines Current best practice: manual translation,Parallelization: Man vs. Compiler,Parallelization: Man vs. Compiler,Parallelization: Man vs. Compiler,P

    3、arallelization: Man vs. Compiler,Parallelization: Man vs. Compiler,Parallelization: Man vs. Compiler,Parallelization: Man vs. Compiler,Parallelization: Man vs. Compiler,Can we improve compilers by making them more human?,Humanizing Compilers,Current: An Omnipotent Being,New: An Expert Programmer,Ric

    4、hard Stallman,First step: change our expectations of correctness,Zeus,Humanizing Compilers,First step: change our expectations of correctness Second step: use compilers differently Option A: Treat them like a programmer Transformations distrusted, subject to test Compiler must examine failures and f

    5、ix themOption B: Treat them like a tool Make suggestions to programmer Assist programmers in understanding high-level structure How does this change the problem? Can utilize unsound but useful information In this talk: utilize dynamic analysis,Dynamic Analysis for Extracting Coarse-Grained Paralleli

    6、sm from C,Dynamic Analysis for Extracting Coarse-Grained Parallelism from C,Focus on stream programs Audio, video, DSP, networking, and cryptographic processing kernels Regular communication patterns Static analysis complex or intractable Potential aliasing (pointer arithmetic, function pointers, et

    7、c.) Heap manipulation (e.g., Huffman tree) Circular buffers (modulo ops) Correlated input parameters Dynamic analysis promising Observe flow of data Very few variations at runtime,Adder,LPF1,LPF2,LPF3,HPF1,HPF2,HPF3,Speaker,AtoD,FMDemod,Scatter,Gather,Dynamic Analysis for Extracting Coarse-Grained P

    8、arallelism from C,Focus on stream programs Audio, video, DSP, networking, and cryptographic processing kernels Regular communication patterns Static analysis complex or intractable Potential aliasing (pointer arithmetic, function pointers, etc.) Heap manipulation (e.g., Huffman tree) Circular buffer

    9、s (modulo ops) Correlated input parameters Opportunity for dynamic analysis If flow of data is very stable, can infer it with a small sample,Adder,LPF1,LPF2,LPF3,HPF1,HPF2,HPF3,Speaker,AtoD,FMDemod,Scatter,Gather,Overview of Our Approach,Original Program,Annotated Program,Mark Potential Actor Bounda

    10、ries,Run Dynamic Analysis,No,Hand Parallelized Program,Auto Parallelized Program,Satisfied with Parallelism?,Yes,Communicate data by hand,Communicate based on trace,test and refine using multiple inputs,MPEG-2 Decoder,Stability of MPEG-2,MPEG-2 Decoder,Stability of MPEG-2,Top 10 YouTube Videos,Stabi

    11、lity of MPEG-2 (Within an Execution),Frame,Stability of MPEG-2 (Across Executions),Minimum number of training iterations (frames) needed on each video in order to correctly decode the other videos.,Stability of MPEG-2 (Across Executions),Minimum number of training iterations (frames) needed on each

    12、video in order to correctly decode the other videos.,5 frames of training on one video is sufficient to correctly parallelize any other video,Stability of MP3 (Across Executions),Minimum number of training iterations (frames) needed on each track in order to correctly decode the other tracks.,Stabil

    13、ity of MP3 (Across Executions),Minimum number of training iterations (frames) needed on each track in order to correctly decode the other tracks.,Stability of MP3 (Across Executions),Layer 1 frames,Minimum number of training iterations (frames) needed on each track in order to correctly decode the o

    14、ther tracks.,Stability of MP3 (Across Executions),CRC Error,Minimum number of training iterations (frames) needed on each track in order to correctly decode the other tracks.,Stability of MP3 (Across Executions),Minimum number of training iterations (frames) needed on each track in order to correctl

    15、y decode the other tracks.,Outline,Analysis Tool Case Studies,Outline,Analysis Tool Case Studies,Annotating Pipeline Parallelism,Programmer indicates potential actor boundaries in a long-running loop,Annotating Pipeline Parallelism,Programmer indicates potential actor boundaries in a long-running lo

    16、op,Annotating Pipeline Parallelism,Programmer indicates potential actor boundaries in a long-running loop,Annotating Pipeline Parallelism,Programmer indicates potential actor boundaries in a long-running loopServes as a fundamental API for pipeline parallelism Comparable to OpenMP for data paralleli

    17、sm Comparable to Threads for task parallelism,Legacy C Code,MP3 Decoding,while (!end_bs(. ,Dynamic Analysis,Legacy C Code,Record Who Produces / Consumes each Location,MP3 Decoding,Huffman () ,Dequantize() ,Antialias() ,Hybrid() ,Polyphase() ,out_fifo() ,while (!end_bs(. ,Build Block Diagram,Mem,Huff

    18、man(),Antialias(),Polyphase(),out_fifo(),Dynamic Analysis,Implemented Using Valgrind,Dequantize(),Hybrid(),Dequantize(),Dequantize(),Hybrid(),Huffman(),Antialias(),Polyphase(),out_fifo(),Exploiting the Parallelism,Stateless stage (data parallel),Antialias(),Polyphase(),Stateful stage (sequential),Hy

    19、brid(),Huffman(),Antialias(),Polyphase(),out_fifo(),Exploiting the Parallelism,Reorder(),Dequantize(),Antialias(),for (i=0; iN; i+) PIPELINE(); Dequantize(); PIPELINE(); . ,Polyphase(),Stateless stage (data parallel),Stateful stage (sequential),Hybrid(),Huffman(),Antialias(),Polyphase(),out_fifo(),E

    20、xploiting the Parallelism,Reorder(),Dequantize(),Antialias(),for (i=0; iN; i+) PIPELINE(N); Dequantize(); PIPELINE(); . ,Polyphase(),Stateless stage (data parallel),Stateful stage (sequential),DequantizeN(),Dequantize1(),Dequantize(),Hybrid(),Huffman(),Antialias(),Polyphase(),out_fifo(),Exploiting t

    21、he Parallelism,Antialias(),for (i=0; iN; i+) PIPELINE(N); Dequantize(); PIPELINE(); . ,Polyphase(),Stateless stage (data parallel),Stateful stage (sequential),Parallel Runtime Environment,Pipeline parallelism requires buffering between stages Two ways to implement buffering: 1. Modify original progr

    22、am to add buffers 2. Wrap original code in virtual execution environment We fork each actor into an independent process, and communicate the recorded variables via pipesRobust in the presence of aliasing Suitable to shared or distributed memory Efficient (7% communication overhead on MP3),Mem,Dequan

    23、tize(),Programmer assistance needed for: - mallocd data- nested loops- reduction vars,pipe,Outline,Analysis Tool Case Studies,Extracted Stream Graphs,Ground Moving Target Indicator (GMTI),Extracted with tool:,From GMTI specification:,Audio and Video Codecs,MP3 Decoder,MPEG-2 Decoder,SPEC Benchmarks,

    24、197.parser,256.bzip2 (compression),256.bzip2 (decompression),456.hmmer,Interactive Parallelization Process,Analysis tool exposed serializing dependences As annotated back-edges in stream graph (main.c:9 fft.c:5) How to deal with serializing dependences? 1. Rewrite code to eliminate dependence, or 2.

    25、 Instruct the tool to ignore the dependence Lesson learned: Many memory dependences can be safely ignored! Allow malloc (or free) to be called in any order (GMTI, hmmer) Allow rand() to be called in any order (hmmer) Ignore dependences on uninitialized memory (parser) Ignore ordering of demand-drive

    26、n buffer expansion (hmmer),Results,On two AMD 270 dual-core processors,Results,Profiled for 10 iterations of training data Ran for complete length of testing data Only observed unsoundness: MP3,How to Improve Soundness?,Revert to sequential version upon seeing new code (fixes MP3) Hardware support M

    27、ondriaan memory protection (Witchel et. al) Versioned memory (used by Bridges et al.) Would provide safe communication, but unsafe parallelism Rigorous testing with maximal code coverage Programmer review,Related Work,Revisiting the Sequential Programming Model for Multi-Core (Bridges et al., yester

    28、day) Same pipeline-parallel decompositions of parser, bzip2 Like commutative annotation, we tell tool to ignore dependences But since we target distributed memory, annotation represents privatization rather than reordering Dynamic analysis for understanding, parallelization Rul et. al (2006) program

    29、mer manages communication Redux (2003) fine-grained dependence visualization Karkowski and Corporaal (1997) focus on data parallelism Inspector/executor for DOACROSS parallelism Rauchwerger (1998) survey,Conclusions,Dynamic analysis can be useful for parallelization Our tool is simple, transparent, and one of the first to extract coarse-grained pipeline parallelism from C programs Primary application: program understanding Secondary application: automatic parallelization Future work in improving soundness, automation,


    注意事项

    本文(A Practical Approach toExploiting Coarse-GrainedPipeline .ppt)为本站会员(花仙子)主动上传,麦多课文档分享仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。 若此文所含内容侵犯了您的版权或隐私,请立即通知麦多课文档分享(点击联系客服),我们立即给予删除!




    关于我们 - 网站声明 - 网站地图 - 资源地图 - 友情链接 - 网站客服 - 联系我们

    copyright@ 2008-2019 麦多课文库(www.mydoc123.com)网站版权所有
    备案/许可证编号:苏ICP备17064731号-1 

    收起
    展开