    Transient Fault Tolerance via Dynamic Process-Level .ppt

    Transient Fault Tolerance via Dynamic Process-Level Redundancy

    Alex Shye, Vijay Janapa Reddi, Tipp Moseley and Daniel A. Connors
    University of Colorado at Boulder
    Department of Electrical and Computer Engineering
    DRACO Architecture Research Group
    Workshop on Binary Instrumentation and Applications
    San Jose, CA, 10.22.2006

    Outline
      • Introduction
      • Background/Terminology
      • Software-centric Fault Detection
      • Process-Level Redundancy
      • Experimental Results
      • Conclusion

    Introduction
      • Process technology trends
          • Single-transistor error rate is expected to stay close to constant
          • Number of transistors is increasing exponentially with each generation
          • Transient faults will be a problem for microprocessors!
      • Hardware approaches
          • Specialized redundant hardware, redundant multi-threading
      • Software approaches
          • Compiler solutions: instruction duplication, control flow checking
          • Low-cost, flexible alternative, but with higher overhead
      • Goal: leverage the hardware parallelism available in SMT and CMP machines to improve the performance of software transient fault tolerance

    Background/Terminology
      • Types of transient faults (based upon outcome)
          • Benign faults
          • Silent Data Corruption (SDC)
          • Detected Unrecoverable Error (DUE)
              • True DUE
              • False DUE
      • Sphere of Replication (SoR)
          • Indicates the scope of fault detection and containment
          • Input replication
          • Output comparison

    Software-centric Fault Detection
      • Most previous approaches are hardware-centric
          • Even compiler approaches (e.g. EDDI, SWIFT)
      • A software-centric model can leverage the strengths of a software approach
          • Correctness is defined by software output
          • Ability to see the larger-scope effect of a fault
          • Ignores benign faults
      [Figure: system stack with the labels Processor, Cache, Memory, Devices, Application, Libraries, Operating System; the Hardware SoR is marked for hardware-centric fault detection and the Software SoR for software-centric fault detection]

    Process-Level Redundancy (PLR)
      • System Call Emulation Unit
          • Creates redundant processes
          • Barrier-synchronizes at all system calls
          • Enforces the SoR with input replication and output comparison
          • Emulates system calls to guarantee determinism among all processes
          • Detects and recovers from transient faults
      • Master process
          • The only process allowed to perform system I/O
      • Redundant processes
          • Identical address space, file descriptors, etc.
          • Not allowed to perform system I/O
      • Watchdog alarm
          • Occasionally a process will hang
          • Set at the beginning of barrier synchronization to ensure that all processes are alive
      [Figure: three App/Libs process stacks on top of the SysCall Emulation Unit and the Operating System, with the Watchdog Alarm]

    Enforcing SoR and Determinism
      • Input replication
          • All read events: read(), gettimeofday(), getrusage(), etc.
          • Return value from all system calls
      • Output comparison
          • All write events: write(), msync(), etc.
          • System call parameters
      • Maintaining determinism at system calls
          • Master process executes the system call
          • Redundant processes emulate it
              • Ignore some: rename(), unlink()
              • Execute a similar/altered system call
                  • Identical address space: mmap()
                  • Process-specific data: open(), lseek()

    Example of handling a read() system call
      [Figure: Master Process and Redundant Processes around a Barrier, with the step labels: Write cmd line parameters and syscall type to shmem; Compare syscall type and cmd line parameters; read(); Write resulting file offset and read buffer to shmem; Copy the read buffer from shmem; lseek() to correct file offset]

    Fault Detection and Recovery
      • PLR supports detection of and recovery from multiple faults by increasing the number of redundant processes and scaling the majority vote logic
      [Table: columns are Type of Error, Detection Mechanism, Recovery Mechanism; the rows did not survive extraction]

    Experimental Methodology
      • Uses a set of the SPEC2000 benchmarks
      • PLR prototype developed with Pin
          • Intercepts system calls to implement PLR
      • Fault injection
          • Gather an instruction count profile
          • Use the profile to generate a test case
              • Test case: an instruction and a particular execution of that instruction to fault
          • Run with Pin in JIT mode and use IARG_RETURN_REGS to alter a random bit of the instruction's source or destination registers
      • Fault coverage
          • Use the fault injector on test inputs, generating 1000 test cases per benchmark
          • specdiff in the SPEC2000 harness determines output correctness
      • PLR performance
          • Run PLR (in Probe mode, using Pin Probes) on reference inputs with two redundant processes
          • 4-way SMP machine; each processor is hyper-threaded
          • Use sched_setaffinity() to simulate various hardware platforms

    Fault Coverage
      • Watchdog timeouts are very rare, so they are not shown
      • PLR detects all Incorrect and Failed cases
      • Effectively detects relevant faults and ignores benign faults
      • Floating-point correctness question (e.g. 168.wupwise, 172.mgrid)
          • The results actually differ, but the difference is within specdiff's tolerance

    Performance
      • Performance for a single processor (PLR 1x1), 2 SMT processors (PLR 2x1) and a 4-way SMP (PLR 4x1)
      • Slowdown for the 4-way SMP is only 1.26x
      • Should be better on a CMP with a faster processor interconnect

    Conclusion
      • Presents a different way to use existing general-purpose SMT and CMP machines for transient fault tolerance
      • Differentiates between hardware-centric and software-centric fault detection models
          • Shows how the software-centric model can be effective in ignoring benign faults
      • PLR on a 4-way SMP executes with only a 26% slowdown, a 36% improvement over the fastest compiler technique
      • Future work
          • Implementation in a run-time system allows for dynamically altering the amount of fault tolerance
          • A simple PLR model is presented; work remains on handling interrupts, shared memory, and threads (the tough one)

    Questions?
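
    The output comparison and majority vote mentioned under Fault Detection and Recovery can be illustrated with a small sketch. This is only an illustration in Python, not the PLR prototype (which is built on Pin). The function name `majority_vote` and the choice to represent each process's outgoing data as a `bytes` value are assumptions made for this example.

```python
from collections import Counter


def majority_vote(outputs):
    """Compare the outputs of all redundant processes at a barrier.

    `outputs` is a list with one hashable value (e.g. bytes) per process.
    Returns (winning_output, faulty_indices).
    Raises ValueError when no strict majority exists, i.e. a fault was
    detected but there are too few agreeing processes to recover.
    """
    if not outputs:
        raise ValueError("no outputs to compare")
    counts = Counter(outputs)
    winner, votes = counts.most_common(1)[0]
    # A strict majority is needed before the winner can be trusted.
    if votes * 2 <= len(outputs):
        raise ValueError("mismatch detected, no majority to recover from")
    faulty = [i for i, out in enumerate(outputs) if out != winner]
    return winner, faulty


if __name__ == "__main__":
    # Three processes, one of which produced a corrupted write buffer.
    winner, faulty = majority_vote([b"result=42", b"result=42", b"result=46"])
    print(winner, faulty)
```

    With three processes a single faulty process can be outvoted. With only two, a mismatch can be detected but not resolved, which is why the sketch raises an error in that case; tolerating more simultaneous faults means adding processes, as the slide notes.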

