    The TAU Performance Technology for Complex Parallel .ppt


The TAU Performance Technology for Complex Parallel Systems
(Performance Analysis Bring Your Own Code Workshop, NRL Washington D.C.)
Sameer Shende, Allen D. Malony, Robert Bell
University of Oregon
{sameer, malony, bertie}@cs.uoregon.edu

Outline
  • Motivation
  • Part I: Instrumentation
  • Part II: Measurement
  • Part III: Analysis Tools
  • Conclusion

TAU Performance System Framework
  • Tuning and Analysis Utilities
  • Performance system framework for scalable parallel and distributed high-performance computing
  • Targets a general complex system computation model
      • nodes / contexts / threads
      • Multi-level: system / software / parallelism
      • Measurement and analysis abstraction
  • Integrated toolkit for performance instrumentation, measurement, analysis, and visualization
      • Portable, configurable performance profiling/tracing facility
      • Open software approach
  • University of Oregon, LANL, FZJ Germany
  • http://www.cs.uoregon.edu/research/paracomp/tau

[Figure: TAU Performance System Architecture]
[Figure: paraprof]

TAU Analysis
  • Parallel profile analysis
      • pprof: parallel profiler with text-based display
      • paraprof: graphical, scalable, parallel profile analysis and display
  • Trace analysis and visualization
      • Trace merging and clock adjustment (if necessary)
      • Trace format conversion (ALOG, SDDF, VTF, Paraver)
      • Trace visualization using Vampir (Pallas/Intel)

Pprof Output (ESMF CoupledFlowSolver)
  • IBM AIX
  • F95, C++, C, MPI
  • Profile - Node - Context - Thread
  • Events - code - MPI

Terminology Example
  • For routine “int main( )”:
      • Exclusive time: 100-20-50-20 = 10 secs
      • Inclusive time: 100 secs
      • Calls: 1 call
      • Subrs (no. of child routines called): 3
      • Inclusive time/call: 100 secs

    int main( )
    { /* takes 100 secs */
        f1(); /* takes 20 secs */
        f2(); /* takes 50 secs */
        f1(); /* takes 20 secs */
        /* other work */
    }
    /* Time can be replaced by counts */
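The arithmetic in the terminology example can be checked with a small sketch. The `Routine` class and its field names are illustrative assumptions, not part of TAU; the numbers are the ones given for `main`, `f1`, and `f2` above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Routine:
    """One profiled routine call with its inclusive time and child calls.

    Hypothetical helper for illustration only; this is not a TAU data structure.
    """
    name: str
    inclusive: float                                  # time including child calls
    children: List["Routine"] = field(default_factory=list)

    @property
    def exclusive(self) -> float:
        # Exclusive time = inclusive time minus the inclusive time of direct children.
        return self.inclusive - sum(child.inclusive for child in self.children)

    @property
    def subrs(self) -> int:
        # Number of child routine calls made directly from this routine.
        return len(self.children)


# main() takes 100 secs and calls f1 (20), f2 (50), and f1 (20) again.
main = Routine("main", 100.0, [
    Routine("f1", 20.0),
    Routine("f2", 50.0),
    Routine("f1", 20.0),
])
calls = 1  # main is called once

print("exclusive:", main.exclusive)
print("inclusive:", main.inclusive)
print("subrs:", main.subrs)
print("inclusive/call:", main.inclusive / calls)
```

As the slide notes, the same bookkeeping applies if the time values are replaced by counts.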

Performance Analysis and Visualization
  • Analysis of parallel profile and trace measurement
  • Parallel profile analysis
      • ParaProf
      • Cube Profile Browser (UTK, FZJ)
      • Profile generation from trace data
  • Performance data management framework (PerfDMF)
  • Parallel trace analysis
      • Translation to VTF 3.0 and EPILOG
      • Integration with VNG (Technical University of Dresden)
  • Online parallel analysis and visualization

TAU's ParaProf Framework Architecture
  • Portable, extensible, and scalable tool for profile analysis
  • Try to offer “best of breed” capabilities to analysts
  • Built as a profile analysis framework for extensibility

Profile Manager Window
  • Structured AMR toolkit (SAMRAI++), LLNL

[Figure: Paraprof: CoupledFlowApp (ESMF) on 4 Nodes]
[Figure: Paraprof Mean Profile (4 nodes)]
[Figure: Individual Node (0) Profile in Paraprof]
[Figure: MPI Routines]
[Figure: Text Profile Window]

k-Level Callpath Implementation in TAU
  • TAU maintains a performance event (routine) callstack
  • Profiled routine (child) looks in callstack for parent
      • Previous profiled performance event is the parent
      • A callpath profile structure is created the first time the parent calls
      • TAU records parent in a callgraph map for child
      • String representing k-level callpath used as its key
      • “a( )=>b( )=>c()” : name for time spent in “c” when called by “b” when “b” is called by “a”
  • Map returns pointer to callpath profile structure
      • k-level callpath is profiled using this profiling data
      • Set environment variable TAU_CALLPATH_DEPTH to depth
  • Built upon TAU's performance mapping technology
  • Measurement is independent of instrumentation
  • Use -PROFILECALLPATH to configure TAU

[Figure: k-Level Callpath Implementation in TAU]
[Figure: Examining Callpaths]
[Figure: Unique Callpaths]
[Figure: Gprof Style Parent, Routine, Children Display]
[Figure: Clickable Callpath Entities]
[Figure: Paraprof]
[Figure: Tracking I/O on Node 0 in ESMF]
[Figure: Calling Path for MPI_Recv( )]
[Figure: CUBE (UTK, FZJ) Browser, Sept. 2004]
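The k-level callpath bookkeeping from the implementation slide (a callstack, plus a map keyed by a string naming the last k routines) might look roughly like the following. This is a minimal sketch under stated assumptions, not TAU's actual implementation: the `CallpathProfiler` class, its `enter`/`exit` methods, and the injected `clock` are all hypothetical names, and `depth` plays the role that TAU_CALLPATH_DEPTH plays in TAU.

```python
class CallpathProfiler:
    """Toy k-level callpath profiler (hypothetical; not TAU's API)."""

    def __init__(self, depth, clock):
        self.depth = depth      # k: number of callstack levels used in the key
        self.clock = clock      # callable returning the current time
        self.callstack = []     # names of currently active profiled routines
        self.starts = []        # start time of each active routine
        self.profiles = {}      # callpath key -> {"calls": n, "inclusive": t}

    def _key(self):
        # Join the last k callstack entries, e.g. "a() => b() => c()".
        return " => ".join(self.callstack[-self.depth:])

    def enter(self, name):
        self.callstack.append(name)
        self.starts.append(self.clock())
        # The profile structure is created the first time this callpath occurs.
        self.profiles.setdefault(self._key(), {"calls": 0, "inclusive": 0.0})

    def exit(self):
        elapsed = self.clock() - self.starts.pop()
        profile = self.profiles[self._key()]   # map lookup by callpath string
        profile["calls"] += 1
        profile["inclusive"] += elapsed
        self.callstack.pop()


# Deterministic fake clock: each reading advances time by one unit.
ticks = iter(range(100))
prof = CallpathProfiler(depth=2, clock=lambda: float(next(ticks)))

prof.enter("a()")
prof.enter("b()")
prof.enter("c()")   # recorded under "b() => c()"
prof.exit()
prof.exit()
prof.enter("c()")   # recorded under "a() => c()"
prof.exit()
prof.exit()

for key, profile in prof.profiles.items():
    print(key, profile)
```

With a depth of 2, time spent in `c()` is attributed separately depending on whether `b()` or `a()` called it, which is the distinction the callpath displays above rely on.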

Using TAU with Vampir (Intel Trace Analyzer)
  • Configure TAU with -TRACE option
      % configure -TRACE -mpi
  • Execute application
      % poe CoupledFlowApp -procs 4
  • This generates TAU traces and event descriptors
  • Merge all traces using tau_merge
      % tau_merge *.trc app.trc
  • Convert traces to Vampir Trace format using tau_convert
      % tau_convert -pv app.trc tau.edf app.pv
      Note: Use -vampir instead of -pv for multi-threaded traces
  • Load generated trace file in Vampir
      % vampir app.pv

[Figure: Global Timeline Display with Parallelism View]
[Figure: Vampir: Zooming In]
[Figure: Vampir: IO on Node 0]
[Figure: Vampir: Communication Matrix Display]
[Figure: Vampir: Calltree View]
[Figure: Summary Chart]

TAU Performance System Status
  • Computing platforms (selected)
      • IBM SP / pSeries, SGI Origin 2K/3K, Cray T3E / SV-1 / X1, HP (Compaq) SC (Tru64), Sun, Hitachi SR8000, NEC SX-5/6, Linux clusters (IA-32/64, Alpha, PPC, PA-RISC, Power, Opteron), Apple (G4/5, OS X), Windows
  • Programming languages
      • C, C++, Fortran 77/90/95, HPF, Java, OpenMP, Python
  • Thread libraries
      • pthreads, SGI sproc, Java, Windows, OpenMP
  • Compilers (selected)
      • Intel KAI (KCC, KAP/Pro), PGI, GNU, Fujitsu, Sun, Microsoft, SGI, Cray, IBM (xlc, xlf), Compaq, NEC, Intel

Concluding Remarks
  • Complex parallel systems and software pose challenging performance analysis problems that require robust methodologies and tools
  • To build more sophisticated performance tools, existing proven performance technology must be utilized
  • Performance tools must be integrated with software and systems models and technology
      • Performance engineered software
      • Function consistently and coherently in software and system environments
  • TAU performance system offers robust performance technology that can be broadly integrated

Support Acknowledgements
  • Department of Energy (DOE)
      • Office of Science contracts
      • University of Utah DOE ASCI Level 1 sub-contract
      • DOE ASCI Level 3 (LANL, LLNL)
  • NSF National Young Investigator (NYI) award
  • Research Centre Juelich
      • John von Neumann Institute for Computing
      • Dr. Bernd Mohr
  • Los Alamos National Laboratory

