    Dynamo- Amazon's Highly Available Key-value .ppt

    • Resource ID: 374393       Size: 2.05 MB        Pages: 26
    • Format: PPT
    Dynamo: Amazon's Highly Available Key-value Store
    Giuseppe DeCandia et al.
    Jagrut Sharma, jagrutshusc.edu
    CSCI-572 (Prof. Chris Mattmann), 20-Jul-2010

    Outline of Talk
      • Motivation (1)
      • Contribution (1)
      • Context (1)
      • Background (3)
      • Related Work (2)
      • System Architecture (7)
      • Implementation (1)
      • Experiences, Results & Lessons Learnt (4)
      • Conclusion (1)
      • Pros (1)
      • Cons (1)
      • Questions (1)

    Motivation
      • Tens of millions of customers
      • Tens of thousands of servers
      • Globally distributed data centers
      • 24 * 7 * 365 operations
      • Performance, Reliability, Efficiency, Scalability
      • Financial consequences, Customer Trust
      • Data management

    Contribution
      • Evaluation of how different techniques can be combined to provide a highly available system
      • Demonstration of how an eventually consistent storage system (like Dynamo) can be used in a production environment with demanding applications
      • Provision of tuning methods to meet the requirements of production systems with very strict performance demands

    Context
      • Amazon's e-commerce platform
          • Highly decentralized
          • Loosely coupled
          • Service-oriented architecture
          • Hundreds of services
          • Millions of components
          • Failure is a way of life
      • Critical requirement: always-available storage
      • Storage techniques
          • S3 (Amazon Simple Storage Service)
          • Dynamo
      • Dynamo
          • Highly available and scalable distributed data store for Amazon's platform
          • Provides a primary-key-only interface for selected applications (e.g. shopping cart)
          • Combines multiple high-performance techniques and algorithms
          • Excellent performance in real-world scenarios

    Background (1 of 3)
      • E-commerce platform services: stateless and stateful
      • Relational databases are overkill for stateful lookups by primary key
      • Dynamo
          • Simple key/value interface
          • Highly available
          • Efficient in resource usage
          • Scalable
      • Each service that uses Dynamo runs its own Dynamo instances
      • Dynamo's target applications
          • Store small objects (usually less than 1 MB)
          • Operate with weaker consistency if this gives high availability
          • Perform simple reads and writes to a data item uniquely identified by a key
          • Issue no query operations that span multiple data items
      • Services use Dynamo to give priority to latency and throughput
      • Amazon's SLAs are expressed and measured at the 99.9th percentile of the distribution (in contrast to the common industry approach of using average, median and expected variance)

    Background (2 of 3)
    Assumptions About Dynamo
      • Used only by Amazon's internal services
      • Operation environment is non-hostile
      • There are no security-related requirements (e.g. authentication, authorization)
      • Each service uses its distinct instance of Dynamo
      • Dynamo's initial design targets a scale of up to hundreds of storage hosts

    Background (3 of 3)
      • [Figure: SOA of Amazon's platform]
    Dynamo Design Considerations
      • Conflict resolution between replication and consistency: eventually consistent data store
      • When to resolve update conflicts: “always writeable” data store
      • Who performs conflict resolution: both data store and application allowed
      • Incremental scalability at node level
      • Symmetry among nodes
      • Favors decentralization
      • Capable of exploiting infrastructure heterogeneity

    Related Work (1 of 2)
    Peer to Peer Systems
      • Tackle problems of data storage and distribution
      • Only support flat namespaces
      • Unstructured P2P: Freenet, Gnutella
          • Search query floods the network
      • Structured P2P systems: Pastry, Chord, Oceanstore, PAST
          • Employ a globally consistent query routing protocol
          • Bounded number of hops
          • Maintain local routing tables
          • Provide rich storage services with conflict resolution
    Distributed File Systems and Databases
      • Support both flat and hierarchical namespaces
      • Ficus, Coda: high availability at the expense of consistency
      • Farsite: high availability and scalability using replication
      • Google File System: master server, chunkservers
      • Bayou: distributed RDBMS, disconnected operations
      • Antiquity: wide-area distributed storage system
      • BigTable: distributed storage system for structured data

    Related Work (2 of 2)
    Dynamo Vs Other Systems
      • Targeted mainly at apps that need an “always writeable” data store
      • Built for an infrastructure within a single administrative domain where all nodes are assumed to be trusted
      • Applications using Dynamo do not require support for hierarchical namespaces or complex relational schema
      • Built for latency-sensitive applications that require at least 99.9% of read and write operations to be performed within a few hundred milliseconds
      • Avoids routing requests through multiple nodes, and is hence similar to a zero-hop distributed hash table

    System Architecture (1 of 7)
      • [Table: List of Techniques Used by Dynamo & Their Advantages]

    System Architecture (2 of 7)
    System Interface
      • get(key)
          • Locates the object replicas associated with key in the storage system
          • Returns a single object, or a list of objects with conflicting versions, plus a context
      • put(key, context, object)
          • Determines where the replicas of the object should be placed based on the associated key
          • Writes replicas to disk
      • context
          • Encodes system metadata about the object
          • Includes additional information (e.g. object version)
      • key, object: each is treated as an opaque array of bytes
      • MD5 hash of the key: a 128-bit identifier, used to determine the storage nodes that are responsible for serving the key

    System Architecture (3 of 7)
    Partitioning Algorithm
      • Provides a mechanism to dynamically partition the data over the set of nodes (i.e. storage hosts)
      • Uses a variant of consistent hashing: the output range of a hash function is treated as a fixed circular space or ring, where the largest hash value wraps around to the smallest hash value
      • Advantage: departure or arrival of a node only affects its immediate neighbors
      • Limitation 1: leads to non-uniform data and load distribution
      • Limitation 2: oblivious to heterogeneity in the performance of nodes
      • A single node is mapped to multiple points in the ring, i.e. virtual nodes
      • Advantages of virtual nodes
          • Graceful handling of failure of a node
          • Easy accommodation of a new node
          • Heterogeneity in physical infrastructure can be exploited

    System Architecture (4 of 7)
    Replication
      • Each data item is replicated at N hosts
      • N is configured per instance
      • Each node is responsible for the region of the ring between it and its Nth predecessor
      • Preference list: list of nodes responsible for storing a particular key
    Data Versioning
      • Eventual consistency: allows updates to be propagated to all replicas asynchronously
      • put() may return to the caller before the update has been applied at all replicas
      • get() may return an object that does not have the latest updates
      • Multiple versions of an object can be present in the system at the same time
      • Syntactic reconciliation: performed by the system
      • Semantic reconciliation: performed by the client
      • Vector clock: a list of (node, counter) pairs, used for capturing causality between different versions of the same object. One vector clock per version per object.

    System Architecture (5 of 7)
    Execution of get() and put() Operations
      • Any storage node in Dynamo is eligible to receive client get() and put() operations for any key
      • A client can select a node using
          • a generic load balancer
          • a partition-aware client library
      • Coordinator: the node handling a read or write operation, typically the first among the top N nodes in the preference list
      • A consistency protocol is used to maintain consistency among replicas. Two key configurable values are:
          • R: min. no. of nodes that must participate in a successful read operation
          • W: min. no. of nodes that must participate in a successful write operation
          • R + W > N is preferable

    System Architecture (6 of 7)
    Handling Failures: Hinted Handoff
      • Mechanism to ensure that read and write operations do not fail due to temporary node or network failures
      • All read and write operations are performed on the first N healthy nodes from the preference list, which may NOT always be the first N nodes encountered while walking the consistent hashing ring
      • Each object is replicated across multiple data centers, which are connected through high-speed network links
    Handling Permanent Failures: Replica Synchronization
      • Dynamo implements an anti-entropy protocol to keep replicas synchronized. It uses Merkle trees.
      • Merkle tree: a hash tree where leaves are hashes of the values of individual keys

    System Architecture (7 of 7)
    Membership and Failure Detection
      • Explicit mechanism available to initiate the addition and removal of nodes from a Dynamo ring
      • To prevent logical partitions, some Dynamo nodes play the role of seed nodes
      • Seeds: nodes that are discovered by an external mechanism and known to all nodes
      • Failure detection of communication is done in a purely local manner
      • Gossip-based distributed failure detection and membership protocol

    Implementation
      • Storage node components: Request Coordination, Membership & Failure Detection, Local Persistence Engine
      • Local Persistence Engine
          • Pluggable storage engines
              • Berkeley Database (BDB) Transactional Data Store
              • BDB Java Edition
              • MySQL
              • In-memory buffer with persistent backing store
          • Chosen based on the application's object size distribution
      • Request Coordination
          • Built on top of an event-driven messaging substrate
          • Uses Java NIO
          • Coordinator executes client read and write requests
          • State machines are created on nodes serving requests
          • Each state machine instance handles exactly one client request
          • State machine contains the entire process and failure handling logic

    Experiences, Results & Lessons Learnt (1 of 4)
    Main Dynamo Usage Patterns
      • Business-logic-specific reconciliation, e.g. merging different versions of a customer's shopping cart
      • Timestamp-based reconciliation, e.g. maintaining customers' session information
      • High-performance read engine, e.g. maintaining product catalog and promotional items
    Client applications can tune parameters to achieve specific objectives:
      • N (performance): no. of hosts a data item is replicated at
      • R (availability): min. no. of participating nodes in a successful read operation
      • W (durability): min. no. of participating nodes in a successful write operation
      • Commonly used configuration: (N, R, W) = (3, 2, 2)
    Dynamo exposes data consistency and reconciliation logic to developers.
    Dynamo adopts a full membership model: each node is aware of the data hosted by its peers.

    Experiences, Results & Lessons Learnt (2 of 4)
      • Typical SLA of a service using Dynamo: 99.9% of the read and write requests execute within 300 ms
      • Balancing Performance and Durability
      • [Figure: Average & 99.9th percentile latencies of Dynamo's read and write operations during a period of 30 days]
      • [Figure: Comparison of performance of 99.9th percentile latencies for buffered vs. non-buffered writes over 24 hours]

    Experiences, Results & Lessons Learnt (3 of 4)
    Ensuring Uniform Load Distribution
      • Dynamo uses consistent hashing to partition its key space across its replicas and to ensure uniform load distribution
      • Node “in-balance”: the request load for the node deviates from the average load by less than a certain threshold. Otherwise the node is “out-of-balance”.
      • Imbalance ratio = Nodes out-of-balance / Total Nodes
      • [Figure: Node imbalance & Workload]
      • [Figure: Comparison of load distribution efficiency of different strategies]

    Experiences, Results & Lessons Learnt (4 of 4)
    Three strategies for load distribution
      • T random tokens per node and partition by token value
      • T random tokens per node and equal-sized partitions
      • Q/S tokens per node, equal-sized partitions (S = number of nodes, Q = number of partitions)
    Divergent versions of a data item (rarely) arise in two scenarios:
      • The system is facing failure scenarios (node/data center/network)
      • A large number of concurrent writers update a single data item
    Server-driven coordination: client requests are uniformly assigned to nodes in the ring by a load balancer.
    Client-driven coordination: client applications use a library to perform request coordination locally.

    Conclusion
    Dynamo:
      • Is a highly available and scalable data store
      • Is used for storing the state of a number of core services of Amazon's e-commerce platform
      • Has provided desired levels of availability and performance and has been successful in handling:
          • Server failures
          • Data center failures
          • Network partitions
      • Is incrementally scalable
      • Sacrifices consistency under certain failure scenarios
      • Extensively uses object versioning and application-assisted conflict resolution
      • Allows service owners to:
          • scale up and down based on their current request load
          • customize their storage system to meet desired performance, durability and consistency SLAs by tuning the N, R, W parameters
    Decentralized techniques can be combined to provide a single highly available system.

    Pros
      • Excellent description of core distributed systems techniques used in Dynamo: partitioning, replication, versioning, membership, failure handling, scaling
      • Liberal use of diagrams, charts and tables to explain concepts
      • Real-world examples are provided to help the reader understand and appreciate the theoretical concepts
      • Theoretical and implementation-level differences are clearly explained
      • Exhaustive list of references for the interested researcher
      • Well-written paper with logical transitions from one topic to the next

    Cons
      • Little description of supporting techniques used in Dynamo for: state transfer, concurrency & job scheduling, request marshalling, request routing, system monitoring and alarming
      • Certain problems that are theoretically possible have not been investigated in detail, since they have not been encountered in production systems
      • Sophisticated comparison with existing systems has not been provided
      • To protect Amazon's business interests, certain parts of the system are either not fully described or described only at a very high level
      • Future work and possible extensions have not been mentioned clearly

    Questions
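
The partitioning and replication ideas from System Architecture (3 of 7) and (4 of 7) can be sketched in a few lines of Python. This is an illustrative sketch only, not Dynamo's implementation: the `Ring` class, the `tokens_per_node` parameter and the `"node#i"` token labels are assumptions made for the example. It places each physical node at several points (virtual nodes) on an MD5-based ring and builds a preference list by walking clockwise until N distinct physical nodes are found.

```python
import bisect
import hashlib


def ring_position(label: str) -> int:
    """Map a string to a point on the ring via its 128-bit MD5 digest."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical consistent-hashing ring with virtual nodes."""

    def __init__(self, tokens_per_node: int = 8):
        self.tokens_per_node = tokens_per_node
        self._tokens = []  # sorted list of (position, physical node)

    def add_node(self, node: str) -> None:
        # One physical node occupies several points on the ring.
        for i in range(self.tokens_per_node):
            bisect.insort(self._tokens, (ring_position(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._tokens = [t for t in self._tokens if t[1] != node]

    def preference_list(self, key: str, n: int) -> list:
        """Walk clockwise from the key, collecting n distinct physical nodes."""
        if not self._tokens:
            return []
        # First token at or after the key's position; wraps around via modulo.
        start = bisect.bisect_left(self._tokens, (ring_position(key), ""))
        nodes = []
        for step in range(len(self._tokens)):
            _, node = self._tokens[(start + step) % len(self._tokens)]
            if node not in nodes:
                nodes.append(node)
                if len(nodes) == n:
                    break
        return nodes
```

With four nodes and N = 3, removing the one node that is not on a key's preference list leaves that list unchanged, which is the "only neighbors are affected" property the slide describes.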

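
The data versioning bullets in System Architecture (4 of 7) can be made concrete with a small vector clock sketch. It assumes a clock is a plain dict from node name to counter, and the function names `increment`, `descends` and `reconcile` are hypothetical. `reconcile` performs only syntactic reconciliation: it drops versions that are ancestors of another version and returns the concurrent ones, which a client would then merge semantically.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of the clock with this node's counter advanced by one."""
    new_clock = dict(clock)
    new_clock[node] = new_clock.get(node, 0) + 1
    return new_clock


def descends(a: dict, b: dict) -> bool:
    """True if version a has seen every update recorded in version b."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


def reconcile(versions: list) -> list:
    """Syntactic reconciliation over a list of (clock, value) pairs.

    A version is dropped when some other version with a different clock
    descends from it. Whatever remains is mutually concurrent.
    """
    survivors = []
    for i, (clock, value) in enumerate(versions):
        dominated = any(
            j != i and other != clock and descends(other, clock)
            for j, (other, _) in enumerate(versions)
        )
        if not dominated and (clock, value) not in survivors:
            survivors.append((clock, value))
    return survivors


# Node names Sx, Sy, Sz are placeholders for three storage nodes.
d1 = increment({}, "Sx")   # first write, coordinated by Sx
d2 = increment(d1, "Sx")   # second write, also via Sx
d3 = increment(d2, "Sy")   # update of d2 coordinated by Sy
d4 = increment(d2, "Sz")   # concurrent update of d2 coordinated by Sz
```

Here `d3` and `d4` both descend from `d2` but not from each other, so a read that sees all three versions would return `d3` and `d4` for the client to merge.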

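
The preference for R + W > N in System Architecture (5 of 7) follows from a counting argument: if a read contacts R replicas and a write contacts W replicas out of N, the two sets must share at least one replica whenever R + W > N. The brute-force check below is a sketch written for illustration (the function name `quorums_always_overlap` is an assumption, not part of Dynamo); it enumerates every possible read set and write set and confirms the overlap.

```python
from itertools import combinations


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """True if every read set of size r intersects every write set of size w."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )
```

For the commonly used configuration (N, R, W) = (3, 2, 2), `quorums_always_overlap(3, 2, 2)` is true, whereas (3, 1, 1) allows a read to miss the replica that took the latest write.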