Cache-Aware Lock-Free Queues for Multiple Producers-.ppt
《Cache-Aware Lock-Free Queues for Multiple Producers-.ppt》由会员分享,可在线阅读,更多相关《Cache-Aware Lock-Free Queues for Multiple Producers-.ppt(42页珍藏版)》请在麦多课文档分享上搜索。
1、Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency,Anders Gidenstam Hkan Sundell Philippas Tsigas,Distributed Computing and Systems group, Department of Computer Science and Engineering, Chalmers University of Technology,School of business and informatics Unive
2、rsity of Bors,10/10/2018,Anders Gidenstam, University of Bors,2,Outline,Introduction Lock-free synchronization The Problem & Related work The new lock-free queue algorithm Experiments Conclusions,10/10/2018,Anders Gidenstam, University of Bors,3,Synchronization on a shared object,Lock-free synchroni
3、zation Concurrent operations without enforcing mutual exclusion Avoids: Blocking (or busy waiting), convoy effects, priority inversion and risk of deadlock Progress Guarantee At least one operation always makes progress,10/10/2018,Anders Gidenstam, University of Bors,4,Correctness of a concurrent ob
4、ject,Desired semantics of a shared data object Linearizability Herlihy & Wing, 1990 For each operation invocation there must be one single time instant during its duration where the operation appears to take effect. The observed effects should be consistent with a sequential execution of the operati
5、ons in that order.,10/10/2018,Anders Gidenstam, University of Bors,5,Correctness of a concurrent object,Desired semantics of a shared data object Linearizability Herlihy & Wing, 1990 For each operation invocation there must be one single time instant during its duration where the operation appears t
6、o take effect. The observed effects should be consistent with a sequential execution of the operations in that order.,Processes can read/write single memory words Synchronization primitives Built into CPU and memory system Atomic read-modify-write (i.e. a critical section of one instruction) Example
7、s: Compare-and-Swap, Load-Linked / Store-Conditional,System Model,10/10/2018,Anders Gidenstam, University of Bors,6,CPU,CPU,Shared Memory,A process Reads/writes may reach memory out of order Reads of own writes appear in program order Atomic synchronization primitive/instruction Single word Compare-
8、and-Swap Atomic Acts as memory barrier for the process own reads and writes All own reads/writes before are done before All own reads/writes after are done after The affected cache block is held exclusively,System Model: Memory Consistency,10/10/2018,Anders Gidenstam, University of Bors,7,10/10/2018
9、,Anders Gidenstam, University of Bors,8,Outline,Introduction Lock-free synchronization The Problem & Related work The new lock-free queue algorithm Experiments Conclusions,The Problem,Concurrent FIFO queue shared data object Basic operations: enqueue and dequeueDesired Properties Linearizable and Lo
10、ck-free Dynamic size (maximum only limited by available memory) Bounded memory usage (in terms of live contents) Fast on real systems,10/10/2018,Anders Gidenstam, University of Bors,9,B,C,E,D,A,F,G,Tail,Head,Related Work: Lock-free Multi-P/C Queues,Michael & Scott, 1996 Linked-list, one element/node
11、 Global shared head and tail pointers Tsigas & Zhang, 2001 Static circular array of elements Two different NULL values for distinguishing initially empty from dequeued elements Global shared head and tail indices, lazily updated Michael & Scott, 1996 + Elimination Moir, Nussbaum, Shalev & Shavit, 20
12、05 Same as the above + elimination of concurrent pairs of enqueue and dequeue when the queue is near empty Hoffman, Shalev & Shavit, 2007 Baskets queue Linked-list, one element/node Reduces contention between concurrent enqueues after conflict Needs stronger memory management than M&S (SLFRC or Bewa
13、re&Cleanup),10/10/2018,Anders Gidenstam, University of Bors,10,10/10/2018,Anders Gidenstam, University of Bors,11,Outline,Introduction Lock-free synchronization The Problem & Related work The new lock-free queue algorithm Experiments Conclusions,The Algorithm,Basic idea: Cut and unroll the circular
14、array queue Primary synchronization on the elements Compare-And-Swap (NULL1 - Value - NULL2 avoids the ABA problem) Head and tail both move to the right Need an “infinite” array of elementsu,10/10/2018,Anders Gidenstam, University of Bors,12,The Algorithm,Basic idea: Creating an “infinite” array of
15、elements. Divide into blocks of elements, and link them together New empty blocks added as needed Emptied blocks are marked deleted and eventually reclaimed Block fields: Elements, next, (filled, emptied flags), deleted flag. Linked chain of dynamically allocated blocks Lock-free memory management n
16、eeded for safe reclamation! Beware&Cleanup Gidenstam, Papatriantafilou, Sundell & Tsigas, 2009,10/10/2018,Anders Gidenstam, University of Bors,13,globalTailBlock,globalHeadBlock,*,The Algorithm,Thread local storage Last used Head block/index for Enqueue Tail block/index for Dequeue Reduces need to r
17、ead/update global shared variables,10/10/2018,Anders Gidenstam, University of Bors,14,globalTailBlock,globalHeadBlock,*,The Algorithm,Enqueue Find the right block (first via TLS, then TLS-next or globalHeadBlock) Search the block for the first empty element,10/10/2018,Anders Gidenstam, University of
18、 Bors,15,globalTailBlock,globalHeadBlock,scan,*,The Algorithm,10/10/2018,Anders Gidenstam, University of Bors,16,globalTailBlock,globalHeadBlock,Add with CAS,Enqueue Find the right block (first via TLS, then TLS-next or globalHeadBlock) Search the block for the first empty element Update element wit
19、h CAS (Also, the linearization point),*,The Algorithm,Dequeue Find the right block (first via TLS, then TLS-next or globalTailBlock) Search the block for the first valid element,10/10/2018,Anders Gidenstam, University of Bors,17,globalTailBlock,globalHeadBlock,scan,*,The Algorithm,Dequeue Find the r
20、ight block (first via TLS, then TLS-next or globalTailBlock) Search the block for the first valid element Remove with CAS, replace with NULL2 (linearization point),10/10/2018,Anders Gidenstam, University of Bors,18,globalTailBlock,globalHeadBlock,*,Remove with CAS,The Algorithm,Maintaining the chain
- 1.请仔细阅读文档,确保文档完整性,对于不预览、不比对内容而直接下载带来的问题本站不予受理。
- 2.下载的文档,不会出现我们的网址水印。
- 3、该文档所得收入(下载+内容+预览)归上传者、原创作者;如果您是本文档原作者,请点此认领!既往收益都归您。
下载文档到电脑,查找使用更方便
2000 积分 0人已下载
下载 | 加入VIP,交流精品资源 |
- 配套讲稿:
如PPT文件的首页显示word图标,表示该PPT已包含配套word讲稿。双击word图标可打开word文档。
- 特殊限制:
部分文档作品中含有的国旗、国徽等图片,仅作为作品整体效果示例展示,禁止商用。设计者仅对作品中独创性部分享有著作权。
- 关 键 词:
- CACHEAWARELOCKFREEQUEUESFORMULTIPLEPRODUCERSPPT

链接地址:http://www.mydoc123.com/p-379242.html