Asynchronous vs. Synchronous Design Techniques for NoCs.ppt
Asynchronous vs. Synchronous Design Techniques for NoCs
Robert Mullins
"The Status of the Network-on-Chip Revolution: Design Methods, Architectures and Silicon Implementation" (Tutorial), International Symposium on System-on-Chip, Tampere, Finland, November 14th, 2005.

2/67 Aims of Tutorial
- Highlight the wide range of system timing alternatives for NoCs
- Discuss the impact of the choice of timing regime on the architecture of NoC routers
- Contrast different approaches

3/67 Synchronous to Delay-Insensitive Approaches to System Timing
[Diagram labels: Synchronous; Delay Insensitive; Global; None; Timing Assumptions; Local Relative; Wire Delay; Less Detection; Sub-System; Local; Isochronic Forks; Multiple clocks; Pausible clocks and locally triggered clock pulses; Bundled Data; Quasi-Delay Insensitive; Local Clocks/Interaction with data (becoming aperiodic)]

4/67 System Timing
- Approaches to system timing are distinguished by what delay assumptions they make
- A number of different approaches to system timing may also be combined:
  - Globally-Asynchronous Locally-Synchronous (GALS), e.g. synchronous IP interconnected by an asynchronous network

Synchronous On-Chip Networks

6/67 Generic On-Chip Router

7/67 Synchronous Router Pipeline
- Router pipeline may be many stages
  - Increases communication latency
  - Can make packet buffers less effective
  - Incurs pipelining overheads

8/67 Speculative Router Architecture
- VC and switch allocation may be performed concurrently:
  - Speculate that waiting packets will be successful in acquiring a VC
  - Prioritize non-speculative requests over speculative ones
Li-Shiuan Peh and William J. Dally, "A Delay Model and Speculative Architecture for Pipelined Routers", In Proceedings HPCA01, 2001.

9/67 Single Cycle Speculative Router
R. D. Mullins, A. West and S. W. Moore, "Low-Latency Virtual-Channel Routers for On-Chip Networks", In Proceedings ISCA04.

10/67 Single Cycle Speculative Router
- Single cycle router made possible by use of speculation
- Clock period is almost unchanged (compared to pipelined design)
  - Approx. 30 FO4 (simple standard-cell design)
- Presence of clock simplifies design
  - Arbitration
    - Fast combinational matrix arbiters
    - Can easily be extended to handle priority traffic etc.
  - Speculation
    - Aided by the clear notion of a clock "cycle"
    - Simple abort logic (abort detection and actual abort)

11/67 Single Cycle Speculative Router
Lochside Chip (2004)
- 4x4 mesh network, 25 mm²
- Single cycle routers (router + link = 1 clock)
- Low common case latency
- 4 virtual-channels/input
- 80-bit links: 64-bit data + 16-bit control
- 250 MHz (worst-case PVT)
- 16 Gb/s/channel, 0.18 µm
[Figure labels: TILE; Traffic Generator, Debug & Test; R]
R. D. Mullins, A. West and S. W. Moore, "The design and implementation of a low-latency on-chip network", In Proceedings ASP-DAC06.

Beyond a Single Global Clock

13/67 Limitations of Fully-Synchronous Networks
1. Difficult to distribute clock
- Network spread over die & may have irregular layout
- Minimising skew costs complexity and power
- Alternatives/extensions to PLL and H-tree:
  - Clock deskewing techniques
  - Distributed Clock Generator (DCG)
  - Distributed PLLs
  - Standing-wave oscillators and rotary clock schemes
  - Resonant global clocks, optical clock distribution etc.

14/67 Limitations of Fully-Synchronous Networks
2. Single Network Clock Frequency
- Communicating synchronous IP blocks may operate at different and potentially adaptive clock frequencies
- What is the most appropriate network clock frequency?
- We don't want to have to generate and distribute a very high frequency clock in order to emulate an asynchronous network

15/67 Frequency Distribution
- Clock skew may force the system to be partitioned into multiple clock domains
- Can exploit the fact that only the phase of each router's clock differs: simple error-free clock-domain crossing possible (single clock source)

16/67 Router clocks derived from a single source
- Each router's clock may be generated from the global network clock, either by:
  - Clock division, or
  - Clock multiplication
- Clock domain crossing techniques can exploit known clock frequency relationships
Chakraborty and M. Greenstreet, "Efficient Self-Timed Interfaces for Crossing Clock Domains", In Proceedings ASYNC03.
L. F. G. Sarmenta, G. A. Pratt and S. A. Ward, "Rational Clocking", ICCD95.

17/67 Locally Generated Clocks (periodic & free-running)
- Can exploit knowledge about clocks (when crossing clock domains) even if all we know is that they are periodic. Examples:
  - Predictive synchronizers (Dally, Frank/Ginosar)
  - Asynchronous FIFOs (Chakraborty/Greenstreet)

18/67 Synchronous Routers with Asynchronous Links
- Synchronization:
  - Time safe: e.g. traditional 2 FF synchronizers
  - Value safe: clock pausing/data-driven clocks

19/67 Locally Clocked Routers/Asynchronous Interconnect (GALS style network)
- Can support asynchronous interconnects
  - No longer exploiting periodic nature of router clocks
  - Correct operation is independent of the delay of the link
- GALS interfaces with pausible clocks
  - If necessary the clock is stretched; data is always transferred reliably (value safe)
  - Need to construct local delay line

20/67 GALS Clock Pausing
- Simple GALS interface (receiver)
- Note: Req/Ack uses 2-phase handshaking protocol

21/67 GALS Multiple Inputs
- Clock is free running (although it can be paused)
- It is the clock that really determines if asynchronous data is transferred into the synchronous clock domain on a particular cycle
- Impact on performance in on-chip network requiring multiple input data/control ports?

22/67 GALS Stoppable Clock

23/67 Local aperiodic clock generation
- Discard free-running clock but retain a single delay assumption for router
- Options for clock pulse generation:
  - Use stoppable GALS interface and attempt to stop every cycle (overheads?)
  - Wait for data/null-data from all neighbours before generating pulse (global synchrony!)
  - Data driven clock
    - Traditional asynchronous bundled-data approach (with a single delay assumption for whole router)
    - Can still exploit synchronous router implementation

24/67 Data-Driven Local Clock
- Idea: if data at any input, sample all inputs
- Determine which inputs are to be admitted on next clock cycle (requires MUTEX)
- Ensure data that is not admitted is locked out for next clock cycle
- After all MUTEXes have made a decision (and never faster than the delay line!) generate a clock pulse
- Similarities to stoppable GALS interface and asynchronous priority arbiters

25/67 Data-Driven Clock Waveform

26/67 Data-Driven Clock Waveform
- Imagine data from two packets arriving at a single router node at different rates
- An aperiodic clock may be generated to minimise latency and power
- Minimum clock period set by delay line
- Value safe synchronization (no chance data is ever lost)

27/67 Data-Driven Local Clock (Updated: June 2006)
- May be generalized to n input ports
- Only the control interfaces are shown here (r1,a1 and r2,a2)
- grantn is simply used to control the latching of data at each input port (register enable)

28/67 Data-Driven Local Clock
- Simple implementation shown (work in progress)
  - Some small timing constraints
  - Performance tweaks possible
- Possible extensions
  - Force synchronization on subset of inputs
    - Some inputs must be present for clock to be generated
  - Generate additional clock pulses to handle pipelining
    - Counter & clock driven lock signal
  - Select a different clock period (delay line) depending on which inputs have been granted
    - Data-dependent clock period
See also: M. Krstic and E. Grass, "New GALS Technique for Datapath Architectures", PATMOS 2003 (and ASYNC05 paper).

29/67 Clocking alternatives for Synchronous
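
The data-driven local clock idea from slides 24/67 to 28/67 can be illustrated with a small behavioural model. This is only a sketch under simplifying assumptions that are not in the slides: MUTEX resolution is treated as instantaneous, sampling happens at the pulse instant, and each input admits at most one datum per pulse. The function name `data_driven_clock`, the port names and the time units are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple


def data_driven_clock(
    arrivals: Dict[str, List[float]], delay_line: float
) -> List[Tuple[float, List[str]]]:
    """Return (pulse_time, admitted_ports) for each generated clock pulse.

    arrivals maps a (hypothetical) input port name to the times at which
    data arrives on that port; delay_line is the minimum clock period.
    """
    # One FIFO of pending arrival times per input port.
    queues = {port: deque(sorted(times)) for port, times in arrivals.items()}
    pulses: List[Tuple[float, List[str]]] = []
    earliest_next = float("-inf")  # no pulse has been generated yet

    while any(queues.values()):
        # The oldest waiting datum on any input triggers the next pulse...
        first_request = min(q[0] for q in queues.values() if q)
        # ...but never sooner than one delay-line period after the last pulse.
        pulse_time = max(first_request, earliest_next)

        # Sample all inputs: data present at the pulse is admitted,
        # later data stays locked out until a following pulse.
        admitted = []
        for port in sorted(queues):
            q = queues[port]
            if q and q[0] <= pulse_time:
                q.popleft()
                admitted.append(port)

        pulses.append((pulse_time, admitted))
        earliest_next = pulse_time + delay_line

    return pulses


if __name__ == "__main__":
    demo = data_driven_clock({"in1": [0.0, 1.0, 5.0], "in2": [0.5, 9.0]}, 1.5)
    for time, ports in demo:
        print(f"pulse at t={time}: admitted {ports}")
```

With two inputs receiving data at different rates, the model produces pulses only when data is waiting, so the clock is aperiodic, while the spacing between pulses never drops below the delay-line period and no arrival is discarded (value safe).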