Name: Ahmed Hemani
Date and Time: June 6, 2018 at 4:00 p.m.
Location: Engineering Hall 2430
Committee: Nikil Dutt (Host)
Abstract:
Norman Jouppi while introducing the record long list of authors of Google’s TPU paper at ISCA 2017 remarked – one needs a village to make a chip. This talk proposes a VLSI design method that holds promise of getting the same work done with just a family. This new VLSI design method is based on two principles, one is to raise the abstraction of physical design to micro-architecture level from the present day boolean level standard cells and the second is to adopt a synchoros VLSI design style that enables composition by abutment to eliminate logic and physical synthesis for the end user.
The proposed method raises the abstraction of physical design to micro-architecture level by adopting coarse grain reconfigurable logic for computation, storage and interconnect. All variations in function, capacity, architecture and degree of parallelism are realised by clustering and configuration of the coarse grain reconfigurable cells that we call as SiLago blocks – Silicon Lego Blocks. The micro-architecture level SiLago blocks replace boolean level standard cells as the atomic building blocks of the VLSI systems. These CGRA fabrics are domain specific – inner modem, outer modem, scratchpad memory, dynamic programming, NOCs, infrastructural elements like RISC system controller, PLL/CGU, RGU, DRAM control etc. Each CGRA fabric is holistically customized for its domain, not just for computation but also for control, interconnect, address generation, local storage etc.
The CGRAs provide an architecturally regular basis for a synchoros VLSI design style. Synchoricity is derived from the Greek word “choros” for space. The way synchronous systems divide time uniformly with clock ticks and enables temporal composition, synchoros systems divide space uniformly with grids and enables spatial composition. All SiLago blocks are synchoros or ratiochoros and bring out all their interconnects to periphery on grid at right place and on right metal layer to enable composition by abutment of valid neighbours.
The net result of adopting the above two principles is that as soon as a design is refined down to micro-architecture level, the dimension and position of every single wire segment and transistor is known with certainty in the 100 million gate design. Unlike standard cells, where the position of standard cells and wires connecting them is left to physical synthesis, in synchoros design style these aspects are parametrically hardened. Global wires like power grid, clocks, resets, NOCs are not synthesised, they emerge as a result of the abutment process. This enables automation of system models down to GDSII to create custom, spatially distributed functional hardware, i.e., ASICs with just a family.
A proof of concept synthesis flow exists for transforming applications, hierarchy of algorithms, to GDSII and a path for system level to GDSII is well defined and is being implemented. System-level implies interacting applications.
Biography:
Ahmed Hemani is Professor in Electronic Systems Design at School of ICT, KTH, Kista, Sweden. His current areas of research interests are massively parallel architectures and design methods and their applications to scientific computing and autonomous embedded systems inspired by brain. In past he has contributed to high-level synthesis – his doctoral thesis was the basis for the first high-level
synthesis product introduced by Cadence called visual architect. He has also pioneered the Networks-on-chip concept and has contributed to clocking and low power architectures and design methods. He has extensively worked in industry including National Semiconductors, ABB, Ericsson, Philips Semiconductors, Newlogic. He has also been part of three start-ups.