Final Defense – Abbas Banaiyanmofrad
May 20, 2015
3pm – 5pm
Donald Bren Hall 3011 Conference Room
Nikil Dutt (Chair), Alex Nicolau, Alex Veidenbaum
Resilient On-Chip Memory Design in the Nano Era
Aggressive technology scaling in the nano-scale regime makes chips more susceptible to failures. This causes multiple reliability challenges in the design of modern chips, including manufacturing defects, wear-out, and parametric variations. As the number, amount, and hierarchy of on-chip memory blocks increase in emerging computing systems, the reliability of the memory subsystem becomes an increasingly challenging design issue. Existing resilient memory design schemes are unable to effectively address the key features of scalability, interconnect-awareness, and cost-effectiveness for these platforms. In this thesis, we propose different approaches to address resilient on-chip memory design in computing systems ranging from traditional single-core processors to emerging many-core platforms. We classify our proposed approaches into five main categories: 1) flexible and low-cost approaches to protect cache memories in single-core processors against permanent faults and transient errors, 2) scalable fault-tolerant approaches to protect last-level caches with non-uniform cache access in chip multiprocessors, 3) interconnect-aware cache protection schemes in network-on-chip architectures, 4) application-aware memory resiliency for the approximate computing era, and 5) system-level design space exploration, analysis, and optimization for redundancy-aware on-chip memory resiliency in many-core platforms.