Name: Jun Yong Shin
Date: May 18th, 2015
Time: 2:00PM
Location: EH2430, Harut colloquia room
Committee Chair: Nikil Dutt
Abstract:
Over the last few decades, chip performance has increased steadily thanks to continuous and aggressive technology scaling. However, scaling also leaves chips vulnerable to several issues at once: high power densities in particular areas of a chip can produce hotspots and thermal gradients, which can permanently damage the chip and reduce the reliability of the entire system that uses it. As a result, a large number of dynamic thermal management solutions have been proposed in recent years for multi-core architectures, and accurate temperature information over the entire chip area has become indispensable, especially for fine-grain dynamic thermal management. On-chip thermal sensors have naturally come to play an important role in providing accurate information on the temperature distribution of a chip, but some issues remain regarding their allocation. First, because of power, area, and routing constraints, it is preferable to limit the number of on-chip thermal sensors on a die, so their placement must be considered carefully to increase the accuracy of full-chip thermal profile reconstruction, especially when only a small number of sensors can be implemented. Second, because low-power, small-sized on-chip thermal sensors have limited reading accuracy, a way to improve that accuracy is desirable.
This work first addresses how to improve, at runtime and at the software level, the reading accuracy of low-power, small-sized on-chip thermal sensors such as Ring-Oscillator (RO) based sensors. Second, it addresses how to allocate a proper number of sensors on a die in order to obtain accurate full-chip thermal information at runtime. Finally, it presents a temperature-aware routing approach for global interconnects that aims to minimize delay and to reduce the probability of chip failure due to electromigration.