Seminars by CECS

PhD Defense: Ensuring Reliability and Fault-Tolerance for the Cyber-Physical System Design

Name: Volkan Gunes

Date: May 27, 2015

Time: 11:00AM – 12:00PM

Location: Donald Bren Hall 3013 Conference Room

Committee: Tony Givargis (Chair), Alexandru Nicolau, Ian Harris, Steffen Peter


Cyber-physical system (CPS) is a term describing a broad range of
complex, multi-disciplinary, physically-aware next-generation engineered
systems that integrate embedded computing technologies (the cyber part)
into the physical world. Sensors play an important role in this
integration because they provide the data extracted from the physical
world that the cyber systems need for decision making. However, this
process is likely to be misled by incorrect data due to sensor faults.

In this dissertation, the main focus is on sensor fault mitigation and
achieving high reliability in CPS operations. One of the challenges we
address is timely event (e.g., motion as a phenomenon) detection in CPS
under possibly faulty sensor conditions. In this regard, our
demonstrative example of CPS is the falling ball example (FBE), which
uses binary event detectors (i.e., motion sensors), a controller, and a
camera for timely motion detection of a falling ball. Another challenge
we address is satisfying thermal comfort and energy efficiency under
certain faulty sensor conditions in a multi-room building incorporating
temperature sensors, controllers, and heating, ventilation, and air
conditioning (HVAC) systems as a CPS application. For both cases, we
adopt a model-based design (MBD) methodology to analyze the effect of
sensor faults on the desired system outcome. We specify well-defined
fault semantics for the event detectors and temperature sensors to make
the problem definition clearer. We provide a MATLAB/Simulink
simulation framework for our CPS examples. Besides having the
traditional CPS model that comprises the cyber, interface (e.g., sensors
and actuators), and physical models, we develop fault models and a system
evaluation model in Simulink and incorporate them into the CPS model.

We explore various techniques for fault mitigation from a holistic design
perspective. The approaches presented in this study therefore
contribute to the design of fault-tolerant CPSs. Furthermore,
considering the compute demands of large-scale CPSs, we introduce the XGRID
embedded many-core system-on-chip architecture. XGRID makes use of a
novel, FPGA-like, programmable interconnect infrastructure, offering
scalability and deterministic communication using hardware-supported
message passing among cores. We provide a conceptual mapping of control
algorithms for the automation of a multi-room building onto target XGRID

Our findings regarding reliable CPS design show that physical system
attributes (e.g., sensor placement and environmental effects) can be a
more dominant factor than cyber system attributes in the system
outcome. In addition, sensor faults may lead to unsatisfactory system
outcomes in CPSs, since CPSs rely heavily on sensor readings for decision
making. Therefore, the analysis of temporal and spatial correlations
between sensor readings helps mitigate certain types of sensor faults
and enables CPSs to utilize sensors’ data more efficiently for decision
making.
PhD Defense: Temperature-Aware Design for SoCs using Thermal Gradient Analysis

Name: Jun Yong Shin

Date: May 18th, 2015

Time: 2:00PM

Location: EH2430, Harut colloquia room

Committee Chair: Nikil Dutt


Over the last few decades, chip performance has increased steadily due to continuous and aggressive technology scaling. However, scaling also leaves chips vulnerable to several issues: high power densities in particular areas of a chip can produce hotspots and thermal gradients, which can permanently damage the chip and reduce the reliability of the entire system that uses it. As a result, a large number of dynamic thermal management solutions have been proposed in recent years for multi-core architectures, and accurate temperature information over the entire chip area has become indispensable, especially for fine-grain dynamic thermal management. Naturally, on-chip thermal sensors have come to play an important role in providing accurate information on the temperature distribution of a chip, but some issues remain regarding their allocation. First, due to power, area, and routing constraints, it is preferable to limit the number of on-chip thermal sensors on a die, and their placement must be considered carefully to increase the accuracy of full-chip thermal profile reconstruction, especially when only a small number of sensors can be implemented. Second, because low-power, small-sized on-chip thermal sensors have limited reading accuracy, a way to improve their reading accuracy is needed.

In this work, we first address how to improve the reading accuracy of low-power, small-sized on-chip thermal sensors, such as Ring-Oscillator (RO) based sensors, at runtime at the software level. Second, we address how to allocate a proper number of sensors on a die in order to obtain accurate full-chip thermal information at runtime. Finally, we present temperature-aware routing for global interconnects that minimizes delay and reduces the probability of chip failure due to electromigration.

PhD Defense: Resilient On-Chip Memory Design in the Nano Era

Final Defense – Abbas Banaiyanmofrad

May 20, 2015
3pm – 5pm
Donald Bren Hall 3011 Conference Room

Nikil Dutt (Chair), Alex Nicolau, Alex Veidenbaum

Aggressive technology scaling in the nano-scale regime makes chips more susceptible to failures. This causes multiple reliability challenges in the design of modern chips, including manufacturing defects, wear-out, and parametric variations. As the number, size, and hierarchy of on-chip memory blocks grow in emerging computing systems, the reliability of the memory sub-system becomes an increasingly challenging design issue. Existing resilient memory design schemes are unable to effectively address the key features of scalability, interconnect-awareness, and cost-effectiveness for these platforms. In this thesis, we propose different approaches to address resilient on-chip memory design in computing systems ranging from traditional single-core processors to emerging many-core platforms. We classify our proposed approaches into five main categories: 1) flexible and low-cost approaches to protect cache memories in single-core processors against permanent faults and transient errors, 2) scalable fault-tolerant approaches to protect last-level caches with non-uniform cache access in chip multiprocessors, 3) interconnect-aware cache protection schemes in network-on-chip architectures, 4) application-aware memory resiliency for the approximate computing era, and 5) system-level design space exploration, analysis, and optimization for redundancy-aware on-chip memory resiliency in many-core platforms.

In summary, the premise of this thesis is to provide multiple solutions at different layers of the system hierarchy, targeting a variety of architectures from embedded single-core microprocessors to emerging large many-core platforms, to address cost-efficient error resiliency of on-chip memory components.

PhD defense: Self-stabilizing Java: Tool Support for Building Robust Software

Name: Yong hun Eom
Date: April 15, 2015
Time: 10:00am
Location: EH 2430
Committee Chair: Professor Demsky
Committee members: Prof. Pai Chou and Prof. Rainer Doemer

Title: Self-stabilizing Java: Tool Support for Building Robust Software

Developing robust software systems remains an open research problem. The
current approaches for improving software reliability mainly focus on
minimizing the number of software bugs through formal verification or
extensive testing. Despite such efforts, it is common that unexpected
software bugs corrupt a program’s state and cause systems to fail.

The motivation for this research is to embrace the fact that it is
difficult to guarantee that software is error-free. We present
Self-stabilizing Java (SJava), which instead checks that a program
self-stabilizes. Self-stabilizing programs automatically recover to the
correct state from a corrupted state caused by software bugs or other
sources. A number of applications are inherently self-stabilizing; such
programs typically overwrite all non-constant data with new input data.
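
As a rough illustration of an inherently self-stabilizing program, the
Python sketch below (not SJava itself; the `MovingAverage` class and
the `run` helper are hypothetical names) keeps only the last few inputs
as state. Because every new input overwrites the oldest entry, a
corrupted state is flushed out after `window` further inputs.

```python
from collections import deque

class MovingAverage:
    """All mutable state is a fixed-size window of the most recent inputs."""

    def __init__(self, window):
        self.samples = deque([0.0] * window, maxlen=window)

    def step(self, value):
        # Appending to a full deque with maxlen discards the oldest sample.
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)

def run(inputs, window, corrupt_at=None):
    """Process inputs; optionally corrupt the whole state before one step."""
    avg = MovingAverage(window)
    outputs = []
    for i, x in enumerate(inputs):
        if i == corrupt_at:
            # Simulate a bug that overwrites the program state with garbage.
            avg.samples = deque([1e9] * window, maxlen=window)
        outputs.append(avg.step(x))
    return outputs

inputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
clean = run(inputs, window=3)
faulty = run(inputs, window=3, corrupt_at=2)
```

The two runs disagree for a few steps after the corruption and then
produce identical outputs again, which is the behavior the abstract
calls self-stabilization.
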

We have developed a type system and static analyses that together check
whether program executions eventually transition from incorrect states
to the correct state. We combine this with a code-generation strategy
that ensures that a program continues executing long enough to
self-stabilize. Furthermore, in order to lower the burden of type
annotations, we present an annotation inference algorithm that
automatically derives an initial set of annotations.

Our experience using SJava indicates that our system successfully
checked that several benchmarks were self-stabilizing and effectively
inferred annotations for our benchmarks.

PhD Defense: Formal Analysis of Electronic System Level Models using Satisfiability Modulo Theories and Automata Checking

Name:  Che-Wei Chang

Date/Time:  Wednesday, January 28, 2015, 9:00am

Location:  EH 3404

Committee Chair: Rainer Doemer

Committee Member: Daniel Gajski

Committee Member: Pai Chou


For a system-level design that may be composed of multiple processing
elements running in parallel, various unwanted consequences may occur if
the system is constructed carelessly. For example, deadlock may occur if
an improper execution order or improper communication between processing
elements is used. Another problem that concurrent execution may cause is
a race condition, as shared variables in the system-level model could be
accessed by multiple concurrent threads in parallel. Such unwanted
behaviors negatively affect the functionality of the system. Furthermore,
functionality is not the only concern in system design; timing
constraints are critical as well. If the system cannot finish its job
within the timing constraints, the design is still considered
unacceptable. To address these issues, we propose two formal analysis
approaches in this dissertation to analyze the three types of properties
discussed above:
1) liveness,
2) satisfiability of timing constraints, and
3) May-Happen-in-Parallel access.
These two approaches are based on Satisfiability Modulo Theories (SMT)
and the UPPAAL automaton model, respectively. We run these two approaches
on our in-house system models, including a JPEG encoder, an MP3 decoder,
and AMBA AHB and CAN bus protocol models. The experimental results show
that our approaches are capable of analyzing these properties as expected
within reasonable analysis time.
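
To make the liveness property concrete, the following Python sketch
detects a deadlock as a cycle in a wait-for relation between processing
elements. It is only an illustration: it uses a plain graph walk rather
than the SMT or UPPAAL based analyses described above, and the
`find_deadlock` function and the element names are assumptions.

```python
def find_deadlock(waits_for):
    """Return the elements forming a wait cycle, or None if there is none.

    waits_for maps each processing element to the single element it is
    currently blocked on, or to None if it is not blocked.
    """
    for start in waits_for:
        seen = []
        node = start
        # Follow the chain of "is blocked on" links until it ends or repeats.
        while node is not None and node not in seen:
            seen.append(node)
            node = waits_for.get(node)
        if node is not None:
            # The chain returned to an element already visited: a cycle.
            return seen[seen.index(node):]
    return None

# Each side waits for the other to communicate first: a deadlock.
blocked = {"producer": "consumer", "consumer": "producer"}
# The consumer is free to run, so the producer is eventually served.
healthy = {"producer": "consumer", "consumer": None}
```

Here `find_deadlock(blocked)` reports the two elements that wait on each
other, while `find_deadlock(healthy)` returns `None`.
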

Self-Interference Cancellation in Full-duplex Wireless Systems

Title:  Self-Interference Cancellation in Full-duplex Wireless Systems

Speaker:  Elsayed Ahmed

Date/Time:  August 28, 2014, 10:00AM

Location:  Engineering Hall 4106

Committee Members:
Ahmed Eltawil (Chair)
Ender Ayanoglu
A. Lee Swindlehurst

Abstract: Due to the tremendous increase in wireless data traffic, one of the major challenges for future wireless systems is utilizing the available spectrum to achieve better data rates over limited spectrum. Currently, systems operate in what is termed “Half Duplex Mode,” where they are either transmitting or receiving, but never both using the same temporal and spectral resources. Full-duplex transmission promises to double the spectral efficiency by carrying out bidirectional communication over the same temporal and spectral resources. The main limitation impacting full-duplex transmission is managing the strong self-interference signal imposed by the transmit antenna on the receive antenna within the same transceiver. Several recent publications have demonstrated that the key challenge in practical full-duplex systems is un-cancelled self-interference power caused by a combination of hardware imperfections, especially Radio Frequency (RF) circuit impairments. In this thesis, we consider the problem of self-interference cancellation in full-duplex systems. The ultimate goal of this work is to design and build a complete, real-time, full-duplex system that is capable of achieving wireless full-duplex transmission using practical hardware platforms. Since RF circuit impairments are shown to have a significant impact on self-interference cancellation performance, we first present a thorough analysis of the effect of RF impairments on the cancellation performance, with the aim of identifying the main performance-limiting factors and bottlenecks. Second, the thesis proposes several impairment mitigation techniques to improve the overall self-interference cancellation capability by mitigating most of the transceiver RF impairments. In addition to impairment mitigation, two novel full-duplex transceiver architectures that achieve significant self-interference cancellation performance are proposed.
The performance of the proposed techniques is analytically and experimentally investigated in practical wireless environments. Finally, the proposed self-interference cancellation techniques are used to build a complete full-duplex system with a 90% experimentally proven full-duplex rate improvement compared to half-duplex systems.
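
The basic idea of digital self-interference cancellation can be sketched
in a few lines of Python. This is a deliberately idealized model and not
one of the thesis's architectures: it assumes a single complex channel
gain between the transmit and receive chains, QPSK symbols, and no RF
impairments (which the abstract identifies as the real limiting factor).
All names and numeric values are illustrative assumptions.

```python
import random

def estimate_channel(tx, rx):
    """Least-squares estimate of one complex gain h in the model rx = h * tx + other."""
    num = sum(r * t.conjugate() for t, r in zip(tx, rx))
    den = sum(abs(t) ** 2 for t in tx)
    return num / den

def cancel(tx, rx, h_est):
    """Subtract the reconstructed self-interference from the received samples."""
    return [r - h_est * t for t, r in zip(tx, rx)]

def power(signal):
    """Average power of a list of complex samples."""
    return sum(abs(s) ** 2 for s in signal) / len(signal)

def qpsk(rng, n):
    """n random QPSK symbols with real and imaginary parts in {-1, +1}."""
    return [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(n)]

rng = random.Random(0)                          # fixed seed for repeatability
tx = qpsk(rng, 1000)                            # our own known transmit signal
remote = [0.01 * s for s in qpsk(rng, 1000)]    # weak signal from the other node
h_true = 0.8 - 0.3j                             # assumed self-interference gain
rx = [h_true * t + d for t, d in zip(tx, remote)]

h_est = estimate_channel(tx, rx)
residual = cancel(tx, rx, h_est)
```

After cancellation, `residual` is dominated by the weak remote signal
rather than by the node's own transmission, which is what allows both
directions to share the same time and frequency resources.
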

Computation Model Based Automatic Design Space Exploration for Heterogeneous Multiprocessor Platforms

By Kyoungwon Kim August 28, 2014

New Circuit Techniques Enabling Millimeter-Wave and Terahertz Transceivers in Nanoscale Silicon

By Zheng Wang June 2, 2014

Performance-Optimized Terahertz Signal Sources in Silicon

By Pei-Yuan Chiang June 2, 2014

New Perspectives on Designing An Effective Management Policy for Multi-level Cache Hierarchy

By Nam Doung February 25, 2014

Cognitive Power Management and Error Resilient Algorithms for Memory Dominated Wireless Communication Systems

By Muhammad Sayed Khairy Abdelghaffar August 23, 2013

System Level Approaches for Low Power Wireless Architectures

By Amr Hussien August 1, 2013

Exploiting Master-Slave Bus Architecture and Storage Devices to Enable High-Performance, Low-Power Logging for Sensor Systems

By Eunbae Yoon June 24, 2013

Out-of-order Parallel Discrete Event Simulation for Electronic System-Level Design

By Weiwei Chen June 6, 2013

Optimizing Program Performance via Similarity, Using Feature-aware and Feature-agnostic Characterization Approaches

By Rosario Cammarota May 31, 2013