Seminars by CECS

Data-Driven Modeling of Cyber-Physical Systems using Side-Channel Analysis

Title: “Data-Driven Modeling of Cyber-Physical Systems using Side-Channel Analysis”

Name: Sujit Rokka Chhetri

Date: May 20, 2019

Time: 3:00 PM

Location: EH 5200

Committee: Prof. Mohammad Al Faruque (Chair), Prof. Pramod Khargonekar, Prof. Fadi Kurdahi


A Cyber-Physical System integrates computational components in the cyber-domain with physical-domain processes. In the cyber-domain, embedded computers and networks monitor and control the physical processes, and in the physical-domain, sensors and actuators aid in interacting with the physical world. This interaction between the cyber and physical domains brings unique modeling challenges, one of which is the integration of discrete models in the cyber-domain with continuous physical-domain processes. However, the same cyber-physical interaction also opens new opportunities for modeling. For example, the information flow in the cyber-domain manifests physically in the form of energy flows in the physical domain. Some of these energy flows unintentionally provide information about the cyber-domain through side-channels.

In this thesis, an extensive analysis of the side-channels (such as acoustic, magnetic, thermal, power and vibration) of cyber-physical systems is performed. Based on this analysis, data-driven models are estimated. These models are then used to perform security vulnerability analysis (for confidentiality and integrity), whereby new attack models are explored. Furthermore, the data-driven models are also utilized to create defensive mechanisms that minimize information leakage from the system and detect attacks on the integrity of the system. Cyber-physical manufacturing systems are taken as use cases to demonstrate the applicability of the modeling approaches.

Finally, side-channel analysis is also performed to aid in modeling digital twins of cyber-physical systems. Specifically, a novel methodology that utilizes low-end sensors to analyze the side-channels and build the digital twins is proposed. These digital twins are used to capture the interaction between the cyber-domain and the physical domain of manufacturing systems, and aid in process quality inference and fault localization. Using side-channels, these digital twins are kept up to date, which is one of the fundamental requirements for building digital twins.

Cyber-Physical Systems Approach to Irrigation Systems

Title: “Cyber-Physical Systems Approach to Irrigation Systems”

Name: Davit Hovhannisyan

Date: March 5, 2019

Time: 4:00PM

Location: EH 3206  (CECS conference room)

Committee:  Prof. Fadi Kurdahi (Chair), Prof. Ahmed Eltawil, Prof. Mohammad Al Faruque


The semiconductor industry has successfully brought silicon technology to a price point at which it is accessible to application domains such as irrigation systems, which wastefully utilize 70% of all fresh water. Moreover, worldwide fresh water resources will soon reach a deficit due to ever-growing demand, while state-of-the-art precision irrigation systems utilize sophisticated water delivery drip lines, yet are controlled only at the source, by the gut feeling of the end user. This work demonstrates that the scientific foundation of cyber-physical systems can be used to design automated, distributed and intelligent precision irrigation systems that improve irrigation efficiency. Thus, this work explores and analyzes in depth the intersection of irrigation practices and cyber-physical systems knowledge to show a path toward a successful adaptation of silicon technology that addresses one of the greatest challenges of the 21st century: fresh water scarcity. To that end, this work presents contributions that complete a novel vision for next-generation precision irrigation systems, which can be grouped into three main thrusts: (1) circuit-inspired models for irrigation system components and scheduling strategies, derived by the analogy method; (2) a CPS-based approach comprising (a) a design methodology capable of comparing irrigation controllers, (b) simulation tools and software for analyzing the distributed behavior of the specialized irrigation controllers, (c) a topology adaptation technique that utilizes multi-graphs to mine the hydro-wireless topology of the IoT controllers, and (d) a distributed controller implementation with novel energy harvesting and low-power support for irrigation controllers and sensors; and (3) overhead vision solutions for health and growth monitoring. The observations, analysis and insight from experimental studies were obtained in collaboration with Rancho California Water District, growers and practitioners.

Control System Design Automation Using Reinforcement Learning

Title: “Control System Design Automation Using Reinforcement Learning”

Name: Hamid Mirzaei

Date: Tuesday, November 20, 2018

Time: 1:00 p.m.

Location: Donald Bren Hall 3011

Committee: Professor Tony Givargis (Chair), Professor Eli Bozorgzadeh, Professor Ian Harris


Conventional control theory has been used in many application domains with great success in the past decades. However, novel solutions are required to cope with the challenges arising from the complex interaction of fast-growing cyber and physical systems. Specifically, integration of classical control methods with Cyber-Physical System (CPS) design tools is a non-trivial task, since those methods have been developed to be used by human experts and are not intended to be part of an automatic design tool.

On the other hand, the control problems in emerging Cyber-Physical Systems, such as intelligent transportation and autonomous driving, cannot be addressed by conventional control methods due to the high level of uncertainty, complex dynamic model requirements and operational and safety constraints.
In this dissertation, a holistic CPS design approach is proposed in which the control algorithm is incorporated as a building block in the design tool. The proposed approach facilitates the inclusion of physical variability into the design process and reduces the parameter space to be explored. This has been done by adding constraints imposed by the control algorithm.
Furthermore, Reinforcement Learning (RL) as a replacement for conventional control solutions is studied in the emerging domain of intelligent transportation systems. Specifically, dynamic tolling assignments and autonomous intersection management are tackled by state-of-the-art RL methods, namely Trust Region Policy Optimization and Finite-Difference Gradient Descent. Additionally, Q-learning is used to improve the performance of an embedded controller using a novel formulation in which cyber-system actions, such as changing the control sampling time, are combined with the physical action set of the RL agent. Using the proposed approach, it is shown that the power consumption and computational overhead of the embedded control can be reduced.
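
The idea of combining cyber-system actions with the physical action set can be illustrated with a minimal tabular Q-learning sketch. This is not the dissertation's implementation: the action values, the sampling periods, and the names `PHYSICAL_ACTIONS`, `SAMPLING_PERIODS_MS`, `choose_action` and `q_update` are all hypothetical, chosen only to show how a joint action set could be formed and used in a standard Q-learning update.

```python
import itertools
import random
from collections import defaultdict

# Hypothetical action sets, chosen only for illustration.
PHYSICAL_ACTIONS = (-1.0, 0.0, 1.0)   # e.g. discrete control input levels
SAMPLING_PERIODS_MS = (5, 10, 20)     # cyber-side action: control sampling time

# Joint action set: every (physical action, sampling period) pair.
JOINT_ACTIONS = tuple(itertools.product(PHYSICAL_ACTIONS, SAMPLING_PERIODS_MS))


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection over the joint cyber-physical action set."""
    if rng.random() < epsilon:
        return rng.choice(JOINT_ACTIONS)
    return max(JOINT_ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q-value."""
    best_next = max(q[(next_state, a)] for a in JOINT_ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Q-table with a default value of 0.0 for unseen (state, action) pairs.
q_table = defaultdict(float)
```

In such a formulation the reward would have to weigh control quality against the cost of sampling more often (for example power or CPU time); how the dissertation defines that reward is not stated in the abstract.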
Finally, to address the current lack of available physical benchmarks, an open physical environment benchmarking framework is introduced. In the proposed framework, various components of a physical environment are captured in a unified repository to enable researchers to define and share standard benchmarks that can be used to evaluate different reinforcement learning algorithms. They can also share the realized environments via the cloud to enable other groups to perform experiments on the actual physical environments instead of the currently available simulation-based environments.


Modeling and Co-Design of Multi-domain Cyber-Physical Systems

Name: Jiang Wan

Date: Friday, June 1

Time: 3:00 — 5:00 PM

Location: EH 3206

Committee: Fadi Kurdahi (Chair), Mohammad Al Faruque, Rainer Doemer


Cyber-Physical Systems (CPS) are integrations of computation and physical processes connected through networks. The high complexity of cross-domain engineering, in combination with the pressure for system innovation, higher quality, time-to-market, and budget constraints, makes it imperative for engineers to use integrated engineering methods and tools for CPS design. However, existing computer-based engineering tools are mainly focused on a particular domain, and it is therefore challenging to perform system-level analysis for CPS due to the difficulty of integrating knowledge from different domains. This thesis studies the problem of modeling cross-domain CPS. Problems in the modeling of both functional and non-functional requirements during CPS design are explored, and a functional model-based approach is proposed for high-level CPS modeling. Moreover, targeting security, which is one of the key non-functional requirements in CPS, physics-based models and solutions are proposed in this thesis.


PhD Defense: Reliable and Energy Efficient Battery-Powered Cyber-Physical Systems

Name: Korosh Vatanparvar

Date: Wednesday, May 30th

Time: 3:00 — 5:00 PM

Location: Calit2 3008

Committee: Prof. Mohammad Al Faruque (Chair)


Cyber-Physical Systems (CPS) were presented as a solution to multidisciplinary integration and control in embedded systems. They provide seamless interactions between cyber and physical domains, enabling more intelligent and complicated control applications. However, CPS face the challenges of reliability and energy efficiency since they mainly rely on batteries for power supply.

We investigate these issues with Electric Vehicles (EV), which are common battery-powered CPS. EV were introduced as a means of transportation to address environmental problems like air and noise pollution. However, their stringent design constraints, especially on battery packs, create challenges of limited driving range and battery lifetime for daily drivers and manufacturers. The design automation community has been addressing these by developing more efficient and dependable devices and control methodologies. Our contributions in this thesis include:

1) novel machine learning and physics-based modeling techniques to capture CPS dynamics more accurately; 2) unique optimization problem formulations to make optimal control decisions; and 3) intelligent control methodologies that leverage the modeling and interaction within CPS to achieve reliable and efficient operation. These contributions are applied to systems in EV such as the navigation system, climate control, and battery management system. Our objectives are to further extend the EV driving range and prolong the battery lifetime while maintaining a similar driving experience and comfort for passengers.


Reflective On-Chip Resource Management Policies for Energy-Efficient Heterogeneous Multiprocessors

Name: Tiago Mück

Date: May 16, 2018

Time: 2:00pm

Location: Donald Bren Hall 2011

Committee: Nikil Dutt (Chair), Alex Nicolau, Tony Givargis


Effective exploitation of power-performance tradeoffs in heterogeneous many-core platforms (HMPs) requires intelligent on-chip resource management at different layers, in particular at the operating system level. Operating systems need to continuously analyze the application behavior and find proper answers to questions such as: What is the most power-efficient core type to execute the application without violating its performance requirements? or Which option is more power-efficient for the current application: an out-of-order core at a lower frequency or an in-order core at a higher frequency?
Unfortunately, existing operating systems (e.g. Linux) do not offer mechanisms to properly address these questions and therefore are unable to fully exploit architectural heterogeneity for scalable energy-efficient execution of dynamic workloads.
This dissertation proposes a holistic approach for performing resource allocation decisions and power management by leveraging concepts from reflective software.
The general idea of reflection is to change your actions based on both external feedback and introspection (i.e., self-assessment).
From a practical computer system perspective, reflection means performing resource management actions considering both sensing information (e.g., readings from performance counters, power sensors, etc.) to assess the current system state, as well as models to predict the behavior of the system before performing an action.
In this context, this dissertation describes MARS, a Middleware for Adaptive Reflective computer Systems. MARS consists of a framework and a set of models for creating reflective resource managers. MARS is implemented and evaluated on top of a real Linux-based platform. Furthermore, MARS also provides an offline simulation infrastructure for fast prototyping of policies and large-scale or long-term policy evaluation.
Experimental evaluation shows that MARS’s models allow different policies for task mapping and dynamic voltage scaling to be seamlessly integrated, resulting in up to 1.8x energy efficiency improvements without performance degradation when compared to vanilla Linux.

PhD Defense: SIMD Assisted Fault Detection and Fault Attack Mitigation

Name: Zhi Chen

Date: May 15, 2018

Time: 2:30pm

Location: Donald Bren Hall 3011

Committee: Professor Alex Nicolau (Chair), Alex Veidenbaum, Nikil Dutt


Modern processors continue to aggressively scale down the feature size and reduce voltage levels to run faster and be more energy efficient. However, this trend also poses significant reliability concerns, as it makes transistors more susceptible to soft errors. Soft errors are transient. Although they don’t impair computing systems permanently, these errors can corrupt the output of a program or even crash the entire system. Hardware or software redundancy techniques can be used to detect errors during the execution of a program. However, hardware redundancy, e.g. DMR (dual-modular redundancy) and TMR (triple-modular redundancy), leads to significant area overhead and very high energy cost. Software redundancy, e.g. instruction duplication, has lower performance and energy penalties and virtually no hardware cost, at the price of a small degree of error coverage. Yet commodity processors generally don’t require “five-nines” reliability as they are not mission-critical. Instead, performance and energy consumption have higher priority. This dissertation proposes a novel approach to instruction duplication, which exploits the redundancy within SIMD instructions. The key idea is to pack the original data and its duplicate in different lanes of the same vector register instead of executing two scalar instructions separately, as these registers are underutilized in most applications. The proposed solution is implemented in the LLVM compiler as a stand-alone pass. Evaluation on a host of benchmarks reveals that the proposed SIMD-based error detection technique incurs much lower performance, code size, and energy overheads.
This dissertation further extends the proposed approach as a countermeasure to protect cryptographic algorithms. These algorithms are widely adopted in modern processors and embedded systems to protect information. A number of popular cryptographic algorithms in the Libgcrypt library are protected using the SIMD-based instruction duplication technique. A large number of faults are injected into these algorithms. The results show that almost all injected faults can be detected with reasonable performance and code size cost.
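
The lane-packing idea in this abstract can be illustrated with a small simulation. The sketch below is plain Python, not an LLVM pass and not real SIMD code: a two-element tuple stands in for a vector register, and the helper names `pack`, `packed_op`, `check` and `flip_bit` are hypothetical. It only shows the principle that one "vector" operation computes the original and the duplicate together, and that a mismatch between lanes signals an error.

```python
import operator


def pack(value):
    """Place a value and its duplicate in two lanes of a simulated vector."""
    return (value, value)


def packed_op(op, a, b):
    """Apply op lane-wise, so original and duplicate are computed by one 'instruction'."""
    return tuple(op(x, y) for x, y in zip(a, b))


def check(vec):
    """Compare the lanes; any disagreement indicates a (simulated) soft error."""
    if vec[0] != vec[1]:
        raise RuntimeError("soft error detected: lanes disagree")
    return vec[0]


def flip_bit(vec, lane, bit):
    """Fault-injection helper: flip a single bit in one lane."""
    lanes = list(vec)
    lanes[lane] ^= 1 << bit
    return tuple(lanes)


# Fault-free run: both lanes agree and the result is returned.
product = packed_op(operator.mul, pack(6), pack(7))
result = check(product)

# Injected fault: one flipped bit in the duplicate lane is caught by check().
faulty = flip_bit(product, lane=1, bit=3)
```

A fault that corrupts both lanes identically would go unnoticed in this sketch, which is one way to picture the small loss of error coverage that the abstract attributes to software redundancy.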


“Efficient Acceleration of Computation Using Associative In-memory Processing”

Title: “Efficient Acceleration of Computation Using Associative In-memory Processing”

Speaker: Hasan Erdem Yantir, University of California, Irvine

Date and Time: Monday, May 14, 2018 at 9:00AM

Location: Engineering Hall 3206

Committee: Professor Fadi Kurdahi (Chair), Ahmed Eltawil, Rainer Doemer


The complexity of the computational problems is rising faster than the computational platforms’ capabilities. This forces researchers to find alternative paradigms and methods for efficient computing. One promising paradigm is accelerating compute-intensive kernels using in-memory computing accelerators since memory is the major bottleneck that limits the amount of parallelism and performance of a system and dominates energy consumption in computation. Leveraging the memory intensive nature of big data applications, an in-memory-based computation system can be presented where logic can be replaced by memory structures, virtually eliminating the need for memory load/store operations during computation. The massive parallelism enabled by such a paradigm results in highly scalable structures.

The present thesis is set against this background. The objective is to conduct broad research on in-memory computing. For this purpose, associative computing architectures (i.e., Associative Processors, or APs) are built using both traditional (SRAM) and emerging (ReRAM) memory technologies, together with their corresponding software frameworks. For ReRAM-based APs, the reliability concerns that come with emerging memories are resolved. Architectural innovations are developed to increase energy efficiency. Furthermore, an approximate computing approach is introduced for APs to perform efficient, low-power approximate in-memory computing for tasks that can tolerate some accuracy loss. The work also proposes a novel two-dimensional in-memory computing architecture to cope with the existing deficiencies of traditional one-dimensional AP architectures.

“A Compiler Infrastructure for Static and Hybrid Analysis of Discrete Event System Models”

Title: “A Compiler Infrastructure for Static and Hybrid Analysis of Discrete Event System Models”

Speaker: Tim Schmidt, University of California, Irvine

Date and Time: Friday, April 20, 2018 at 10:00AM-11:00AM

Location: Engineering Hall 3206

Committee: Professor Rainer Doemer (Chair), Fadi Kurdahi, Kwei-Jay Lin


The design of embedded systems has been a well-established research domain for many decades. However, the constantly increasing complexity and requirements of state-of-the-art embedded systems push designers toward new challenges while they maintain established design methodologies. Embedded system design uses the concept of Discrete Event Simulation (DES) to prototype and test the interaction of individual components.

In this dissertation, we provide the Recoding Infrastructure for SystemC (RISC) compiler framework to perform static and hybrid analysis of IEEE SystemC models. On one hand, RISC generates thread communication charts to visualize the communication between individual design components. The visualization respects the underlying discrete event simulation semantics and illustrates the individual synchronization steps. On the other hand, RISC translates a sequential model into a parallel model which effectively utilizes multi- and many-core host simulation platforms for faster simulation. This work extends the conflict analysis capabilities for libraries, dynamic memory allocation, channel instance awareness, and references. Additionally, the traditional thread level parallelism is extended with data level parallelism for even faster simulation.

“Software Assists to On-chip Memory Hierarchy of Manycore Embedded Systems”

Title: “Software Assists to On-chip Memory Hierarchy of Manycore Embedded Systems”

Speaker: Majid Namaki Shoushtari, University of California, Irvine

Date and Time: Tuesday, November 7, 2017 at 11:00AM-12:00PM

Location: Donald Bren Hall 2011

The growing computing demands of emerging application domains such as Recognition/Mining/Synthesis (RMS), visual computing, wearable devices and the Internet of Things (IoT) have driven the move towards manycore architectures to better manage tradeoffs among performance, energy efficiency, and reliability.
The memory hierarchy of manycore architectures has a major impact on their overall performance, energy efficiency and reliability. We identify three major problems that make traditional memory hierarchies unattractive for manycore architectures and their data-intensive workloads: (1) they are power hungry and not a good fit for manycores in the face of dark silicon, (2) they are not adaptable to varying workload requirements and memory behavior, and (3) they are not scalable due to coherence overheads.

This thesis argues that many of these inefficiencies are the result of software-agnostic hardware-managed memory hierarchies. Application semantics and behavior captured in software can be exploited to more efficiently manage the memory hierarchy. This thesis exploits some of this information and proposes a number of techniques to mitigate the aforementioned inefficiencies in two broad contexts: (1) explicit management of hybrid cache-SPM memory hierarchy, and (2) exploiting approximate computing to improve the energy efficiency of the memory hierarchy.
We first present the required hardware and software support for a software-assisted memory hierarchy that is composed of distributed memories which can be partitioned between caches and SPMs at runtime. We discuss our SPM APIs, the protocol needed for data movements, and our approach for explicit management of shared data.
Next, we augment caches and SPMs in this hierarchy with approximation support in order to improve the energy efficiency of the memory subsystem when running approximate programs.
Finally, we discuss a quality-configurable memory approximation strategy using formal control theory that adjusts the level of approximation at runtime depending on the desired quality for the program’s output.


Quoc-Viet Dang PhD Defense

Name: Quoc-Viet Dang

Date: May 30, 2017

Time: 02:00 P.M

Location: Engineering Hall 3206

Committee: Professor Daniel Gajski (Chair), Fadi Kurdahi, Rainer Doemer


Public education is on the brink of a potential crisis as it attempts to significantly increase student enrollment while maintaining the quality of education. Online courses have been proposed and debated among members of the UC Regents, numerous college administrators, faculty, and students. On the one hand, online education can reduce overhead while enrolling more students. On the other hand, directly translating classroom lectures and materials to an online environment does not necessarily produce student performance and course satisfaction equivalent to an in-class environment. Since there is no universal standard for online education, erratic and inconsistent results have been achieved in terms of student performance and costs to students as well as administration. A hybrid, scalable teaching and learning methodology is required by both educators and students to take full advantage of today’s technology and apply it toward improving student performance and participation.

This dissertation presents a methodology and system to provide a more individualized and responsive learning environment for students in large hybrid and online university courses while keeping overall costs and time commitment down and improving overall student performance. The Universal Personal Advisor, the learning design tool implemented in this research, is developed based on multi-disciplinary metrics and studies from the fields of Psychology, Education, and Engineering. A primary limiting resource for both students and instructors is time. By automating some basic key interactions that may occur between students and instructors, hours of each individual’s time can be saved, maximizing the quality of the in-person interactions available during a course while allowing for a more scalable classroom environment.

PhD Defense: Robust Data Hiding in Multimedia for Authentication and Ownership Protection

Name: Farhan A. Alenizi

Date: May 26, 2017

Time: 10:00 AM

Location: Engineering Hall 3206

Committee: Professor Fadi Kurdahi (Chair), Professor Ahmed Eltawil and Professor Rainer Doemer


Establishing robust and blind data hiding techniques in multimedia is very important for authentication, ownership protection and security. The multimedia being used may include images, videos and 3D mesh objects. A hybrid pyramid Discrete-Wavelet-Transform (DWT) Singular-Value-Decomposition (SVD) data hiding scheme for video authentication and ownership protection is proposed. The hidden data takes the form of a main color logo image watermark and a secondary Black and White (B&W) logo image. The color watermark is decomposed into Bit-Slices. A pyramid transform is performed on the Y-frames of a video stream, resulting in error images; then, a DWT process is implemented using orthonormal filter banks on these error images, and the Bit-Slice watermarks are inserted in one or more of the resulting subbands in a way that is fully controlled by the owner; then, the watermarked video is reconstructed. SVD is performed on the color watermark Bit-Slices. A secondary B&W watermark is inserted in the main color watermark using another SVD process. The reconstruction was perfect without attacks, while the average Bit-Error-Rates (BERs) achieved under attacks are in the range of 2% for the color watermark and 5% for the secondary watermark; meanwhile, the mean Peak Signal-to-Noise Ratio (PSNR) is 57 dB. Furthermore, a selective denoising filter to eliminate the noise in video frames is proposed, and its performance with data hiding is evaluated.
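
For readers unfamiliar with the two metrics quoted above, the following sketch shows their standard textbook definitions. It is not taken from the thesis: the function names `bit_error_rate` and `psnr_db` are hypothetical, and the peak value of 255 assumes 8-bit samples, which the abstract does not state.

```python
import math


def bit_error_rate(sent_bits, received_bits):
    """Fraction of bit positions where the extracted watermark differs from the original."""
    if not sent_bits or len(sent_bits) != len(received_bits):
        raise ValueError("bit sequences must be non-empty and of equal length")
    errors = sum(s != r for s, r in zip(sent_bits, received_bits))
    return errors / len(sent_bits)


def psnr_db(original, distorted, peak=255):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(peak^2 / MSE).

    `peak=255` assumes 8-bit samples (an assumption, not stated in the abstract).
    """
    if not original or len(original) != len(distorted):
        raise ValueError("sample sequences must be non-empty and of equal length")
    mse = sum((o - d) ** 2 for o, d in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical signals: no distortion
    return 10 * math.log10(peak ** 2 / mse)
```

A lower BER means the watermark survives attacks better, while a higher PSNR means the watermarked frames are closer to the originals.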

Moreover, a blind, optimized 3D mesh watermarking technique is proposed in this research. The technique relies on displacing the vertex locations according to modifications of the variances of the vertex norms. Statistical analyses were performed to establish the distributions that best fit each mesh, and hence to establish the bin sizes. Experimental results showed that the approach is robust in terms of both perceptual and quantitative quality.

In conclusion, the degree of robustness and security of the proposed techniques is shown. Schemes that can be adopted to further enhance the performance, and future work that can be done in the field, are also introduced.


PhD Defense: Millimeter-wave and Sub-THz Signal Generation and Detection in Silicon Technologies

Name: Peyman Nazari

Date: May 15, 2017

Time: 10:00 AM

Committee: Prof. Payam Heydari (Chair), Prof. Michael Green, Prof. Ahmed Eltawil


MM-wave/sub-Terahertz (THz) signal generation, radiation, and detection have become increasingly attractive due to their fast-growing applications in spectroscopy, radar, biomedical and security imaging as well as high-speed wireless communication.
Silicon technology, on the one hand, offers high-density signal processing capabilities due to aggressive scaling of its feature size and, on the other hand, allows integration of mm-wave/THz antenna elements owing to their shrunken footprint at these bands; it is therefore well-suited for the implementation of fully integrated multi-antenna mm-wave/THz wireless Systems-on-Chip (SoCs).
The performance of such systems is dominantly governed by the quality and efficiency of signal generation, transmission, and detection. Passive and active components as means of realizing these functionalities must be optimized for operation at these frequency ranges. However, the excessive loss of on-chip passive components and the limited gain and output power of transistors at such high frequencies demand novel passive and active structures. Furthermore, a high level of integration implies that the co-design of front-end components leads to better end-to-end performance; thus a holistic design methodology must be employed. Radiation characteristics of the wireless signal must also be engineered to improve its transmission quality. For example, circularly polarized radiation is found to be a viable choice for many imaging and communication applications, as it exhibits excellent robustness against de-polarization effects.
In this dissertation, silicon realization of on-chip waveguides, as low-loss media for high-frequency wave propagation, is explored, and implementations of low-loss cavity-backed passives are discussed. Furthermore, a silicon-integrated IMPATT diode, together with its fabrication and modeling, is introduced as a solution for obtaining active behavior beyond fmax of transistors. Next, a high-power/efficiency mm-wave circularly-polarized cavity-backed radiator, employing a multi-port multi-function passive network as a resonator, power combiner, and antenna, is introduced. Necessary conditions for robust operation of such multi-port oscillators/radiators are also derived. Fabricated in a 0.13µm SiGe BiCMOS process, the prototype chip achieves 14.2dBm EIRP, -99.3dBc/Hz phase noise at 1MHz offset, and 5.2% DC-to-EIRP conversion efficiency, which is the highest reported value among silicon-based radiators not using a silicon lens or substrate processing.
Finally, a 210GHz low noise amplifier (LNA) is presented to address the detection challenges. This LNA achieves 18dB of gain, with a noise figure of less than 12dB and a 3dB bandwidth of more than 15GHz, thereby showing the best performance metrics compared with prior work. This is achieved by incorporating circuit and EM techniques that enable simultaneous optimization of stable gain, noise, and bandwidth performance parameters at this frequency range.

PhD Defense: Optimizing Many-Threads-to-Many-Cores Mapping in Parallel Electronic System Level Simulation

Name: Guantao Liu

Date: March 2, 2017

Time: 4:00 PM

Location: Engineering Hall 3206

Committee: Rainer Doemer (Chair), Kwei-Jay Lin, Mohammad Al Faruque


In hardware/software co-design, Discrete Event Simulation (DES) has been in use for decades to verify and validate the functionality of Electronic System Level (ESL) models. Since the parallel computing platforms are readily available today, many Parallel Discrete Event Simulation (PDES) approaches are proposed to improve the simulation performance. However, as the thread parallelism increases in ESL designs and core count multiplies on multi-core and many-core platforms, thread-to-core mapping becomes critical in PDES.

In this dissertation, we propose a computation- and communication-aware approach to optimize thread mapping for parallel ESL simulation, with the aims of load balancing and communication minimization. As we identify that the order of dispatching parallel threads has a significant influence on the total simulation time, and Longest Job First (LJF) shows better performance than the Linux default thread dispatch policy, we first propose a segment-aware LJF scheduler for PDES. Our segment-aware scheduler can accurately predict the run time of the thread segments ahead, and thus make better dispatching decisions. Next, we define the concept of core distance for multi-core and many-core architectures, which quantifies core-to-core communication latency and characterizes processor hierarchies. For many-core architectures using directory-based cache coherence protocols, we observe that core-to-core transfers are not always significantly faster than main memory accesses, and the core-to-core communication latency depends not only on the physical placement on the chip, but also on the location of the distributed cache tag directory. Thus, using a memory ping-pong benchmark, we quantify the core distance on a ring-network many-core platform and propose an algorithm to optimize thread-to-core mapping in order to minimize on-chip communication overhead. Altogether, based on a static analysis of communication patterns and core distance and a dynamic profiling of computation load, our proposed framework utilizes a heuristic graph partitioning algorithm and automatically generates an optimized thread mapping, which minimizes inter-chip communication overhead. In our systematic evaluation, our approach consistently shows a significant performance gain on top of the order-of-magnitude speedup of PDES.
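
The Longest Job First dispatch order mentioned above can be sketched generically as follows. This is a textbook-style greedy LJF assignment, not the dissertation's segment-aware scheduler: the run-time predictions are simply passed in as a dictionary, and the function name `ljf_dispatch` and the segment names in the usage lines are hypothetical.

```python
import heapq


def ljf_dispatch(predicted_runtimes, num_cores):
    """Dispatch segments longest-first, each to the core that becomes free earliest.

    predicted_runtimes: dict mapping segment name -> predicted run time.
    Returns (assignment dict of segment -> core id, resulting makespan).
    """
    # Min-heap of (time at which the core becomes free, core id).
    cores = [(0.0, core) for core in range(num_cores)]
    heapq.heapify(cores)

    assignment = {}
    # Longest predicted run time first.
    order = sorted(predicted_runtimes, key=predicted_runtimes.get, reverse=True)
    for segment in order:
        free_at, core = heapq.heappop(cores)
        assignment[segment] = core
        heapq.heappush(cores, (free_at + predicted_runtimes[segment], core))

    makespan = max(free_at for free_at, _ in cores)
    return assignment, makespan


# Hypothetical example: three thread segments on two cores.
assignment, makespan = ljf_dispatch({"t1": 5, "t2": 3, "t3": 2}, num_cores=2)
```

Dispatching the longest segment first lets the shorter ones fill the remaining gaps, which is the intuition behind preferring LJF for load balancing; the sketch ignores communication cost and core distance, which the dissertation handles separately.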

The contributions of this dissertation include a segment-aware multi-core scheduler, core distance profiling, a communication-aware thread mapping framework, together with an open-source software package for Out-of-Order PDES.

PhD Defense: Low Power Reliable Design using Pulsed Latch Circuits

Name: Wael Mahmoud Elsharkasy

Date: February 15, 2017

Time: 11:00 AM

Location: Engineering Hall 3206

Committee: Prof. Fadi J. Kurdahi, Prof. Ahmed Eltawil, Rainer Doemer


System-on-Chip (SoC) design has faced many challenges over the past decade. With today’s applications centered around the Internet-of-Everything (IoE), these challenges are expected to become more critical. Among these challenges are reducing power consumption for better energy efficiency, overcoming different sources of variation to ensure reliable operation, and reducing design area to lower cost and increase integration. As a result, chip designers find themselves facing many problems as they try to build reliable systems that integrate complex functionality on a minimum die size and with a limited power budget. Among the different circuit components in every chip, memory components are of great concern. They consume the majority of the chip area and power, in addition to affecting the entire chip’s performance and reliability. These include large memory arrays, caches, register files and different sequential elements in the logic paths. Sequential elements play an important and critical role in modern synchronous CMOS circuits. Indeed, they can represent up to 50% of the standard cells used in a chip. In addition, the power consumption of the clock tree, including these elements, can be more than half of the total chip power. Moreover, they rank second after memory in how strongly they are affected by different sources of variation. Hence, efficient implementation of these elements is of great importance for the design of energy-efficient and reliable integrated circuits. Pulsed latches have been proposed as an efficient replacement for flip-flops in the implementation of sequential elements. They can achieve higher performance than traditional flip-flops, and can be designed to be smaller in area and more power efficient. However, the operation of pulsed latches is more sensitive to process, voltage and temperature (PVT) variations.
In this thesis, we propose a methodology to study the reliability of pulsed latches and use it to evaluate the effect of PVT variations on their behavior. In addition, novel approaches to enhance the reliability of pulsed latches without significant degradation in performance, area or power are presented. Also, since sequential elements can be used to build small register files, pulsed-latch implementations of register files are discussed and compared to other traditional implementations, including SRAM and flip-flops. In addition, since multiport register files are beneficial for quite a few applications, novel implementations of multiport register files are also presented. The proposed implementations are shown to greatly reduce the significant area, power and latency overhead associated with the traditional way of designing multiport register files.