Seminars by CECS

PhD Defense: Novel Monitoring, Detection, and Optimization Techniques for Neurological Disorders

Name: Seyede Mahya Safavi

Chair: Prof. Pai Chou

Date: February 20th, 2020

Time: 10:00 AM

Location:  EH 3404

Committee: Prof. Pai Chou (Chair), Prof. Beth Lopour, Prof. Phillip Sheu

Title: “Novel Monitoring, Detection, and Optimization Techniques for Neurological Disorders”


The advancement of chronically implanted neural recording devices has led to the advent of assistive devices for rehabilitation and for restoring lost sensorimotor function in patients suffering from paralysis. Electrocorticogram (ECoG) signals can record high-gamma sub-band activity known to be related to hand movements. In the first part of this work, we propose a finger movement detection technique based on ECoG source localization. Finger flexion and extension originate in slightly different areas of the motor cortex, so the origin of the brain activity is used as the distinctive feature for decoding the finger movement.

The real-time implementation of brain source localization is challenging due to the extensive iterations required by existing solutions. In the second part of this work, we propose two techniques to reduce the computational complexity of the Multiple Signal Classification (MUSIC) algorithm. First, the cortex surface is partitioned into several regions, and a novel nominating procedure picks a small number of regions to be searched for brain activity. In the second step, an electrode selection technique based on the Cramér-Rao bound of the localization errors is proposed to select the best set of an arbitrary number of electrodes. The proposed techniques lead to a 90% reduction in computational complexity while maintaining good concordance, in terms of localization error, with the regular MUSIC algorithm.
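The two-step screening idea can be conveyed with a toy one-dimensional search, where a cheap score at each region's centroid nominates candidate regions before the expensive scan runs. The cost function and all numbers below are illustrative stand-ins, not the actual MUSIC pseudo-spectrum:

```python
import numpy as np

# Toy coarse-to-fine search: 1000 candidate source locations in 20 regions.
# `music_cost` stands in for the expensive MUSIC pseudo-spectrum evaluation;
# it peaks near the (assumed) true source location 0.637.
points = np.linspace(0.0, 1.0, 1000)
regions = np.array_split(np.arange(points.size), 20)

def music_cost(loc):
    return 1.0 / (1e-3 + (loc - 0.637) ** 2)

# Step 1: nominate regions using a cheap score at each region's centroid.
scores = [music_cost(points[idx].mean()) for idx in regions]
nominated = np.argsort(scores)[-2:]          # keep the 2 best regions

# Step 2: run the fine (expensive) scan only inside the nominated regions.
fine_idx = np.concatenate([regions[r] for r in nominated])
estimate = points[fine_idx[np.argmax(music_cost(points[fine_idx]))]]

# Only 100 of the 1000 candidates needed the expensive evaluation (90% fewer).
```

With 2 of 20 regions nominated, the expensive evaluation runs on 10% of the candidates, mirroring the 90% complexity reduction reported above.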

Epilepsy is a neurological disorder with multiple comorbid conditions, including cardiovascular and respiratory disorders. The cardiovascular imbalance is of particular importance since the mechanisms of Sudden Unexpected Death in Epilepsy (SUDEP) are still unknown. Ictal tachycardia is the best-known cardiac imbalance during seizures. In the third part of this dissertation, we used an optical sensor, the photoplethysmogram (PPG), to investigate variations in ictal blood flow in the limbs. Six features related to hemodynamics were derived from PPG pulse morphology, and a consistent pattern of ictal change was observed across all subjects and seizures. These variations suggest an increase in vascular resistance due to increased sympathetic tone. Timing analysis revealed that some PPG feature variations can precede ictal tachycardia by 50 seconds. These features were used to train a neural network based on the Long Short-Term Memory (LSTM) architecture for automatic seizure detection, reducing the false alarm rate by 50% compared to other detectors based on heart rate variability.

PhD Defense: Efficient Offline and Online Training of Memristive Neuromorphic Hardware

Name: Mohammed Fouda

Chair: Prof. Ahmed Eltawil

Date: February 6th, 2020

Time: 9:30 AM

Location:  EH 3206

Committee: Ahmed Eltawil (Chair), Prof. Fadi Kurdahi, Prof. Nikil Dutt, Prof. Emre Neftci

Title: “Efficient Offline and Online Training of Memristive Neuromorphic Hardware”


Brain-inspired neuromorphic systems have witnessed rapid development over the last decade from both algorithmic and hardware perspectives. Neuromorphic hardware promises to be more energy- and speed-efficient than traditional von Neumann architectures. Thanks to recent progress in solid-state devices, several nanoscale nonvolatile memory devices, such as RRAMs (memristors), STT-RAM, and PCM, support computations that mimic the biological synaptic response. The most important advantage of these devices is that they can be sandwiched between interconnect wires, creating crossbar array structures that are inherently able to perform matrix-vector multiplication (MVM) in one step. Despite their great potential, RRAMs suffer from numerous nonidealities that limit performance, including high variability, asymmetric and nonlinear weight updates, limited endurance, retention loss, and stuck-at-fault (SAF) defects, in addition to the interconnect wire resistance that creates sneak paths. This thesis focuses on the application of RRAMs to neuromorphic computation while accounting for the impact of device nonidealities on neuromorphic hardware.
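The crossbar's one-step matrix-vector multiply follows directly from Ohm's and Kirchhoff's laws; a minimal numerical sketch (the conductance and voltage values are made up):

```python
import numpy as np

# Row voltages drive a crossbar whose device conductances store the weights.
# Each column wire sums its devices' currents, so the vector of column
# currents is I = V @ G in a single analog step (Ohm's law for each device,
# Kirchhoff's current law at each column).
G = np.array([[1.0, 0.5],        # conductances (arbitrary units) = weights
              [0.2, 0.8],
              [0.3, 0.1]])
V = np.array([0.1, 0.2, 0.05])   # input voltages on the three rows

I = V @ G                        # what the analog array computes in one step
```

Every multiply-accumulate happens in parallel in the analog domain, which is why the crossbar performs the whole MVM in one step rather than one element at a time.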

In this thesis, we propose software-level solutions to mitigate the impact of nonidealities, which strongly affect offline (ex-situ) training, without resorting to expensive SPICE or numerical simulations. We propose two techniques that incorporate the effect of the sneak-path problem, in addition to device variability, during training with negligible overhead. The first technique, which we refer to as the mask technique, is inspired by the impact of the sneak-path problem on the stored weights (device conductances): a mask is element-wise multiplied by the weights during training, and it can be obtained from the measured weights of fabricated hardware. The second solution is a neural network estimator trained by our SPICE-like simulator. Test validation results, obtained through our SPICE-like framework, show significant improvement in performance, close to the baseline BNNs and QNNs, demonstrating the efficiency of the proposed methods. Both techniques capture the problem well for multilayer perceptron networks on the MNIST dataset with negligible runtime overhead, and the proposed neural estimator outperforms the mask technique on more challenging datasets such as CIFAR10. Furthermore, other nonidealities such as SAF defects and retention are evaluated.
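The mask idea reduces to a few lines: an element-wise factor, obtained from measured hardware, scales the ideal weights so that training sees the conductances the nonideal crossbar will actually realize. The mask and weight values below are invented for illustration:

```python
import numpy as np

W_ideal = np.array([[0.5, -0.3],
                    [0.2,  0.7]])
# Element-wise mask, e.g. the ratio of measured to programmed conductance
# on fabricated hardware (values assumed for this sketch).
mask = np.array([[0.95, 0.88],
                 [0.91, 0.97]])

W_eff = mask * W_ideal    # the weights the nonideal crossbar realizes
x = np.array([1.0, -1.0])
y = x @ W_eff             # the forward pass trains against W_eff, not W_ideal
```

Because training optimizes through `W_eff`, the learned weights pre-compensate for the degradation the hardware will impose.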

We also develop a model that incorporates the stochastic, asymmetric, nonlinear weight update in online (in-situ) training. We propose two solutions to this problem: (1) a compensation technique, tested on a small-scale problem of separating two mixed Laplacian sources using online independent component analysis; and (2) stochastic rounding, tested on a spiking neural network with deep local learning dynamics, showing only a 1-2% drop from the baseline accuracy for three different RRAM devices. We also propose error-triggered learning to overcome the limited-endurance problem, with only 0.3% and 3% drops in accuracy on the N-MNIST and DVSGesture datasets and around 33x and 100x reductions in the number of writes, respectively.
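Stochastic rounding itself is simple to sketch: an update lands on the upper grid point with probability equal to the fractional remainder, so the rounded value is unbiased in expectation. The grid spacing and values below are illustrative:

```python
import numpy as np

def stochastic_round(w, step, rng):
    # Snap w to a grid of spacing `step`, rounding up with probability equal
    # to the fractional remainder so that E[stochastic_round(w)] == w.
    scaled = np.asarray(w, dtype=float) / step
    low = np.floor(scaled)
    return (low + (rng.random(scaled.shape) < scaled - low)) * step

rng = np.random.default_rng(42)
samples = stochastic_round(np.full(100000, 0.37), 0.1, rng)
# Each sample is 0.3 or 0.4; the sample mean stays close to 0.37.
```

Unbiasedness is what makes the scheme attractive for coarse, nonideal weight updates: quantization noise averages out over many updates instead of accumulating.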

Finally, the prospects of this neuromorphic hardware are discussed, with an eye toward developing new algorithms that embrace the existing resistive crossbar hardware, including its nonidealities.

PhD Defense: Towards Engineering Computer Vision Systems: From Web to FPGAs

Final Defense – Sajjad Taheri

Date: August 26, 2019

Time: 2:00 pm

Location: Donald Bren Hall 4011

Committee: Alex Nicolau (Chair), Alex Veidenbaum (Co-Chair), Nikil Dutt

Title: Towards Engineering Computer Vision Systems: From Web to FPGAs

Computer vision has many applications that impact our daily lives, in areas such as automation, entertainment, and healthcare. However, computer vision is very challenging, partly due to the intrinsically difficult nature of the problem and partly due to the complexity and size of the visual data that must be processed. Deploying computer vision in practical use cases requires sophisticated algorithms and efficient implementations. In this dissertation, we consider two platforms that are suitable for computer vision processing yet have not been easily accessible to algorithm designers and developers: the Web and FPGA-based accelerators. Through the development of open-source software, we highlight the challenges associated with vision development on each platform and demonstrate opportunities to mitigate them.

The Web is the world's most ubiquitous computing platform, and it hosts a plethora of visual content. For historical reasons, such as insufficient JavaScript performance and a lack of API support for acquiring and manipulating images, computer vision has not been mainstream on the Web. We show that, in light of recent web developments such as vastly improved JavaScript performance and the addition of APIs such as WebRTC, efficient computer vision processing can be realized on web clients. Through novel engineering techniques, we translate a popular open-source computer vision library (OpenCV) from C++ to JavaScript and optimize its performance for the web environment. We demonstrate that hundreds of computer vision functions run in browsers with performance close to their original C++ versions.

Field-Programmable Gate Arrays (FPGAs) are a promising way to mitigate the computational cost of vision algorithms through hardware pipelining and parallelism while providing excellent power efficiency. However, an efficient FPGA implementation of a vision algorithm requires hardware design expertise and a considerable number of engineering person-hours. We show how high-level graph-based specifications, such as OpenVX, can significantly improve FPGA design productivity. Since such abstractions exclude implementation details, different implementation configurations that satisfy various design constraints, such as performance and power consumption, can be explored systematically. They also enable a variety of local and global optimizations across the algorithms.

PhD Defense: Event Detection and Estimation Using Distributed Population Owned Sensors

Name: Ahmed Mokhtar Nagy Ibrahim

Chair: Prof. Ahmed Eltawil

Date: August 7th, 2019

Time: 1:00 pm

Location:  EH 2210 – EECS Chair’s Conference Room

Committee: Ahmed Eltawil (Chair), Ender Ayanoglu, and Lee Swindlehurst

Title: “Event Detection and Estimation Using Distributed Population Owned Sensors”

Smart phones are an indispensable tool in modern day-to-day life. Their widespread use has spawned numerous applications targeting diverse domains such as biomedicine, environment sensing, and infrastructure monitoring. In such applications, the accuracy of the sensors at the core of the system is still questionable, since these devices were not originally designed for high-accuracy sensing. In this thesis, we investigate the accuracy limits of one of the most commonly used sensors, the smart phone accelerometer. As a use case, we focus on utilizing smart phone accelerometers in structural health monitoring (SHM). Using the already deployed network of distributed citizen-owned sensors is a cheap alternative to standalone sensors. These devices can capture floor vibrations during disasters and consequently compute the instantaneous displacement of each floor, so damage indicators defined by government standards, such as peak relative displacement, can be estimated. In this work, we study the displacement estimation accuracy and propose a zero-velocity update (ZUPT) method for noise cancellation. Theoretical derivation and experimental validation are presented, and we discuss the impact of sensor error on the achieved building classification accuracy. Moreover, despite the presence of sensor error, SHM systems can be made resilient by adopting machine learning. Several algorithms, including support vector machines (SVM), K-nearest neighbors (KNN), and convolutional neural networks (CNN), are adopted and compared. Techniques for addressing noise levels are proposed, and the results are compared to standard noise cancellation techniques such as filtering.
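The ZUPT idea can be sketched numerically: a constant accelerometer bias makes the integrated velocity drift linearly, but if the floor is known to be at rest at the start and end of a window, that linear drift can be removed before integrating to displacement. The motion, bias, and noise values below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0, 201)          # 2 s window sampled at 100 Hz
dt = t[1] - t[0]

# True floor motion: at rest (zero velocity) at both ends of the window.
d_true = 0.025 * (1 - np.cos(np.pi * t))
v_true = np.gradient(d_true, dt)
a_true = np.gradient(v_true, dt)

# Accelerometer output: truth plus a constant bias and white noise.
a_meas = a_true + 0.02 + 0.001 * rng.standard_normal(t.size)

v = np.cumsum(a_meas) * dt                      # drifts linearly with bias
v_zupt = v - np.linspace(v[0], v[-1], t.size)   # force v = 0 at rest instants

d_raw = np.cumsum(v) * dt
d_zupt = np.cumsum(v_zupt) * dt

err_raw = abs(d_raw[-1] - d_true[-1])
err_zupt = abs(d_zupt[-1] - d_true[-1])          # far smaller than err_raw
```

Because the bias term integrates to an exactly linear velocity ramp, subtracting the line through the two rest instants removes it entirely, leaving only the much smaller noise and discretization errors.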

Finally, since most previous work focused on modeling the sensor chip error itself, we study other sources of error, such as the sampling time uncertainty introduced by the device operating system (OS). This type of error can be a major contributor to the overall error, especially for sufficiently large signals. Hence, we propose a novel smart device accelerometer error model that includes the traditional additive noise as well as sampling time uncertainty errors. The model is validated experimentally using shake-table experiments, and maximum likelihood estimation (MLE) is used to estimate the model parameters. Moreover, we derive the Cramér-Rao lower bound (CRLB) of acceleration estimation based on the proposed model.

PhD Defense: Cooperative Power and Resource Management for Heterogeneous Mobile Architectures

Name: Chenying Hsieh

Date: Wednesday Aug 7, 2019

Time: 2:00 pm
Location: Donald Bren Hall 4011

Committee: Nikil Dutt (Chair), Tony Givargis, and Ardalan Amiri Sani

Title: Cooperative Power and Resource Management for Heterogeneous Mobile Architectures

Heterogeneous architectures are now ubiquitous in mobile systems-on-chip (SoCs). Demand from application domains such as games, computer vision, and machine learning, which require massive computational parallelism, has driven the integration of more accelerators into mobile SoCs to provide satisfactory performance energy-efficiently. These on-chip computing resources typically have individual runtime systems, including (1) a software governor that continuously monitors hardware utilization and makes trade-off decisions between performance and power consumption, and (2) a software stack that allows application developers to program the hardware for general-purpose computation and perform memory management and profiling. As the computation of mobile applications may demand all sorts of combinations of computing resources, we identify two problems: (1) individual runtimes can often lead to poor performance-power trade-offs or inefficient utilization of computing resources; and (2) existing approaches fail to schedule subprograms among different computing resources, losing the opportunity to avoid resource contention and gain better performance.
To address these issues, this dissertation proposes a holistic approach that coordinates the different runtimes with regard to application performance and energy efficiency. We first present MEMCOP, a memory-aware collaborative CPU-GPU governor that considers both the memory access footprint and the CPU/GPU frequencies to improve the energy efficiency of high-end mobile game workloads by performing dynamic voltage and frequency scaling (DVFS). Second, we present a case study executing a mix of popular data-parallel workloads, such as convolutional neural networks (CNNs), computer vision filters, and graphics rendering kernels, on mobile devices, and show that both the performance and the energy consumption of mobile platforms can be improved by synergistically deploying these underutilized compute resources. Third, we present SURF, a self-aware unified runtime framework for parallel programs on heterogeneous mobile architectures. SURF supports several heterogeneous parallel programming languages (including OpenMP and OpenCL) and enables dynamic task mapping to heterogeneous resources based on runtime measurement and prediction. The measurement and monitoring loop enables self-aware adaptation of the runtime mapping to dynamically exploit the best available resource.
We implemented all software components on real-world mobile SoCs and evaluated the proposed approaches with mobile games and a mix of parallel benchmarks and CNN applications.
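The intuition behind a memory-aware governor like MEMCOP can be caricatured as a simple policy: when an interval is dominated by memory stalls, raising the CPU frequency buys little performance, so a lower frequency saves power at a near-equal frame rate. The frequency levels, thresholds, and function below are hypothetical, not MEMCOP's actual algorithm:

```python
# Hypothetical memory-aware DVFS policy sketch; all levels and thresholds
# are assumed for illustration only.
FREQS_GHZ = [0.6, 1.0, 1.4, 1.8]

def pick_cpu_freq(mem_stall_ratio, deadline_slack):
    """mem_stall_ratio: fraction of cycles stalled on memory this interval.
    deadline_slack: fraction of the frame-time budget left unused."""
    if mem_stall_ratio > 0.5:     # memory-bound: more frequency barely helps
        return FREQS_GHZ[0]
    if deadline_slack > 0.3:      # comfortably meeting the frame deadline
        return FREQS_GHZ[1]
    return FREQS_GHZ[-1]          # compute-bound and tight: run fast
```

A real governor would use measured performance counters and a fitted model rather than fixed thresholds, but the decision structure (sense, classify the bottleneck, then scale) is the same.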

PhD Defense: Low-Power Integrated Circuits For Biomedical Applications

Title: “Low-Power Integrated Circuits For Biomedical Applications”

Name: Karimi Bidhendi, Alireza

Chair: Professor Payam Heydari

Date: August 6, 2019

Time: 1 pm

Location: Calit2,  Room 3008


With thousands of new cases of spinal cord injury reported every day, many people suffer from paralysis and loss of sensation in both legs. Besides the healthcare costs, such a condition severely deteriorates a patient's quality of life and may even lead to additional medical conditions. Therefore, there is a growing need for cyber-physical systems that restore walking ability by bypassing the damaged spinal cord. This goal can be achieved by monitoring and processing the patient's brain signals to enable brain-directed control of prosthetic legs. Among the several existing methods for recording brain signals, electrocorticography (ECoG) has gained popularity because it is robust to motion artifacts, offers high spatial resolution and signal-to-noise ratio, is only moderately invasive, and allows chronic implantation of recording grids with little or no scar tissue formation. The last property is of particular importance if the whole system is to be a viable, fully implantable solution. Furthermore, the implanted system has to operate independently, with little or no external hardware (e.g., a bulky personal computer), to be individually and socially accepted.

To implement a fully implantable system, low-power and miniaturized electronics are needed to reduce heat generation, increase battery lifetime, and be minimally intrusive. These requirements mean that many of the system's components should be custom-designed to integrate as much functionality as possible into a given amount of real estate. This thesis presents silicon-tested prototypes of several building blocks for the envisioned system: ultra-low-power brain signal acquisition front-ends, a low-power, inductorless MedRadio transceiver, and a fast start-up crystal oscillator. The brain signal acquisition front-ends provide low-noise amplification of weak ECoG biosignals. The MedRadio transceiver enables communication between the implant and end effectors or a base station (e.g., prosthetic legs or a desktop computer). The crystal oscillator generates the reference signal for other system components such as analog-to-digital converters. Novel techniques to improve important performance parameters (power consumption, low-noise operation, and interference resilience) are introduced. Electrical, in-vitro, and in-vivo experimental measurements have verified the functionality and performance of each design.

Data-Driven Modeling of Cyber-Physical Systems using Side-Channel Analysis

Title: “Data-Driven Modeling of Cyber-Physical Systems using Side-Channel Analysis”

Name: Sujit Rokka Chhetri

Date: May 20, 2019

Time: 3:00 PM

Location: EH 5200

Committee: Prof. Mohammad Al Faruque (Chair), Prof. Pramod Khargonekar, Prof. Fadi Kurdahi


A cyber-physical system (CPS) integrates computational components in the cyber domain with physical-domain processes. In the cyber domain, embedded computers and networks monitor and control the physical processes, while in the physical domain, sensors and actuators interact with the physical world. This interaction between the cyber and physical domains brings unique modeling challenges, one of which is the integration of discrete models in the cyber domain with continuous physical-domain processes. However, the same cyber-physical interaction also opens new opportunities for modeling. For example, information flow in the cyber domain manifests physically as energy flows in the physical domain, and some of these energy flows unintentionally leak information about the cyber domain through side-channels.

In this thesis, an extensive analysis of the side-channels of cyber-physical systems (acoustic, magnetic, thermal, power, and vibration) is performed. Based on this analysis, data-driven models are estimated. These models are then used to perform security vulnerability analysis (for confidentiality and integrity), whereby new attack models are explored. Furthermore, the data-driven models are also used to create defensive mechanisms that minimize information leakage from the system and detect attacks on its integrity. Cyber-physical manufacturing systems are taken as use cases to demonstrate the applicability of the modeling approaches.
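A toy example conveys how a physical emanation can leak cyber-domain information (the setup is illustrative, not the dissertation's models): a motor's vibration frequency tracks its commanded speed, so an attacker with a cheap sensor can recover the command from a spectral peak.

```python
import numpy as np

# Simulated side-channel: a stepper motor vibrates at a frequency
# proportional to its commanded speed, so the recorded trace leaks the
# command (and, over time, the tool path). All values are assumed.
fs = 2000.0                     # sensor sample rate, Hz
t = np.arange(2000) / fs        # one second of recording
motor_hz = 240.0                # secret cyber-domain command (assumed)
rng = np.random.default_rng(0)
trace = np.sin(2 * np.pi * motor_hz * t) + 0.3 * rng.standard_normal(t.size)

# The attacker recovers the command as the dominant spectral peak.
spectrum = np.abs(np.fft.rfft(trace))
freqs = np.fft.rfftfreq(t.size, 1 / fs)
recovered = freqs[spectrum.argmax()]
```

Even with substantial additive noise, the peak stands out clearly, which is why unintentional emissions are such an effective confidentiality side-channel.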

Finally, side-channel analysis is also applied to modeling digital twins of cyber-physical systems. Specifically, a novel methodology is proposed that uses low-end sensors to analyze the side-channels and build the digital twins. These digital twins capture the interaction between the cyber and physical domains of manufacturing systems and aid in process quality inference and fault localization. The side-channels also keep the digital twins up to date, which is one of the fundamental requirements for building digital twins.

Cyber-Physical Systems Approach to Irrigation Systems

Title: “Cyber-Physical Systems Approach to Irrigation Systems”

Name: Davit Hovhannisyan

Date: March 5, 2019

Time: 4:00PM

Location: EH 3206  (CECS conference room)

Committee:  Prof. Fadi Kurdahi (Chair), Prof. Ahmed Eltawil, Prof. Mohammad Al Faruque


The semiconductor industry has successfully brought silicon technology to a price point at which it is accessible to application domains such as irrigation, which wastefully consumes 70% of all fresh water. Worldwide fresh water resources will soon reach a deficit due to ever-growing demand, yet even state-of-the-art precision irrigation systems that employ sophisticated water-delivery drip lines are still controlled at the source by the gut feeling of the end user. This work demonstrates that the scientific foundations of cyber-physical systems can be used to design automated, distributed, and intelligent precision irrigation systems that improve irrigation efficiency. To that end, it explores and analyzes in depth the intersection of irrigation practice and cyber-physical systems knowledge to show a path toward a successful adaptation of silicon technology to one of the greatest challenges of the 21st century: fresh water scarcity. The contributions complete a novel vision for next-generation precision irrigation systems and can be grouped into three main thrusts: (1) circuit-inspired models, obtained by the analogy method, for irrigation system components and scheduling strategies; (2) a CPS-based approach comprising (a) a design methodology capable of comparing irrigation controllers, (b) simulation tools and software for analyzing the distributed behavior of the specialized irrigation controllers, (c) a topology adaptation technique that uses multi-graphs to mine the hydro-wireless topology of the IoT controllers, and (d) a distributed controller implementation with novel energy harvesting and low-power support for irrigation controllers and sensors; and (3) overhead vision solutions for health and growth monitoring. The observations, analysis, and insights from the experimental studies were obtained in collaboration with the Rancho California Water District, growers, and practitioners.
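The circuit-inspired modeling by analogy can be sketched with a toy network (all values assumed): pressure plays the role of voltage, flow the role of current, and a drip line the role of a resistor, so a linearized hydraulic network can be solved with Ohm's and Kirchhoff's laws.

```python
# Linearized circuit analogy for a toy irrigation manifold (assumed values):
# pressure ~ voltage, flow ~ current, drip line ~ resistor. Real hydraulic
# losses are nonlinear; the linear model only illustrates the analogy method.
P_source = 200.0      # manifold pressure, kPa (the "voltage source")
R1, R2 = 50.0, 100.0  # hydraulic resistances of two parallel drip lines

Q1 = P_source / R1    # flow in line 1, analogue of I = V / R
Q2 = P_source / R2
Q_total = Q1 + Q2     # Kirchhoff's current law at the manifold node
```

Casting the hydraulic network in circuit form lets standard network-analysis and scheduling tools from electrical engineering be reused for irrigation planning.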

Control System Design Automation Using Reinforcement Learning

Title: “Control System Design Automation Using Reinforcement Learning”

Name: Hamid Mirzaei

Date: Tuesday, November 20, 2018

Time: 1:00 p.m.

Location: Donald Bren Hall 3011

Committee: Professor Tony Givargis (Chair), Professor Eli Bozorgzadeh, Professor Ian Harris


Conventional control theory has been used in many application domains with great success over the past decades. However, novel solutions are required to cope with the challenges arising from the complex interaction of fast-growing cyber and physical systems. Specifically, integrating classical control methods with cyber-physical system (CPS) design tools is a non-trivial task, since those methods were developed for use by human experts and were never intended to be part of an automatic design tool.

On the other hand, the control problems in emerging cyber-physical systems, such as intelligent transportation and autonomous driving, cannot be addressed by conventional control methods due to the high level of uncertainty, complex dynamic model requirements, and operational and safety constraints.
In this dissertation, a holistic CPS design approach is proposed in which the control algorithm is incorporated as a building block in the design tool. The proposed approach facilitates the inclusion of physical variability into the design process and reduces the parameter space to be explored. This has been done by adding constraints imposed by the control algorithm.
Furthermore, Reinforcement Learning (RL) is studied as a replacement for conventional control solutions in the emerging domain of intelligent transportation systems. Specifically, dynamic tolling assignment and autonomous intersection management are tackled with state-of-the-art RL methods, namely Trust Region Policy Optimization and finite-difference gradient descent. Additionally, Q-learning is used to improve the performance of an embedded controller through a novel formulation in which cyber-system actions, such as changing the control sampling time, are combined with the physical action set of the RL agent. Using the proposed approach, it is shown that the power consumption and computational overhead of the embedded controller can be improved.
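The combined cyber-physical action set can be illustrated with a standard Q-learning update (a generic sketch, not the dissertation's exact formulation): each action pairs a physical control input with a sampling period, so the learned policy can trade control quality against computation and power.

```python
import numpy as np

# Each action pairs a physical control input with a sampling period, so the
# agent can learn when slow (cheap) sampling is good enough. All numbers
# here are illustrative.
ACTIONS = [(u, ts) for u in (-1.0, +1.0) for ts in (0.01, 0.1)]
N_STATES = 4
Q = np.zeros((N_STATES, len(ACTIONS)))

def q_update(s, a, reward, s_next, alpha=0.1, gamma=0.9):
    # Standard Q-learning temporal-difference update over the combined
    # cyber-physical action index a.
    Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])

# One step: in state 0, action 1 earned reward 1.0 and led to state 2.
q_update(0, 1, 1.0, 2)
```

If the reward subtracts a cost proportional to the sampling rate, the agent is pushed toward the slowest sampling period that still controls the plant well.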
Finally, to address the current lack of available physical benchmarks, an open physical-environment benchmarking framework is introduced. In the proposed framework, the various components of a physical environment are captured in a unified repository, enabling researchers to define and share standard benchmarks for evaluating different reinforcement learning algorithms. They can also share the realized environments via the cloud, enabling other groups to perform experiments on actual physical environments instead of the currently available simulation-based ones.


PhD Defense: An Initial Study of Simple Approaches To Eliminating Out-of-Thin-Air Results

Name: Peizhao Ou

Chair: Prof. Brian Demsky

Date: Tuesday October 16, 2018

Time: 10am – 12pm

Location: CALIT2 Room 3008

Title: An Initial Study of Simple Approaches To Eliminating Out-of-Thin-Air Results

Eliminating so-called “out-of-thin-air” (OOTA) results is an open problem in many existing programming language memory models, including those of Java, C, and C++. OOTA behaviors are problematic in that they break both formal and informal modular reasoning about program behavior. In spite of many years of research effort, defining memory model semantics that are easily understood, allow existing optimizations, and forbid OOTA results remains an open problem. This thesis explores two simple solutions that forbid OOTA results. The first solution targets Java-like languages in which all memory operations may create OOTA executions; the second targets C/C++-like memory models in which racing operations are explicitly labeled as atomic operations. Our solutions provide a per-candidate-execution criterion that makes it possible to examine a single execution and determine whether the memory model permits it. We implemented and evaluated both solutions in the LLVM compiler framework. Our results show that on an ARMv8 processor the first solution has an average overhead of 3.1% and a maximum overhead of 17.6% on the SPEC CPU2006 C/C++ benchmarks, and that the second solution has no overhead on average and a maximum overhead of 6.3% on 43 concurrent data structures. The results indicate that these simple approaches to eliminating out-of-thin-air behaviors deserve further consideration.

Modeling and Co-Design of Multi-domain Cyber-Physical Systems

Name: Jiang Wan

Date: Friday, June 1

Time: 3:00 — 5:00 PM

Location: EH 3206

Committee: Fadi Kurdahi (Chair), Mohammad Al Faruque, Rainer Doemer


Cyber-Physical Systems (CPS) integrate computation with physical processes connected through networks. The high complexity of cross-domain engineering, combined with the pressure for system innovation, higher quality, time-to-market, and budget constraints, makes it imperative for engineers to use integrated engineering methods and tools for CPS design. However, existing computer-based engineering tools mainly focus on a particular domain, so system-level analysis for CPS is challenging due to the difficulty of integrating knowledge from different domains. This thesis studies the modeling of cross-domain CPS. Problems in modeling both functional and non-functional requirements during CPS design are explored, and a functional model-based approach is proposed for high-level CPS modeling. Moreover, targeting security, one of the key non-functional requirements in CPS, physics-based models and solutions are proposed.


PhD Defense: Reliable and Energy Efficient Battery-Powered Cyber-Physical Systems

Name: Korosh Vatanparvar

Date: Wednesday, May 30th

Time: 3:00 — 5:00 PM

Location: Calit2 3008

Committee: Prof. Mohammad Al Faruque (Chair)


Cyber-Physical Systems (CPS) were introduced as a solution for multidisciplinary integration and control in embedded systems. They provide seamless interactions between the cyber and physical domains, enabling more intelligent and complex control applications. However, CPS face reliability and energy efficiency challenges, since they mainly rely on batteries for power.

We investigate these issues in the context of Electric Vehicles (EVs), a common battery-powered CPS. EVs were introduced as a means of transportation that addresses environmental problems such as air and noise pollution. However, their stringent design constraints, especially on battery packs, create challenges of limited driving range and battery lifetime for daily drivers and manufacturers. The design automation community has been addressing these challenges by developing more efficient and dependable devices and control methodologies. The contributions of this thesis are:

1) novel machine learning and physics-based modeling techniques to capture CPS dynamics more accurately; 2) unique optimization problem formulations to make optimal control decisions; and 3) intelligent control methodologies that leverage the modeling and interaction within CPS to achieve reliable and efficient operation. These contributions are applied to EV subsystems such as the navigation system, climate control, and the battery management system. Our objectives are to further extend the EV driving range and prolong battery lifetime while maintaining a similar driving experience and comfort for passengers.


Reflective On-Chip Resource Management Policies for Energy-Efficient Heterogeneous Multiprocessors

Name: Tiago Mück

Date: May 16, 2018

Time: 2:00pm

Location: Donald Bren Hall 2011

Committee: Nikil Dutt (Chair), Alex Nicolau, Tony Givargis


Effective exploitation of power-performance tradeoffs in heterogeneous many-core platforms (HMPs) requires intelligent on-chip resource management at different layers, in particular at the operating system level. Operating systems need to continuously analyze application behavior and answer questions such as: What is the most power-efficient core type for executing the application without violating its performance requirements? Which is more power-efficient for the current application: an out-of-order core at a lower frequency or an in-order core at a higher frequency?
Unfortunately, existing operating systems (e.g., Linux) do not offer mechanisms to properly address these questions and are therefore unable to fully exploit architectural heterogeneity for scalable, energy-efficient execution of dynamic workloads.
This dissertation proposes a holistic approach for performing resource allocation decisions and power management by leveraging concepts from reflective software.
The general idea of reflection is to change one's actions based on both external feedback and introspection (i.e., self-assessment).
From a practical computer system perspective, reflection means performing resource management actions considering both sensing information (e.g., readings from performance counters, power sensors, etc.) to assess the current system state, as well as models to predict the behavior of the system before performing an action.
In this context, this dissertation describes MARS, a Middleware for Adaptive Reflective computer Systems. MARS consists of a framework and a set of models for creating reflective resource managers. MARS is implemented and evaluated on top of a real Linux-based platform. Furthermore, MARS also provides an offline simulation infrastructure for fast prototyping of policies and for large-scale or long-term policy evaluation.
Experimental evaluation shows that MARS’s models allow different policies for task mapping and dynamic voltage scaling to be seamlessly integrated, resulting in up to 1.8x energy efficiency improvements without performance degradation when compared to vanilla Linux.

PhD Defense: SIMD Assisted Fault Detection and Fault Attack Mitigation

Name: Zhi Chen

Date: May 15, 2018

Time: 2:30pm

Location: Donald Bren Hall 3011

Committee: Professor Alex Nicolau (Chair), Alex Veidenbaum, Nikil Dutt


Modern processors continue to aggressively scale down feature sizes and reduce voltage levels to run faster and more energy-efficiently. However, this trend also poses significant reliability concerns, as it makes transistors more susceptible to soft errors. Soft errors are transient: although they do not impair the computing system permanently, they can corrupt the output of a program or even crash the entire system. Hardware or software redundancy techniques can be used to detect errors during the execution of a program. However, hardware redundancy, e.g., DMR (dual-modular redundancy) and TMR (triple-modular redundancy), leads to significant area overhead and very high energy cost. Software redundancy, e.g., instruction duplication, has a lower performance and energy penalty and virtually no hardware cost, at the price of a small loss in error coverage. Commodity processors generally do not require “five-nines” reliability, as they are not mission-critical; instead, performance and energy consumption take priority. This dissertation proposes a novel approach to instruction duplication that exploits the redundancy within SIMD instructions. The key idea is to pack the original data and its duplicate into different lanes of the same vector register, instead of executing two scalar instructions separately, since these registers are underutilized in most applications. The proposed solution is implemented in the LLVM compiler as a stand-alone pass. Evaluation on a host of benchmarks reveals that the proposed SIMD-based error detection technique incurs much lower performance, code size, and energy overheads.
This dissertation further extends the proposed approach as a countermeasure to protect cryptographic algorithms. These algorithms are widely adopted in modern processors and embedded systems to protect information. A number of popular cryptographic algorithms in the Libgcrypt library are protected using the SIMD-based instruction duplication technique, and a large number of faults are injected into these algorithms. The results show that almost all injected faults can be detected at a reasonable performance and code size cost.


“Efficient Acceleration of Computation Using Associative In-memory Processing”

Speaker: Hasan Erdem Yantir, University of California, Irvine

Date and Time: Monday, May 14, 2018 at 9:00AM

Location: Engineering Hall 3206

Committee: Professor Fadi Kurdahi (Chair), Ahmed Eltawil, Rainer Doemer


The complexity of computational problems is rising faster than the capabilities of computing platforms. This forces researchers to find alternative paradigms and methods for efficient computing. One promising direction is accelerating compute-intensive kernels with in-memory computing accelerators, since memory is the major bottleneck that limits the parallelism and performance of a system and dominates the energy consumption of computation. Leveraging the memory-intensive nature of big-data applications, an in-memory computation system can be built in which logic is replaced by memory structures, virtually eliminating the need for memory load/store operations during computation. The massive parallelism enabled by such a paradigm results in highly scalable structures.

This thesis is set against this background. Its objective is to conduct broad-perspective research on in-memory computing. For this purpose, associative computing architectures (i.e., Associative Processors, or APs) are built with both traditional (SRAM) and emerging (ReRAM) memory technologies, together with their corresponding software frameworks. For ReRAM-based APs, the reliability concerns that come with emerging memories are resolved, and architectural innovations are developed to increase energy efficiency. Furthermore, an approximate computing approach is introduced for APs to perform efficient, low-power approximate in-memory computing for tasks that can tolerate some accuracy loss. The work also proposes a novel two-dimensional in-memory computing architecture to cope with the deficiencies of traditional one-dimensional AP architectures.