Seminars by CECS

“Efficient Acceleration of Computation Using Associative In-memory Processing”

Title: “Efficient Acceleration of Computation Using Associative In-memory Processing”

Speaker: Hasan Erdem Yantir, University of California, Irvine

Date and Time: Monday, May 14, 2018 at 9:00AM

Location: Engineering Hall 3206

Committee: Professor Fadi Kurdahi (Chair), Ahmed Eltawil, Rainer Doemer


The complexity of computational problems is rising faster than the capabilities of computational platforms. This forces researchers to find alternative paradigms and methods for efficient computing. One promising approach is to accelerate compute-intensive kernels with in-memory computing accelerators, since memory is the major bottleneck that limits a system's parallelism and performance and dominates the energy consumption of computation. Leveraging the memory-intensive nature of big data applications, an in-memory computation system can be built in which logic is replaced by memory structures, virtually eliminating the need for memory load/store operations during computation. The massive parallelism enabled by such a paradigm results in highly scalable structures.

This thesis is set against that background; its objective is a broad-perspective study of in-memory computing. For this purpose, associative computing architectures (Associative Processors, or APs) are built from both traditional (SRAM) and emerging (ReRAM) memory technologies, together with their corresponding software frameworks. For ReRAM-based APs, the reliability concerns that come with emerging memories are resolved, and architectural innovations are developed to increase energy efficiency. Furthermore, an approximate computing approach is introduced for APs to perform efficient, low-power approximate in-memory computing for tasks that can tolerate some accuracy loss. The work also proposes a novel two-dimensional in-memory computing architecture to cope with the deficiencies of traditional one-dimensional AP architectures.
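
Abstractly, an AP executes arithmetic as lookup-table-driven compare/write passes applied to all memory rows at once, so the cost of an operation grows with the word width rather than with the number of rows. The following toy Python sketch emulates bit-serial in-memory addition in this style; the names, row layout, and 8-bit width are illustrative assumptions, not the dissertation's actual design.

```python
# Toy, software-only sketch of associative (in-memory) addition:
# every operation is a sequence of compare/write passes applied to
# all rows "in parallel" (emulated here with loops).

NBITS = 8

# Full-adder truth table: (a_bit, b_bit, carry_in) -> (sum_bit, carry_out)
ADD_LUT = {
    (0, 0, 0): (0, 0), (0, 0, 1): (1, 0),
    (0, 1, 0): (1, 0), (0, 1, 1): (0, 1),
    (1, 0, 0): (1, 0), (1, 0, 1): (0, 1),
    (1, 1, 0): (0, 1), (1, 1, 1): (1, 1),
}

def ap_add(a_col, b_col):
    """Bit-serial addition of two memory columns, AP style."""
    rows = [{"a": a, "b": b, "s": 0, "c": 0} for a, b in zip(a_col, b_col)]
    for k in range(NBITS):  # LSB to MSB
        # Snapshot the compared fields first; a real AP orders its passes
        # so that a freshly written carry cannot re-match a later compare.
        keys = [((r["a"] >> k) & 1, (r["b"] >> k) & 1, r["c"]) for r in rows]
        for key, (s_bit, c_out) in ADD_LUT.items():
            for r, row_key in zip(rows, keys):
                if row_key == key:          # compare phase: tag matching rows
                    r["s"] |= s_bit << k    # write phase: update tagged rows
                    r["c"] = c_out
    return [r["s"] for r in rows]

print(ap_add([3, 250, 17], [4, 5, 17]))  # [7, 255, 34]
```

Note that the number of compare/write passes (8 truth-table entries per bit) is independent of how many rows are added, which is the source of the massive parallelism the abstract describes.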

“A Compiler Infrastructure for Static and Hybrid Analysis of Discrete Event System Models”

Title: “A Compiler Infrastructure for Static and Hybrid Analysis of Discrete Event System Models”

Speaker: Tim Schmidt, University of California, Irvine

Date and Time: Friday, April 20, 2018 at 10:00AM-11:00AM

Location: Engineering Hall 3206

Committee: Professor Rainer Doemer (Chair), Fadi Kurdahi, Kwei-Jay Lin


The design of embedded systems has been a well-established research domain for decades. However, the constantly increasing complexity and requirements of state-of-the-art embedded systems confront designers with new challenges while they maintain established design methodologies. Embedded system design uses the concept of Discrete Event Simulation (DES) to prototype and test the interaction of individual components.

In this dissertation, we provide the Recoding Infrastructure for SystemC (RISC) compiler framework to perform static and hybrid analysis of IEEE SystemC models. On one hand, RISC generates thread communication charts to visualize the communication between individual design components; the visualization respects the underlying discrete event simulation semantics and illustrates the individual synchronization steps. On the other hand, RISC translates a sequential model into a parallel model which effectively utilizes multi- and many-core host simulation platforms for faster simulation. This work extends the conflict analysis capabilities to libraries, dynamic memory allocation, channel instance awareness, and references. Additionally, the traditional thread-level parallelism is extended with data-level parallelism for even faster simulation.
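
At the heart of such parallelization is proving that two thread segments cannot conflict. As a minimal illustration of the idea (not RISC's actual implementation, which derives access sets from the SystemC source), a static read/write conflict check between two segments can be sketched as:

```python
# Minimal sketch of segment conflict analysis, the core question a
# parallelizing discrete-event compiler must answer statically:
# may two thread segments safely run in parallel?
# The variable sets below are made-up examples.

def conflict_free(reads1, writes1, reads2, writes2):
    """Two segments may run in parallel only if neither writes a
    variable the other reads or writes (no RAW/WAR/WAW hazard)."""
    return not (writes1 & (reads2 | writes2)) and not (writes2 & reads1)

seg_a = {"reads": {"x", "cfg"}, "writes": {"y"}}
seg_b = {"reads": {"cfg"}, "writes": {"z"}}
seg_c = {"reads": {"y"}, "writes": {"x"}}

print(conflict_free(seg_a["reads"], seg_a["writes"],
                    seg_b["reads"], seg_b["writes"]))  # True: disjoint accesses
print(conflict_free(seg_a["reads"], seg_a["writes"],
                    seg_c["reads"], seg_c["writes"]))  # False: y is written and read
```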

“Software Assists to On-chip Memory Hierarchy of Manycore Embedded Systems”

Title: “Software Assists to On-chip Memory Hierarchy of Manycore Embedded Systems”

Speaker: Majid Namaki Shoushtari, University of California, Irvine

Date and Time: Tuesday, November 7, 2017 at 11:00AM-12:00PM

Location: Donald Bren Hall 2011

The growing computing demands of emerging application domains such as Recognition/Mining/Synthesis (RMS), visual computing, wearable devices, and the Internet of Things (IoT) have driven the move towards manycore architectures to better manage tradeoffs among performance, energy efficiency, and reliability.
The memory hierarchy of manycore architectures has a major impact on their overall performance, energy efficiency, and reliability. We identify three major problems that make traditional memory hierarchies unattractive for manycore architectures and their data-intensive workloads: (1) they are power hungry and a poor fit for manycores in the face of dark silicon, (2) they do not adapt to variable workload requirements and memory behavior, and (3) they do not scale, due to coherence overheads.

This thesis argues that many of these inefficiencies are the result of software-agnostic hardware-managed memory hierarchies. Application semantics and behavior captured in software can be exploited to more efficiently manage the memory hierarchy. This thesis exploits some of this information and proposes a number of techniques to mitigate the aforementioned inefficiencies in two broad contexts: (1) explicit management of hybrid cache-SPM memory hierarchy, and (2) exploiting approximate computing to improve the energy efficiency of the memory hierarchy.
We first present the required hardware and software support for a software-assisted memory hierarchy that is composed of distributed memories which can be partitioned between caches and SPMs at runtime. We discuss our SPM APIs, the protocol needed for data movements, and our approach for explicit management of shared data.
Next, we augment caches and SPMs in this hierarchy with approximation support in order to improve the energy efficiency of the memory subsystem when running approximate programs.
Finally, we discuss a quality-configurable memory approximation strategy using formal control theory that adjusts the level of approximation at runtime depending on the desired quality for the program’s output.
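
The quality-configurable strategy can be pictured as a feedback loop around an approximation knob. The sketch below uses a simple integral-style controller; the thesis applies formal control theory, and all names, gains, and the toy quality model here are assumptions for illustration only.

```python
# Illustrative sketch of a quality-configurable approximation knob
# driven by feedback: raise the approximation level when output quality
# has slack, lower it when quality drops below the target.

def next_level(level, target_q, measured_q, gain=10.0, max_level=8):
    error = measured_q - target_q           # positive error: room to approximate more
    level = level + gain * error
    return min(max(level, 0.0), max_level)  # clamp to the supported range

# Fake quality model for demonstration: each approximation level costs 3% quality.
quality = lambda lvl: 1.0 - 0.03 * lvl

lvl = 0.0
for _ in range(20):  # the loop settles near the 0.85 quality target
    lvl = next_level(lvl, target_q=0.85, measured_q=quality(lvl))
print(round(quality(lvl), 3))
```

The point of the design choice is that the approximation level is not fixed at compile time: the controller continuously trades energy for quality against the program's observed output.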


Quoc-Viet Dang PhD Defense

Name: Quoc-Viet Dang

Date: May 30, 2017

Time: 2:00 P.M.

Location: Engineering Hall 3206

Committee: Professor Daniel Gajski (Chair), Fadi Kurdahi, Rainer Doemer


Public education is on the brink of a potential crisis as it attempts to significantly increase student enrollment while maintaining quality of education. Online courses have been proposed and debated among members of the UC Regents, numerous college administrators, faculty, and students. On one hand, online education can reduce overhead while enrolling more students. On the other hand, directly translating classroom lectures and materials to an online environment does not necessarily produce student performance and satisfaction equivalent to an in-class environment. Since there is no universal standard for online education, erratic and inconsistent results have been achieved in terms of student performance and costs to students as well as administration. A hybrid, scalable teaching and learning methodology is required by both educators and students to take greatest advantage of today's technology and apply it toward improving student performance and participation.

This dissertation presents a methodology and system to provide a more individualized and responsive learning environment for students in large hybrid and online university courses while keeping overall costs and time commitment down as well as improving overall student performance. The Universal Personal Advisor, the learning design tool implemented in this research, is developed from multi-disciplinary metrics and studies in the fields of psychology, education, and engineering. A primary limiting resource for both students and instructors is time. By automating some of the basic key interactions that occur between students and instructors, hours of each individual's time can be saved, maximizing the quality of the in-person interactions available during a course while allowing for a more scalable classroom size.

PhD Defense: Robust Data Hiding in Multimedia for Authentication and Ownership Protection

Name: Farhan A. Alenizi

Date: May 26, 2017

Time: 10:00 AM

Location: Engineering Hall 3206

Committee: Professor Fadi Kurdahi (Chair), Professor Ahmed Eltawil and Professor Rainer Doemer


Establishing robust and blind data hiding techniques in multimedia is very important for authentication, ownership protection, and security. The multimedia may include images, videos, and 3D mesh objects. A hybrid pyramid Discrete Wavelet Transform (DWT) / Singular Value Decomposition (SVD) data hiding scheme for video authentication and ownership protection is proposed. The hidden data takes the form of a main color logo image watermark and a secondary black-and-white (B&W) logo image. The color watermark is decomposed into bit-slices. A pyramid transform is performed on the Y-frames of a video stream, resulting in error images; a DWT is then applied to these error images using orthonormal filter banks, and the bit-slice watermarks are inserted in one or more of the resulting subbands in a way that is fully controlled by the owner; the watermarked video is then reconstructed. SVD is performed on the color watermark bit-slices, and the secondary B&W watermark is inserted into the main color watermark using another SVD process. Reconstruction is perfect without attacks, while the average Bit Error Rates (BERs) achieved under attacks stay within 2% for the color watermark and 5% for the secondary watermark; meanwhile, the mean Peak Signal-to-Noise Ratio (PSNR) is 57 dB. Furthermore, a selective denoising filter to eliminate noise in video frames is proposed, and its performance with data hiding is evaluated.
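
Bit-slicing is a standard image-processing step: an 8-bit channel splits into eight binary planes and reconstructs exactly. The sketch below shows the kind of decomposition used for the color-watermark bit-slices; the code is illustrative, not the dissertation's implementation.

```python
# Decompose an 8-bit image channel into binary bit-planes and rebuild it.

def to_bit_slices(channel):
    """channel: 2D list of 0..255 values -> list of 8 binary planes (LSB first)."""
    return [[[(p >> k) & 1 for p in row] for row in channel]
            for k in range(8)]

def from_bit_slices(slices):
    """Exact inverse: weight each plane by 2**k and sum."""
    rows, cols = len(slices[0]), len(slices[0][0])
    return [[sum(slices[k][r][c] << k for k in range(8))
             for c in range(cols)] for r in range(rows)]

img = [[0, 129], [200, 255]]
assert from_bit_slices(to_bit_slices(img)) == img  # lossless round trip
```

Embedding the planes separately lets the scheme give the most significant planes (which carry most of the logo's structure) the most robust subbands.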

Moreover, a blind, optimized 3D mesh watermarking technique is proposed in this research. The technique relies on displacing vertex locations based on modifying the variances of the vertices' norms. Statistical analyses were performed to establish the distributions that best fit each mesh, and hence the bin sizes. Experimental results show that the approach is robust in terms of both perceptual and quantitative quality.

In conclusion, the degree of robustness and security of the proposed techniques is demonstrated. Schemes that can further enhance performance, as well as future work in the field, are also introduced.


PhD Defense: Millimeter-wave and Sub-THz Signal Generation and Detection in Silicon Technologies

Name: Peyman Nazari

Date: May 15, 2017

Time: 10:00 AM

Committee: Prof. Payam Heydari (Chair), Prof. Michael Green, Prof. Ahmed Eltawil


MM-wave/sub-terahertz (THz) signal generation, radiation, and detection have become increasingly attractive due to their fast-growing applications in spectroscopy, radar, biomedical and security imaging, as well as high-speed wireless communication.
Silicon technology, on the one hand offering high-density signal processing capabilities thanks to the aggressive scaling of its feature size, and on the other hand allowing integration of mm-wave/THz antenna elements owing to their shrunken footprint at these bands, is well suited for implementing fully integrated multi-antenna mm-wave/THz wireless Systems-on-Chip (SoCs).
The performance of such systems is dominantly governed by the quality and efficiency of signal generation, transmission, and detection. Passive and active components, as the means of realizing these functionalities, must be optimized for operation at these frequency ranges. However, the excessive loss of on-chip passive components and the limited gain and output power of transistors at such high frequencies demand novel passive and active structures. Furthermore, the high level of integration implies that co-design of the front-end components leads to better end-to-end performance, so a holistic design methodology must be employed. The radiation characteristics of the wireless signal must also be engineered to improve its transmission quality. For example, circularly polarized radiation is a viable choice for many imaging and communication applications because it exhibits excellent robustness against depolarization effects.
In this dissertation, silicon realization of on-chip waveguides, as low-loss media for high-frequency wave propagation, is explored, and implementations of low-loss cavity-backed passives are discussed. Furthermore, a silicon-integrated IMPATT diode, together with its fabrication and modeling, is introduced as a solution for obtaining active behavior beyond the fmax of transistors. Next, a high-power, high-efficiency mm-wave circularly polarized cavity-backed radiator, employing a multi-port multi-function passive network as resonator, power combiner, and antenna, is introduced. Necessary conditions for robust operation of such multi-port oscillators/radiators are also derived. Fabricated in a 0.13 µm SiGe BiCMOS process, the prototype chip achieves 14.2 dBm EIRP, -99.3 dBc/Hz phase noise at 1 MHz offset, and 5.2% DC-to-EIRP conversion efficiency, the highest reported value among silicon-based radiators that do not use a silicon lens or substrate processing.
Finally, a 210 GHz low noise amplifier (LNA) is presented to address the detection challenges. This LNA achieves 18 dB of gain with less than 12 dB noise figure and a 3 dB bandwidth of more than 15 GHz, showing the best performance metrics among prior works. This is achieved by incorporating circuit and EM techniques that enable simultaneous optimization of stable gain, noise, and bandwidth performance at this frequency range.

PhD Defense: Optimizing Many-Threads-to-Many-Cores Mapping in Parallel Electronic System Level Simulation

Name: Guantao Liu

Date: March 2, 2017

Time: 4:00 PM

Location: Engineering Hall 3206

Committee: Rainer Doemer (Chair), Kwei-Jay Lin, Mohammad Al Faruque


In hardware/software co-design, Discrete Event Simulation (DES) has been used for decades to verify and validate the functionality of Electronic System Level (ESL) models. Since parallel computing platforms are readily available today, many Parallel Discrete Event Simulation (PDES) approaches have been proposed to improve simulation performance. However, as thread parallelism increases in ESL designs and core counts multiply on multi-core and many-core platforms, thread-to-core mapping becomes critical in PDES.

In this dissertation, we propose a computation- and communication-aware approach to optimize thread mapping for parallel ESL simulation, with the aims of load balancing and communication minimization. Having identified that the order of dispatching parallel threads has a significant influence on the total simulation time, and that Longest Job First (LJF) outperforms the default Linux thread dispatch policy, we first propose a segment-aware LJF scheduler for PDES. Our segment-aware scheduler can accurately predict the run time of the thread segments ahead and thus make better dispatching decisions. Next, we define the concept of core distance for multi-core and many-core architectures, which quantifies core-to-core communication latency and characterizes processor hierarchies. For many-core architectures using directory-based cache coherence protocols, we observe that core-to-core transfers are not always significantly faster than main memory accesses, and that the core-to-core communication latency depends not only on the physical placement on the chip, but also on the location of the distributed cache tag directory. Thus, using a memory ping-pong benchmark, we quantify the core distance on a ring-network many-core platform and propose an algorithm that optimizes thread-to-core mapping to minimize on-chip communication overhead. Altogether, based on a static analysis of communication patterns and core distance and a dynamic profiling of computation load, our framework uses a heuristic graph partitioning algorithm to automatically generate an optimized thread mapping that minimizes communication overhead. In our systematic evaluation, this approach consistently shows a significant performance gain on top of the order-of-magnitude speedup of PDES.
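
The mapping step can be illustrated with a small greedy placement: threads with the heaviest traffic are placed first, each on the free core that minimizes traffic-weighted core distance to its already-placed partners. This is a simplified stand-in for the heuristic graph partitioning used in the dissertation, and the matrices below are made-up examples.

```python
# Greedy communication-aware thread-to-core mapping sketch.

def map_threads(comm, dist):
    """comm[i][j]: traffic between threads i and j;
    dist[a][b]: measured core distance between cores a and b."""
    order = sorted(range(len(comm)), key=lambda t: -sum(comm[t]))  # heaviest first
    placement, free = {}, list(range(len(dist)))
    for t in order:
        # Cost of core c: traffic to every placed partner, weighted by distance.
        cost = lambda c: sum(comm[t][u] * dist[c][placement[u]] for u in placement)
        best = min(free, key=cost)
        placement[t] = best
        free.remove(best)
    return placement

# 4 cores on a ring: distance is the shorter way around.
dist = [[min(abs(i - j), 4 - abs(i - j)) for j in range(4)] for i in range(4)]
# Threads 0 and 1 talk heavily; thread 2 barely talks.
comm = [[0, 10, 1], [10, 0, 1], [1, 1, 0]]

p = map_threads(comm, dist)
print(p, "heavy pair distance:", dist[p[0]][p[1]])
```

The greedy pass lands the heavily communicating pair on adjacent cores; in the dissertation the same objective is pursued with measured core distances and graph partitioning rather than this toy heuristic.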

The contributions of this dissertation include a segment-aware multi-core scheduler, core distance profiling, and a communication-aware thread mapping framework, together with an open-source software package for Out-of-Order PDES.

PhD Defense: Low Power Reliable Design using Pulsed Latch Circuits

Name: Wael Mahmoud Elsharkasy

Date: February 15, 2017

Time: 11:00 AM

Location: Engineering Hall 3206

Committee: Prof. Fadi J. Kurdahi, Prof. Ahmed Eltawil, Prof. Rainer Doemer


System-on-Chip (SoC) design has faced many challenges over the past decade, and with today's applications centered on the Internet of Everything (IoE), these challenges are expected to become even more critical. Among them are reducing power consumption for better energy efficiency, overcoming different sources of variation to ensure reliable operation, and shrinking design area to reduce cost and increase integration. As a result, chip designers face the problem of building reliable systems that integrate complex functionality on a minimum die size and within a limited power budget.

Among the circuit components on every chip, memory components are of particular concern: they consume the majority of the chip area and power, and they affect the entire chip's performance and reliability. These include large memory arrays, caches, register files, and the sequential elements in logic paths. Sequential elements play a critical role in modern synchronous CMOS circuits. Indeed, they can represent up to 50% of the standard cells used in a chip; the power consumption of the clock tree, including these elements, can exceed half of the total chip power; and they are second only to memory in their susceptibility to different sources of variation. Hence, efficient implementation of these elements is essential for designing energy-efficient and reliable integrated circuits.

Pulsed latches have been proposed as an efficient replacement for flip-flops in implementing sequential elements. They can achieve higher performance than traditional flip-flops and can be designed to be smaller in area and more power efficient. However, the operation of pulsed latches is more sensitive to process, voltage, and temperature (PVT) variations.
In this thesis, we propose a methodology to study the reliability of pulsed latches and use it to evaluate the effect of PVT variations on their behavior. In addition, novel approaches are presented to enhance the reliability of pulsed latches without significant degradation in performance, area, or power. Also, since sequential elements can be used to build small register files, pulsed-latch implementations of register files are discussed and compared with traditional implementations, including SRAM and flip-flops. Finally, since multiport register files benefit quite a few applications, novel implementations of multiport register files are also presented. The proposed implementation is shown to greatly reduce the significant area, power, and latency overheads associated with the traditional way of designing multiport register files.

PhD Defense: Runtime Memory Management in Many-core Systems

Name: Hossein Tajik

Date: November 15, 2016

Time: 3:00PM – 4:00PM

Location: DBH 3011 Conference Room

Committee: Nikil Dutt (Chair), Tony Givargis, Alex Nicolau


With the increasing number of cores on a chip, we are moving towards an era where many-core platforms will soon be ubiquitous. Efficient use of tens to hundreds of cores on a chip and their memory resources comes with unique challenges.

In this dissertation, we propose SPMPool: a scalable platform for sharing Software Programmable Memories (SPMs). The SPMPool approach exploits underutilized memory resources by dynamically sharing SPM resources between applications running on different cores, and adapts to the overall memory requirements of multiple applications executing concurrently on the many-core platform. We propose both central and distributed management schemes for SPMPool and study the efficiency of auction-based mechanisms in solving the memory mapping problem. We also propose offline and online memory phase detection methods to increase the adaptivity of memory management to temporal changes in the memory requirements of a single application. The runtime memory management schemes proposed in this dissertation enable better performance and power efficiency for many-core systems.
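
The pooling idea can be illustrated with a tiny allocator: applications request SPM blocks with an associated benefit, and the pool grants blocks to the most valuable requests first. This greedy sketch is a stand-in for SPMPool's mapping and auction-based schemes; all numbers and names are assumptions.

```python
# Greedy pooled-SPM allocation sketch: grant blocks by benefit density.

def allocate(requests, free_blocks):
    """requests: list of (app, blocks_wanted, benefit).
    Returns a dict app -> blocks granted."""
    grant = {}
    # Highest benefit per requested block wins first.
    for app, want, benefit in sorted(requests, key=lambda r: -r[2] / r[1]):
        got = min(want, free_blocks)   # partial grants allowed
        grant[app] = got
        free_blocks -= got
    return grant

reqs = [("fft", 4, 8.0), ("jpeg", 6, 6.0), ("log", 2, 1.0)]
print(allocate(reqs, 8))  # {'fft': 4, 'jpeg': 4, 'log': 0}
```

An auction-based variant would replace the fixed benefit values with bids submitted at runtime, which is the mechanism whose efficiency the dissertation studies.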

PhD Defense: On Optimizing the Performance of Interference-Limited Wireless Systems

Name: Rana A. Abdelaal

Date: January 30, 2017

Time: 1:00 PM

Location: Engineering Hall 3106

Committee: Professor Ahmed Eltawil


Multiple Input Multiple Output (MIMO) technology has seen prolific use to achieve higher data rates and an improved communication experience in cellular systems. However, one of the challenging problems in MIMO systems is interference, which limits system performance in terms of rate and reliability. In this thesis, we analyze methods that provide high performance over interference-limited wireless networks such as Long Term Evolution (LTE) and WiFi, tackling different sources of interference.

One source is neighboring-cell interference. We propose methods including an optimized solution that models the interference as correlated noise and uses its statistical information to jointly optimize the base station precoding and user receiver design of LTE systems. We study the benefits of exploiting interference in terms of both probability of error and signal-to-noise ratio (SNR), and compare the proposed method with conventional beamforming and maximum ratio combining (MRC). One of the key challenges to enabling high data rates in the LTE downlink is precoding and receiver design. We focus primarily on UE and base station (BS) processing, particularly on estimating and using the interference from neighboring stations. We propose a receiver design that performs well in the presence of interference, and we present a precoding scheme that the BS can use to maximize the signal-to-interference-plus-noise ratio (SINR). An interference-free scenario is used as a benchmark to evaluate the proposed system's performance. We also optimize LTE performance by tackling practical considerations that affect the system, presenting a suboptimal, practical way of estimating the interference and using this information in the processing techniques employed at both the UE and eNodeB sides.
We focus on managing both MU-MIMO interference and other-cell interference. The proposed study improves system performance even under imperfect channel knowledge, enabling the throughput gains promised by MU-MIMO.

Other types of interference exist in in-band full-duplex (IBFD) communication systems. IBFD is very promising for enhancing wireless LANs, where full-duplex access points (APs) can support simultaneous uplink (UL) and downlink (DL) flows over the same frequency channel. One of the key challenges limiting IBFD benefits is interference. In this thesis, we also propose a scheduling technique to manage interference in wireless LANs with full-duplex capability, focusing primarily on scheduling UL and DL stations (STAs) that can be efficiently served simultaneously.

It is very important to apply system knowledge to reduce power and/or improve performance. Thus, we also explore energy and power management techniques for practical wireless communication systems. An important topic for practical communication systems is handling the interference due to power amplifier nonlinearities. Managing this type of interference is of very high importance, especially in Orthogonal Frequency-Division Multiple Access (OFDMA) based communication systems. OFDMA is the modulation of choice due to its robustness to time-dispersive radio channels, low-complexity receivers, and simple combining of signals from multiple transmitters in broadcast networks. However, the transmitter design for OFDMA is more costly, as the Peak-to-Average Power Ratio (PAPR) of an OFDMA signal is relatively high, requiring highly linear RF power amplifiers (PAs). This problem is compounded when a large number of PAs is required, as in Massive MIMO for example. In this thesis, we discuss the impact of PAs on cellular systems: we show the constraints that PAs introduce and take them into consideration while searching for the optimum set of transmitter and receiver filters. Moreover, we highlight how Massive MIMO cellular networks can relax PA constraints, allowing low-cost PAs while maintaining high performance. The performance is evaluated with probability-of-error and signal-to-noise-ratio curves for different transmit powers and different numbers of transmit antennas.

Another promising topic for efficient practical communication systems is associative processors (APs). The AP is a good candidate for in-memory computation, but it has been deemed too costly and energy hungry in the past. The advent of ultra-dense resistive memories is changing this paradigm, allowing efficient in-memory associative processors; however, with high levels of integration, power density becomes the bottleneck. In this thesis, we show the potential of approximate computing in wireless communication systems: specifically, we present a Fast Fourier Transform (FFT) implemented on an associative processor. Results confirm that approximate computing for in-memory associative processors is a viable approach to reduce power consumption while maintaining good performance. A promising approach to saving energy is reducing the bit width; however, reducing the bit width introduces errors that may affect performance.
In this thesis, our goal is to adjust the bit width based on the channel SNR, aiming to achieve good performance at reduced energy consumption. A mathematical approach that analytically describes system performance under reduced-bit-width noise is presented. Based on this model, an adaptive bit-width adjustment algorithm is presented that uses received SNR estimates to find the optimal bit width achieving the performance goals at reduced energy consumption. Simulation results show that the proposed algorithms can achieve up to 45% energy savings compared to wireless communication systems with a conventional FFT.
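
The intuition behind SNR-driven bit-width adjustment can be sketched with the standard fixed-point rule of thumb SQNR ≈ 6.02·b + 1.76 dB: pick the smallest width whose quantization noise sits a safety margin below the channel noise floor. The margin and bounds below are illustrative assumptions, not the dissertation's algorithm.

```python
# SNR-driven bit-width selection sketch.

def pick_bitwidth(channel_snr_db, margin_db=10.0, b_min=4, b_max=16):
    """Smallest b whose quantization SQNR (~6.02*b + 1.76 dB) exceeds
    the channel SNR by margin_db; quantization noise then stays hidden
    under the channel noise."""
    for b in range(b_min, b_max + 1):
        if 6.02 * b + 1.76 >= channel_snr_db + margin_db:
            return b
    return b_max  # saturate at full precision

for snr in (0, 10, 20, 30):
    print(snr, "dB ->", pick_bitwidth(snr), "bits")
```

At low channel SNR the channel noise dominates anyway, so the FFT can run at a narrow width and save energy; as the channel improves, the width widens so quantization does not become the limiting noise source.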


PhD Defense: Scalable Runtime support for Edge-To-Cloud integration of Distributed Sensing Systems

Name: Brett Chien

Date: November 29, 2016

Time: 10:00AM

Location: EH 2210

Committee: Pai H. Chou (Chair)


While the Internet of Things (IoT) has drawn increasing attention from researchers and the public, building a complete system from the edge sensing units to the cloud services requires a massive amount of effort. Researchers whose primary interest is the collected information are often lost among the various technologies involved, including distributed sensing embedded systems, bridge devices between the Internet and the local network, and backend data services.

This work takes a cross-system, script-based, and semantics-enhanced approach to address the lack of suitable runtime support. We propose a threaded-code runtime support for edge sensing systems, a script-based wrapper for physical-to-cyber bridges, and scalable middleware for the backend services.

With the proposed runtime support, distributed sensing systems can be applied to real-world applications quickly, and insights can be explored from the collected information. As a result, a building structure monitoring system has been installed that allows civil engineering researchers to develop algorithms to prevent disaster events. Body area sensing systems for ECG monitoring, CO2 detection, and body movement have been developed, enabling infant screening and the detection of potential heart problems. The results show that, with the proposed runtime support, applications can be realized quickly and scale well.
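
The threaded-code style of runtime mentioned above can be sketched in a few lines: a program is a flat list of handler addresses (here, Python callables) that a tiny dispatcher walks, so the sensing logic can be re-scripted without reflashing native code. The handlers and stack discipline are illustrative assumptions, not the actual edge runtime.

```python
# Minimal threaded-code interpreter sketch.

def run(program):
    stack, pc = [], 0
    while pc < len(program):
        op = program[pc]
        pc = op(stack, pc)  # each handler returns the next program counter
    return stack

def push(val):
    def op(stack, pc):
        stack.append(val)
        return pc + 1
    return op

def add(stack, pc):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)
    return pc + 1

def dup(stack, pc):
    stack.append(stack[-1])
    return pc + 1

# (sensor_reading + offset) duplicated for two consumers:
print(run([push(21), push(4), add, dup]))  # [25, 25]
```

On a microcontroller the same structure is an array of handler addresses walked by an inner loop, which is what makes the edge-side logic compact and easy to update over the network.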

PhD Defense: Resource Aggregation for Collaborative Projected Video from Multiple Mobile Devices

Name: Hung Nguyen

Date: November 17, 2016

Time: 1:30 P.M.

Location: EH 3206

Committee: Fadi Kurdahi (Chair), Aditi Majumder (Co-Chair)


We explore and develop an embedded real-time system and associated algorithms that enable an aggregation of resource-limited, low-quality, projection-enabled mobile devices to collaboratively produce a higher-quality video stream for a superior viewing experience. Such resource aggregation across multiple projector-enabled devices can lead to per-unit resource savings while moving the cost to the aggregate.

The pico-projectors embedded in mobile devices such as cell phones have much lower resolution and brightness than standard projectors. Tiling (arranging the projection areas of multiple projectors in a rectangular array, overlapping slightly around the boundary) and superimposing (placing the projection areas of multiple projectors directly on top of each other) such projectors, registered automatically through the cameras residing in those mobile devices, result in different ways of aggregating resources across these devices. Evaluation of our proof-of-concept system shows significant improvement for each mobile device in the two primary factors of bandwidth usage and power consumption when using a collaborative federation of projection-embedded mobile devices.

Portable, low-power, lightweight, small pico-projectors are key components of future projection-enabled mobile devices. Due to their reduced weight and dimensions and their portable nature, the calibrated integrated systems are prone to physical destabilization of the projected image during a presentation. Thus, automatic re-calibration and projected-video stabilization during the presentation become essential requirements for enhancing the user experience. The design, algorithms, and implementation methods for these features are presented in the second part of the dissertation.

PhD Defense: Specification and Runtime Verification of Distributed Multiprocessor Systems: Languages, Tools and Architectures

Name: Ahmed Nassar

Date: September 2, 2016

Time: 12:00 P.M.

Location: EH 3403

Committee: Fadi Kurdahi (Chair), Rainer Doemer, Ahmed Eltawil

Abstract: Post-deployment runtime verification (RV) has recently emerged as a complementary technology to extend the coverage of conventional software verification and testing methods. This thesis attempts to tackle three major barriers that must be surmounted before RV technologies come into widespread use:
Barrier-1: Lack of an expressive, yet efficiently monitorable, specification language. Distributed software behavior is projected onto an observation interface consisting of data-carrying (or parameterized) events, such as Linux system calls with their argument values; self-replicating deterministic finite automata (SR-DFAs) over these parametric traces are introduced for RV as well as for anomaly-based intrusion detection in embedded and general-purpose software systems.
Barrier-2: The substantial performance and power overhead of pure-software RV frameworks. NUVA (nonuniform verification architecture) is a distributed automata-based RV architecture for software specifications expressed as SR-DFAs. NUVA has been implemented over a cache-coherent nonuniform-memory-access (ccNUMA) multiprocessor and can be deployed on the FPGA fabric expected to reside on next-generation processor chips. The core of NUVA is a coherent distributed automata transactional memory (ATM) that efficiently maintains the states of a dynamic population of automata checkers, organized into a rooted dynamic directed acyclic graph (DAG) shared concurrently among all processor nodes.
Barrier-3: Formal specifications are hard to formulate and maintain for evolving, complex embedded and general-purpose software systems. Specification mining has therefore long been envisioned to play a key role in software verification, modification, and documentation. However, to scale beyond simple library/API-level properties with short temporal spans, specification mining tools need more expressive specification languages that can capture complex, application-level properties. This thesis introduces a bio-inspired, complete specification mining methodology for SR-DFAs using an iterative and interactive mining tool called ParaMiner. ParaMiner relies on novel mining algorithms that invoke multiple-sequence alignment (MSA) techniques to learn specifications from temporal slices of software behavior while overcoming the initial-state uncertainty problem.
SR-DFAs and ParaMiner have been leveraged in a new specification-based intrusion detection (ID) framework that protects distributed, reactive computing systems against cyberattacks with very sparse signatures, arbitrarily long time spans, and wide attack fronts. Such attacks lie outside the scope of conventional anomaly-based ID methods, which typically work with short event windows and ignore manipulated data objects such as files and sockets. We demonstrate the effectiveness of the constructed SR-DFAs at classifying and resolving subtle behaviors typical of cyberattacks across varying evasion parameter values.
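The core idea behind monitoring parametric traces with replicating automata can be illustrated with a toy example. The sketch below is a heavily simplified stand-in for the SR-DFA formalism: one DFA copy is spawned per parameter binding (here, per file descriptor), and the hypothetical property being checked is "no read after close". The event names and transition table are invented for illustration.

```python
from collections import defaultdict

# Toy transition table for the illustrative property
# "a file descriptor must not be read after it is closed".
TRANSITIONS = {
    ("start", "open"): "opened",
    ("opened", "read"): "opened",
    ("opened", "close"): "closed",
    ("closed", "read"): "error",   # use-after-close violation
}

def monitor(trace):
    """trace: iterable of (event, fd) pairs.

    A fresh DFA instance is created ("replicated") the first time a
    parameter value (fd) is seen; each instance tracks its own state.
    """
    state = defaultdict(lambda: "start")
    violations = []
    for event, fd in trace:
        state[fd] = TRANSITIONS.get((state[fd], event), state[fd])
        if state[fd] == "error":
            violations.append(fd)
    return violations

trace = [("open", 3), ("open", 4), ("read", 3),
         ("close", 3), ("read", 3), ("read", 4)]
print(monitor(trace))  # [3]
```

Real SR-DFAs are far richer (and NUVA maintains such instances in hardware via the automata transactional memory), but the per-binding replication shown here is the essential mechanism that lets one specification cover an unbounded population of runtime objects.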

PhD Defense:

Name:  Aras Pirbadian

Date: August 29, 2016

Time: 3:30pm

Location:  EH 3403

Committee: Ahmed Eltawil (Chair), Fadi Kurdahi, Rainer Doemer

With the continued scaling of chip manufacturing technologies, process variation has an increasing effect on system performance. Specifically, process variation increases the voltage and frequency overhead margins required to ensure error-free circuit operation. However, the traditional practice of overdesigning systems to cover process variation is no longer an efficient design methodology in an age of high processing demands and limited energy supplies. In this dissertation, a novel analytical model is proposed to accurately predict the required margin in the early stages of design-space exploration. The model can be used to optimize system overhead in error-free computation, or to relax the requirement of full correctness in error-tolerant parts of a system and optimize the energy-versus-performance trade-off. Unlike other existing efforts, the model also considers the statistics of the circuit's inputs, enabling it to closely match full circuit-simulation results in a short time. Finally, the model is applied to an adaptive carry select/ripple carry adder configuration to demonstrate the potential achievable power savings.
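To see why input statistics matter for timing margins, consider a ripple-carry adder: its critical path scales with the longest carry-propagate run of the actual operands, which for random inputs is usually far shorter than the worst case. The Monte Carlo sketch below is an illustration of this effect, not the dissertation's analytical model; the Gaussian per-stage delay and all parameter values are assumptions.

```python
import random

def longest_propagate_run(a, b, bits=16):
    """Length of the longest run of bit positions where a_i XOR b_i = 1,
    i.e. the longest carry-propagate chain for operands a and b."""
    longest = run = 0
    for i in range(bits):
        if ((a >> i) & 1) ^ ((b >> i) & 1):
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest

def margin_estimate(bits=16, trials=5000, sigma=0.05, quantile=0.999):
    """Delay (in nominal gate-delay units) covering `quantile` of random
    input pairs, with per-stage delay drawn from a Gaussian to mimic
    process variation. Illustrative model only."""
    rng = random.Random(0)
    delays = []
    for _ in range(trials):
        a, b = rng.getrandbits(bits), rng.getrandbits(bits)
        stages = longest_propagate_run(a, b, bits)
        delays.append(sum(rng.gauss(1.0, sigma) for _ in range(stages)))
    delays.sort()
    return delays[int(quantile * trials) - 1]

# Worst-case design must budget for all 16 stages; the 99.9th-percentile
# delay over random inputs is noticeably smaller, and that slack is what
# an adaptive carry select/ripple carry scheme can exploit.
print(margin_estimate())
```

The gap between the worst-case stage count and the high-quantile delay over realistic inputs is the margin that an input-statistics-aware model can reclaim.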

Growing variation in newer technology nodes is not only a negative side effect. The increased inherent randomness of the manufacturing process can be exploited to build physically unclonable functions (PUFs): irreproducible hardware-based authentication primitives that require no memory-based key storage. The second part of this work proposes a low-overhead delay-based PUF that exploits silicon manufacturing variation. The proposed PUF uses a simple and efficient structure to convert the randomness of the manufacturing process into random responses to fixed challenges across identically designed circuits.
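The challenge-response behavior of a delay-based PUF can be simulated in a few lines. The model below follows the generic arbiter-PUF idea (a challenge bit selects which of two fixed, chip-specific delay offsets each stage contributes, and the response is the sign of the accumulated difference); it is a simplified illustration, not the specific circuit proposed in this dissertation, and all class/parameter names are invented.

```python
import random

class ArbiterPUF:
    """Simplified software model of a delay-based PUF.

    The per-stage delay offsets are fixed at construction time, modeling
    the random-but-permanent manufacturing variation of one chip. Two
    instances with different seeds represent two identically designed
    chips that nevertheless answer challenges differently.
    """

    def __init__(self, n_stages=64, seed=None):
        rng = random.Random(seed)  # seed stands in for per-chip randomness
        self.deltas = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
                       for _ in range(n_stages)]

    def response(self, challenge):
        """challenge: list of n_stages bits -> response bit 0 or 1."""
        diff = 0.0
        for bit, (d0, d1) in zip(challenge, self.deltas):
            diff += d1 if bit else d0   # challenge bit picks the path delay
        return int(diff > 0)

chip_a, chip_b = ArbiterPUF(seed=1), ArbiterPUF(seed=2)
c = [1, 0] * 32
# Same design, same challenge; each chip's response is determined only
# by its own manufacturing variation.
print(chip_a.response(c), chip_b.response(c))
```

A given chip always returns the same response to the same challenge (stability), while different chips diverge over a set of challenges (uniqueness), which is what makes such a structure usable for authentication without stored keys.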

PhD Defense: Progression and Edge Intelligence Framework for IoT Systems

Name:  Zhenqiu Huang

Date: July 29, 2016

Time: 10:00am

Location:  EH 4106

Committee: Kwei-Jay Lin (Chair), Fadi Kurdahi, Mohammad Al Faruque

Abstract: This thesis studies the issues in building and managing future Internet of Things (IoT) systems. IoT systems consist of distributed components with services for sensing, processing, and controlling through devices deployed in our living environment as part of the global cyber-physical ecosystem.

Systems with perpetually running IoT devices may use a lot of energy, so one challenge is to implement good management policies for energy saving. In addition, devices may be deployed at large scale across wide geographical areas through low-bandwidth wireless communication networks. This brings the challenge of con