Seminars by CECS

Modeling and Co-Design of Multi-domain Cyber-Physical Systems

Name: Jiang Wan

Date: Friday, June 1

Time: 3:00 — 5:00 PM

Location: EH 3206

Committee: Fadi Kurdahi (Chair), Mohammad Al Faruque, Rainer Doemer


Cyber-Physical Systems (CPS) integrate computation and physical processes connected through networks. The high complexity of cross-domain engineering, combined with the pressure for system innovation, higher quality, time-to-market, and budget constraints, makes it imperative for engineers to use integrated engineering methods and tools for CPS design. However, existing computer-based engineering tools mainly focus on a single domain, so system-level analysis of CPS is challenging because knowledge from different domains is difficult to integrate. This thesis studies the modeling of cross-domain CPS. Problems in modeling both functional and non-functional requirements during CPS design are explored, and a functional model-based approach is proposed for high-level CPS modeling. Moreover, targeting security, which is one of the key non-functional requirements in CPS, the thesis proposes physics-based models and solutions.


PhD Defense: Reliable and Energy Efficient Battery-Powered Cyber-Physical Systems

Name: Korosh Vatanparvar

Date: Wednesday, May 30th

Time: 3:00 — 5:00 PM

Location: Calit2 3008

Committee: Prof. Mohammad Al Faruque (Chair)


Cyber-Physical Systems (CPS) were presented as a solution to multidisciplinary integration and control in embedded systems. They provide seamless interactions between cyber and physical domains, enabling more intelligent and complicated control applications. However, CPS face the challenges of reliability and energy efficiency since they mainly rely on batteries for power supply.

We investigate these issues in Electric Vehicles (EV), which are common battery-powered CPS. EV were introduced as a means of transportation that addresses environmental problems such as air and noise pollution. However, their stringent design constraints, especially on battery packs, create challenges of limited driving range and battery lifetime for daily drivers and manufacturers. The design automation community has been addressing these challenges by developing more efficient and dependable devices and control methodologies. Our contributions in this thesis include:

1) novel machine learning and physics-based modeling techniques to capture CPS dynamics more accurately; 2) unique optimization problem formulations to make optimal control decisions; and 3) intelligent control methodologies that leverage the modeling and interaction within CPS to achieve reliable and efficient operation. These contributions are applied to EV systems such as the navigation system, climate control, and battery management system. Our objectives are to further extend the EV driving range and prolong the battery lifetime while maintaining a similar driving experience and comfort for passengers.


Reflective On-Chip Resource Management Policies for Energy-Efficient Heterogeneous Multiprocessors

Name: Tiago Mück

Date: May 16, 2018

Time: 2:00pm

Location: Donald Bren Hall 2011

Committee: Nikil Dutt (Chair), Alex Nicolau, Tony Givargis


Effective exploitation of power-performance tradeoffs in heterogeneous many-core platforms (HMPs) requires intelligent on-chip resource management at different layers, in particular at the operating system level. Operating systems need to continuously analyze application behavior and answer questions such as: What is the most power-efficient core type to execute the application without violating its performance requirements? Or, which option is more power-efficient for the current application: an out-of-order core at a lower frequency or an in-order core at a higher frequency?
Unfortunately, existing operating systems (e.g. Linux) do not offer mechanisms to properly address these questions and therefore are unable to fully exploit architectural heterogeneity for scalable energy-efficient execution of dynamic workloads.
This dissertation proposes a holistic approach for performing resource allocation decisions and power management by leveraging concepts from reflective software.
The general idea of reflection is to change your actions based on both external feedback and introspection (i.e., self-assessment).
From a practical computer system perspective, reflection means performing resource management actions considering both sensing information (e.g., readings from performance counters, power sensors, etc.) to assess the current system state, as well as models to predict the behavior of the system before performing an action.
In this context, this dissertation describes MARS, a Middleware for Adaptive Reflective computer Systems. MARS consists of a framework and a set of models for creating reflective resource managers. MARS is implemented and evaluated on top of a real Linux-based platform. Furthermore, MARS provides an offline simulation infrastructure for fast prototyping of policies and for large-scale or long-term policy evaluation.
Experimental evaluation shows that MARS’s models allow different policies for task mapping and dynamic voltage scaling to be seamlessly integrated, resulting in up to 1.8x energy efficiency improvements without performance degradation when compared to vanilla Linux.

PhD Defense: SIMD Assisted Fault Detection and Fault Attack Mitigation

Name: Zhi Chen

Date: May 15, 2018

Time: 2:30pm

Location: Donald Bren Hall 3011

Committee: Professor Alex Nicolau (Chair), Alex Veidenbaum, Nikil Dutt


Modern processors continue to aggressively scale down feature sizes and reduce voltage levels to run faster and be more energy efficient. However, this trend also poses a significant reliability concern, as it makes transistors more susceptible to soft errors. Soft errors are transient. Although they don’t impair computing systems permanently, these errors can corrupt the output of a program or even crash the entire system. Hardware or software redundancy techniques can be used to detect errors during the execution of a program. However, hardware redundancy, e.g., DMR (dual-modular redundancy) and TMR (triple-modular redundancy), leads to significant area overhead and very high energy cost. Software redundancy, e.g., instruction duplication, has a lower performance and energy penalty and virtually no hardware cost, at the price of a small loss in error coverage. Commodity processors, however, generally don’t require “five-nines” reliability, as they are not mission-critical; performance and energy consumption take priority instead. This dissertation proposes a novel approach to instruction duplication that exploits the redundancy within SIMD instructions. The key idea is to pack the original data and its duplicate into different lanes of the same vector register, instead of executing two scalar instructions separately, since these registers are underutilized in most applications. The proposed solution is implemented in the LLVM compiler as a stand-alone pass. Evaluation on a host of benchmarks reveals that the proposed SIMD-based error detection technique incurs much lower performance, code size, and energy overheads.
This dissertation further extends the proposed approach as a countermeasure to protect cryptographic algorithms. These algorithms are widely adopted in modern processors and embedded systems to protect information. A number of popular cryptographic algorithms in the Libgcrypt library are protected using the SIMD-based instruction duplication technique. A large number of faults are injected into these algorithms. The results show that almost all injected faults can be detected with reasonable performance and code size cost.
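
The lane-packing idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two-lane `Vec2` register, the helper names (`pack`, `vadd`, `vmul`, `flip_bit`, `check`), and the fault model are hypothetical and chosen for illustration; they are not the LLVM pass described in the dissertation.

```python
from typing import Tuple

# Hypothetical 2-lane vector register: (original value, duplicate value).
Vec2 = Tuple[int, int]


def pack(x: int) -> Vec2:
    """Place a value and its duplicate in the two lanes of one register."""
    return (x, x)


def vadd(a: Vec2, b: Vec2) -> Vec2:
    """Lane-wise add: a single 'vector instruction' computes both copies."""
    return (a[0] + b[0], a[1] + b[1])


def vmul(a: Vec2, b: Vec2) -> Vec2:
    """Lane-wise multiply."""
    return (a[0] * b[0], a[1] * b[1])


def flip_bit(v: Vec2, lane: int, bit: int) -> Vec2:
    """Assumed fault model: a transient single-bit flip in one lane."""
    lanes = list(v)
    lanes[lane] ^= 1 << bit
    return (lanes[0], lanes[1])


def check(v: Vec2) -> int:
    """Compare the lanes; a mismatch means an error was detected."""
    if v[0] != v[1]:
        raise RuntimeError("soft error detected: lanes disagree")
    return v[0]


def compute(x: int, y: int, fault: bool = False) -> int:
    """Evaluate (x + y) * y once on packed operands, then check the lanes."""
    s = vadd(pack(x), pack(y))
    if fault:
        s = flip_bit(s, lane=1, bit=3)  # corrupt only the duplicate lane
    return check(vmul(s, pack(y)))
```

In this toy model, a fault-free run returns the ordinary result, while a bit flip in one lane makes the final comparison fail. Faults that are masked by later arithmetic (for example, multiplying by zero) would go unnoticed, which matches the general observation that software duplication trades some error coverage for cost.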


“Efficient Acceleration of Computation Using Associative In-memory Processing”

Speaker: Hasan Erdem Yantir, University of California, Irvine

Date and Time: Monday, May 14, 2018 at 9:00AM

Location: Engineering Hall 3206

Committee: Professor Fadi Kurdahi (Chair), Ahmed Eltawil, Rainer Doemer


The complexity of the computational problems is rising faster than the computational platforms’ capabilities. This forces researchers to find alternative paradigms and methods for efficient computing. One promising paradigm is accelerating compute-intensive kernels using in-memory computing accelerators since memory is the major bottleneck that limits the amount of parallelism and performance of a system and dominates energy consumption in computation. Leveraging the memory intensive nature of big data applications, an in-memory-based computation system can be presented where logic can be replaced by memory structures, virtually eliminating the need for memory load/store operations during computation. The massive parallelism enabled by such a paradigm results in highly scalable structures.

This thesis is set against this background. Its objective is to conduct broad research on in-memory computing. For this purpose, associative computing architectures (i.e., Associative Processors, or APs) are built with both traditional (SRAM) and emerging (ReRAM) memory technologies, together with their corresponding software frameworks. For ReRAM-based APs, the reliability concerns that come with emerging memories are resolved. Architectural innovations are developed to increase energy efficiency. Furthermore, an approximate computing approach is introduced for APs to perform efficient, low-power approximate in-memory computing for tasks that can tolerate some loss of accuracy. The work also proposes a novel two-dimensional in-memory computing architecture to overcome the deficiencies of traditional one-dimensional AP architectures.
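
As a rough illustration of how an associative processor replaces logic with memory operations, the sketch below simulates bit-serial, in-place addition using only compare and write passes over all rows. It is a generic sketch, not the architecture built in the thesis: the pass table `PASSES`, the function name `ap_add`, and the list-based row storage are assumptions made for illustration.

```python
from typing import List

# Compare/write passes for in-place addition B <- A + B, one bit position at a time.
# Each entry maps a (carry, b, a) pattern to the (new carry, new b) written back.
# Only the four full-adder input patterns that change state need a pass.
# The order matters: a row rewritten by one pass must not match a later pass.
PASSES = [
    ((0, 1, 1), (1, 0)),
    ((0, 0, 1), (0, 1)),
    ((1, 0, 0), (0, 1)),
    ((1, 1, 0), (1, 0)),
]


def ap_add(a_words: List[int], b_words: List[int], width: int) -> List[int]:
    """Add a_words to b_words row by row, modulo 2**width, via compare/write passes."""
    rows = len(a_words)
    a = [[(w >> i) & 1 for i in range(width)] for w in a_words]
    b = [[(w >> i) & 1 for i in range(width)] for w in b_words]
    carry = [0] * rows
    for i in range(width):  # bit-serial: least significant bit first
        for (c, bb, aa), (new_c, new_b) in PASSES:
            # Compare phase: tag every matching row (done in parallel in hardware).
            tags = [
                carry[r] == c and b[r][i] == bb and a[r][i] == aa
                for r in range(rows)
            ]
            # Write phase: update only the tagged rows.
            for r in range(rows):
                if tags[r]:
                    carry[r] = new_c
                    b[r][i] = new_b
    return [sum(bit << i for i, bit in enumerate(row)) for row in b]
```

The number of passes depends on the word width rather than on the number of rows, which is where the parallelism of this style of computing comes from in this simplified model.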

“A Compiler Infrastructure for Static and Hybrid Analysis of Discrete Event System Models”

Speaker: Tim Schmidt, University of California, Irvine

Date and Time: Friday, April 20, 2018 at 10:00AM-11:00AM

Location: Engineering Hall 3206

Committee: Professor Rainer Doemer (Chair), Fadi Kurdahi, Kwei-Jay Lin


The design of embedded systems has been a well-established research domain for many decades. However, the constantly increasing complexity and requirements of state-of-the-art embedded systems push designers toward new challenges while they maintain established design methodologies. Embedded system design uses the concept of Discrete Event Simulation (DES) to prototype and test the interaction of individual components.

In this dissertation, we provide the Recoding Infrastructure for SystemC (RISC) compiler framework to perform static and hybrid analysis of IEEE SystemC models. On the one hand, RISC generates thread communication charts to visualize the communication between individual design components. The visualization respects the underlying discrete event simulation semantics and illustrates the individual synchronization steps. On the other hand, RISC translates a sequential model into a parallel model that effectively utilizes multi- and many-core host simulation platforms for faster simulation. This work extends the conflict analysis capabilities for libraries, dynamic memory allocation, channel instance awareness, and references. Additionally, the traditional thread-level parallelism is extended with data-level parallelism for even faster simulation.

“Software Assists to On-chip Memory Hierarchy of Manycore Embedded Systems”

Speaker: Majid Namaki Shoushtari, University of California, Irvine

Date and Time: Tuesday, November 7, 2017 at 11:00AM-12:00PM

Location: Donald Bren Hall 2011

The growing computing demands of emerging application domains such as Recognition/Mining/Synthesis (RMS), visual computing, wearable devices, and the Internet of Things (IoT) have driven the move toward manycore architectures to better manage tradeoffs among performance, energy efficiency, and reliability.
The memory hierarchy of manycore architectures has a major impact on their overall performance, energy efficiency, and reliability. We identify three major problems that make traditional memory hierarchies unattractive for manycore architectures and their data-intensive workloads: (1) they are power hungry and not a good fit for manycores in the face of dark silicon, (2) they are not adaptable to workloads’ varying requirements and memory behavior, and (3) they are not scalable due to coherence overheads.

This thesis argues that many of these inefficiencies are the result of software-agnostic, hardware-managed memory hierarchies. Application semantics and behavior captured in software can be exploited to manage the memory hierarchy more efficiently. This thesis exploits some of this information and proposes a number of techniques to mitigate the aforementioned inefficiencies in two broad contexts: (1) explicit management of a hybrid cache-SPM memory hierarchy, and (2) exploiting approximate computing to improve the energy efficiency of the memory hierarchy.
We first present the required hardware and software support for a software-assisted memory hierarchy that is composed of distributed memories which can be partitioned between caches and SPMs at runtime. We discuss our SPM APIs, the protocol needed for data movements, and our approach for explicit management of shared data.
Next, we augment caches and SPMs in this hierarchy with approximation support in order to improve the energy efficiency of the memory subsystem when running approximate programs.
Finally, we discuss a quality-configurable memory approximation strategy using formal control theory that adjusts the level of approximation at runtime depending on the desired quality for the program’s output.


Quoc-Viet Dang PhD Defense

Name: Quoc-Viet Dang

Date: May 30, 2017

Time: 2:00 PM

Location: Engineering Hall 3206

Committee: Professor Daniel Gajski (Chair), Fadi Kurdahi, Rainer Doemer


Public education is on the brink of a potential crisis as it attempts to significantly increase student enrollment while maintaining the quality of education. Online courses have been proposed and debated among members of the UC Regents, numerous college administrators, faculty, and students. On the one hand, online education can reduce overhead while enrolling more students. On the other hand, directly translating classroom lectures and materials to an online environment does not necessarily produce student performance and course satisfaction equivalent to those of an in-class environment. Since there is no universal standard for online education, results have been erratic and inconsistent in terms of student performance and costs to students as well as administration. Both educators and students need a hybrid, scalable teaching and learning methodology to take full advantage of today’s technology and apply it toward improving student performance and participation.

This dissertation presents a methodology and system that provide a more individualized and responsive learning environment for students in large hybrid and online university courses, keeping overall costs and time commitment down while improving overall student performance. The Universal Personal Advisor, the learning design tool implemented in this research, is developed based on multi-disciplinary metrics and studies from the fields of Psychology, Education, and Engineering. A primary limiting resource for both students and instructors is time. By automating some basic key interactions that may occur between students and instructors, hours of each individual’s time can be saved, maximizing the quality of the in-person interactions available during a course while allowing for a more scalable classroom environment.

PhD Defense: Robust Data Hiding in Multimedia for Authentication and Ownership Protection

Name: Farhan A. Alenizi

Date: May 26, 2017

Time: 10:00 AM

Location: Engineering Hall 3206

Committee: Professor Fadi Kurdahi (Chair), Professor Ahmed Eltawil and Professor Rainer Doemer


Establishing robust and blind data hiding techniques in multimedia is very important for authentication, ownership protection, and security. The multimedia in question may include images, videos, and 3D mesh objects. A hybrid pyramid Discrete-Wavelet-Transform (DWT) and Singular-Value-Decomposition (SVD) data hiding scheme for video authentication and ownership protection is proposed. The hidden data takes the form of a main color logo image watermark and a secondary Black and White (B&W) logo image. The color watermark is decomposed into bit-slices. A pyramid transform is performed on the Y-frames of a video stream, resulting in error images. A DWT is then applied to these error images using orthonormal filter banks, and the bit-slice watermarks are inserted in one or more of the resulting subbands in a way that is fully controlled by the owner. Finally, the watermarked video is reconstructed. SVD is performed on the color watermark bit-slices, and the secondary B&W watermark is inserted in the main color watermark using another SVD process. Reconstruction is perfect without attacks, while the average Bit-Error-Rates (BERs) achieved under attacks are within 2% for the color watermark and 5% for the secondary watermark; meanwhile, the mean Peak Signal-to-Noise Ratio (PSNR) is 57 dB. Furthermore, a selective denoising filter to eliminate noise in video frames is proposed, and its performance with data hiding is evaluated.

Moreover, a blind, optimized 3D mesh watermarking technique is proposed in this research. The technique relies on displacing vertex locations according to modifications of the variances of the vertices’ norms. Statistical analyses were performed to establish the distributions that best fit each mesh and hence the bin sizes. Experimental results showed that the approach is robust in terms of both perceptual and quantitative quality.

In conclusion, the degree of robustness and security of the proposed techniques is shown. Schemes that can be adopted to further enhance performance, as well as future work in the field, are also introduced.
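
For readers unfamiliar with the two quality metrics quoted in this abstract, the sketch below computes them using their standard textbook definitions. The function names and the 8-bit peak value of 255 are assumptions for illustration; the abstract does not specify the exact evaluation procedure used in the thesis.

```python
import math
from typing import Sequence


def bit_error_rate(sent: Sequence[int], received: Sequence[int]) -> float:
    """Fraction of watermark bits that differ after extraction."""
    if len(sent) != len(received) or len(sent) == 0:
        raise ValueError("bit sequences must be non-empty and of equal length")
    errors = sum(1 for s, r in zip(sent, received) if s != r)
    return errors / len(sent)


def psnr_db(original: Sequence[int], distorted: Sequence[int], peak: int = 255) -> float:
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(peak^2 / MSE)."""
    if len(original) != len(distorted) or len(original) == 0:
        raise ValueError("pixel sequences must be non-empty and of equal length")
    mse = sum((o - d) ** 2 for o, d in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical signals: no distortion at all
    return 10.0 * math.log10(peak ** 2 / mse)
```

A lower BER means the extracted watermark is closer to the embedded one, and a higher PSNR means the watermarked frame is closer to the original frame.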


PhD Defense: Millimeter-wave and Sub-THz Signal Generation and Detection in Silicon Technologies

Name: Peyman Nazari

Date: May 15, 2017

Time: 10:00 AM

Committee: Prof. Payam Heydari (Chair), Prof. Michael Green, Prof. Ahmed Eltawil


MM-wave/sub-Terahertz (THz) signal generation, radiation, and detection have become increasingly attractive due to their fast-growing applications in spectroscopy, radar, biomedical and security imaging, and high-speed wireless communication.
Silicon technology offers high-density signal processing capabilities thanks to the aggressive scaling of its feature size, and it allows integration of mm-wave/THz antenna elements owing to their small footprint at these bands. It is therefore well suited for implementing fully integrated multi-antenna mm-wave/THz wireless Systems-on-Chip (SoCs).
The performance of such systems is largely governed by the quality and efficiency of signal generation, transmission, and detection. The passive and active components that realize these functions must be optimized for operation at these frequency ranges. However, the excessive loss of on-chip passive components and the limited gain and output power of transistors at such high frequencies demand novel passive and active structures. Furthermore, a high level of integration implies that co-design of front-end components leads to better end-to-end performance; thus, a holistic design methodology must be employed. The radiation characteristics of the wireless signal must also be engineered to improve its transmission quality. For example, circularly polarized radiation is a viable choice for many imaging and communication applications because it exhibits excellent robustness against depolarization effects.
In this dissertation, silicon realization of on-chip waveguides as low-loss media for high-frequency wave propagation is explored, and implementations of low-loss cavity-backed passives are discussed. Furthermore, a silicon-integrated IMPATT diode, together with its fabrication and modeling, is introduced as a solution for obtaining active behavior beyond the fmax of transistors. Next, a high-power, high-efficiency mm-wave circularly polarized cavity-backed radiator, employing a multi-port multi-function passive network as a resonator, power combiner, and antenna, is introduced. Necessary conditions for robust operation of such multi-port oscillators/radiators are also derived. Fabricated in a 0.13µm SiGe BiCMOS process, the prototype chip achieves 14.2dBm EIRP, -99.3dBc/Hz phase noise at 1MHz offset, and 5.2% DC-to-EIRP conversion efficiency, which is the highest reported value among silicon-based radiators not using a silicon lens or substrate processing.
Finally, a 210GHz low-noise amplifier (LNA) is presented to address the detection challenges. This LNA achieves 18dB of gain with a noise figure of less than 12dB and a 3dB bandwidth of more than 15GHz, thereby showing the best performance metrics among prior work. This is achieved by incorporating circuit and EM techniques that enable simultaneous optimization of stable gain, noise, and bandwidth performance at this frequency range.

PhD Defense: Optimizing Many-Threads-to-Many-Cores Mapping in Parallel Electronic System Level Simulation

Name: Guantao Liu

Date: March 2, 2017

Time: 4:00 PM

Location: Engineering Hall 3206

Committee: Rainer Doemer (Chair), Kwei-Jay Lin, Mohammad Al Faruque


In hardware/software co-design, Discrete Event Simulation (DES) has been in use for decades to verify and validate the functionality of Electronic System Level (ESL) models. Since parallel computing platforms are readily available today, many Parallel Discrete Event Simulation (PDES) approaches have been proposed to improve simulation performance. However, as thread parallelism increases in ESL designs and core counts multiply on multi-core and many-core platforms, thread-to-core mapping becomes critical in PDES.

In this dissertation, we propose a computation- and communication-aware approach to optimize thread mapping for parallel ESL simulation, with the aims of load balancing and communication minimization. As we identify that the order of dispatching parallel threads has a significant influence on the total simulation time, and Longest Job First (LJF) shows better performance than the Linux default thread dispatch policy, we first propose a segment-aware LJF scheduler for PDES. Our segment-aware scheduler can accurately predict the run time of the thread segments ahead, and thus make better dispatching decisions. Next, we define the concept of core distance for multi-core and many-core architectures, which quantifies core-to-core communication latency and characterizes processor hierarchies. For many-core architectures using directory-based cache coherence protocols, we observe that core-to-core transfers are not always significantly faster than main memory accesses, and the core-to-core communication latency depends not only on the physical placement on the chip, but also on the location of the distributed cache tag directory. Thus, using a memory ping-pong benchmark, we quantify the core distance on a ring-network many-core platform and propose an algorithm to optimize thread-to-core mapping in order to minimize on-chip communication overhead. Altogether, based on a static analysis of communication patterns and core distance and a dynamic profiling of computation load, our proposed framework utilizes a heuristic graph partitioning algorithm and automatically generates an optimized thread mapping, which minimizes inter-chip communication overhead. In our systematic evaluation, our approach consistently shows a significant performance gain on top of the order-of-magnitude speedup of PDES.

The contributions of this dissertation include a segment-aware multi-core scheduler, core distance profiling, a communication-aware thread mapping framework, together with an open-source software package for Out-of-Order PDES.
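
To show why dispatch order matters, the sketch below compares an arbitrary dispatch order with Longest Job First on a simple greedy model in which each job goes to the earliest-free core. It is a minimal sketch under stated assumptions: the run-time estimates are taken as given (standing in for the segment-aware prediction), and the function names `makespan` and `longest_job_first` are hypothetical, not part of the dissertation's software package.

```python
import heapq
from typing import List, Sequence


def makespan(run_times: Sequence[float], cores: int) -> float:
    """Dispatch jobs in the given order, each to the earliest-free core.

    Returns the time at which the last core finishes.
    """
    if cores < 1:
        raise ValueError("at least one core is required")
    free_at = [0.0] * cores  # min-heap of the times at which cores become free
    heapq.heapify(free_at)
    for t in run_times:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + t)
    return max(free_at)


def longest_job_first(run_times: Sequence[float]) -> List[float]:
    """Reorder jobs so that the longest predicted run time is dispatched first."""
    return sorted(run_times, reverse=True)


# Example: four short jobs followed by one long job, on two cores.
predicted = [1.0, 1.0, 1.0, 1.0, 4.0]
in_order = makespan(predicted, cores=2)                     # long job starts late
ljf = makespan(longest_job_first(predicted), cores=2)       # long job starts first
```

In this example the original order finishes at time 6.0, because the long job only starts after the short ones, while LJF finishes at time 4.0. Real simulation threads also synchronize and communicate, which this toy model ignores.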

PhD Defense: Low Power Reliable Design using Pulsed Latch Circuits

Name: Wael Mahmoud Elsharkasy

Date: February 15, 2017

Time: 11:00 AM

Location: Engineering Hall 3206

Committee: Prof. Fadi J. Kurdahi, Prof. Ahmed Eltawil, Rainer Doemer


System-on-Chip (SoC) design has faced many challenges over the past decade. With today’s applications centered around the Internet-of-Everything (IoE), these challenges are expected to become more critical. They include reducing power consumption for better energy efficiency, overcoming different sources of variation to ensure reliable operation, and reducing design area to lower cost and increase integration. As a result, chip designers face many problems in trying to build reliable systems that integrate complex functionality on a minimum die size and within a limited power budget. Among the circuit components in every chip, memory components are of great concern. They consume the majority of the chip area and power, and they affect the performance and reliability of the entire chip. These components include large memory arrays, caches, register files, and the sequential elements in the logic paths. Sequential elements play an important and critical role in modern synchronous CMOS circuits. Indeed, they can represent up to 50% of the standard cells used in a chip. In addition, the power consumption of the clock tree, including these elements, can be more than half of the total chip power. Moreover, they come second only to memory in their sensitivity to different sources of variation. Hence, efficient implementation of these elements is of great importance for the design of energy-efficient and reliable integrated circuits. Pulsed latches have been proposed as an efficient replacement for flip-flops in the implementation of sequential elements. They can achieve higher performance than traditional flip-flops and can be designed to be smaller in area and more power efficient. However, the operation of pulsed latches is more sensitive to process, voltage, and temperature (PVT) variations.
In this thesis, we propose a methodology to study the reliability of pulsed latches and use it to evaluate the effect of PVT variations on their behavior. In addition, novel approaches to enhance the reliability of pulsed latches without significant degradation in performance, area, or power are presented. Also, since sequential elements can be used to build small register files, pulsed-latch implementations of register files are discussed and compared to traditional implementations, including SRAM and flip-flops. Finally, since multiport register files are beneficial for quite a few applications, novel implementations of multiport register files are presented. The proposed implementations are shown to greatly reduce the significant area, power, and latency overhead associated with the traditional way of designing multiport register files.

PhD Defense: Runtime Memory Management in Many-core Systems

Name: Hossein Tajik

Date: November 15, 2016

Time: 3:00PM – 4:00PM

Location: DBH 3011 Conference Room

Committee: Nikil Dutt (Chair), Tony Givargis, Alex Nicolau


With the increasing number of cores on a chip, we are moving towards an era where many-core platforms will soon be ubiquitous. Efficient use of tens to hundreds of cores on a chip and their memory resources comes with unique challenges.

In this dissertation, we propose SPMPool: a scalable platform for sharing Software Programmable Memories (SPMs). The SPMPool approach exploits underutilized memory resources by dynamically sharing SPM resources between applications running on different cores and adapts to the overall memory requirements of multiple applications that are concurrently executing on the many-core platform. We propose both central and distributed management schemes for SPMPool and study the efficiency of auction-based mechanisms in solving the memory mapping problem. We also propose offline and online memory phase detection methods in order to increase the adaptivity of memory management to temporal changes in memory requirements of a single application. The runtime memory management schemes proposed in this dissertation enable better performance and power for many-core systems.

PhD Defense: On Optimizing the Performance of Interference-Limited Wireless Systems

Name: Rana A. Abdelaal

Date: January 30, 2017

Time: 1:00 PM

Location: Engineering Hall 3106

Committee: Professor Ahmed Eltawil


Multi Input Multi Output (MIMO) technology has seen prolific use to achieve higher data rates and an improved communication experience for cellular systems. However, one of the challenging problems in MIMO systems is interference, which limits system performance in terms of rate and reliability. In this thesis, we analyze methods that provide high performance over interference-limited wireless networks such as Long Term Evolution (LTE) and WiFi, and we tackle different sources of interference. One such source is interference from neighboring stations. We propose methods that include an optimized solution that models the interference as correlated noise and uses its statistical information to jointly optimize the base station precoding and user receiver design of LTE systems. We study the benefits of exploiting interference in terms of both probability of error and signal-to-noise ratio (SNR). In addition, we compare the proposed method with conventional beamforming and maximum ratio combining (MRC). One of the key challenges in enabling high data rates in the LTE downlink is the precoding and receiver design. We focus primarily on the user equipment (UE) and base station (BS) processing, particularly on estimating and using the interference resulting from neighboring stations. We propose a receiver design that performs well in the presence of interference. Furthermore, we present a precoding scheme that the BS can use to maximize the signal-to-interference-plus-noise ratio (SINR). An interference-free scenario is used as a benchmark to evaluate the performance of the proposed system. We also optimize the performance of LTE by tackling practical considerations that affect system performance. We present a suboptimal but practical way of estimating the interference and using this information in the processing techniques at both the UE and the eNodeB sides.
We focus on managing both MU-MIMO interference and other-cell interference. The proposed study improves system performance even under imperfect channel knowledge, enabling the throughput gains promised by MU-MIMO.

Other types of interference exist in in-band full-duplex (IBFD) communication systems. IBFD is very promising for enhancing wireless LANs, where full-duplex access points (APs) can support simultaneous uplink (UL) and downlink (DL) flows over the same frequency channel. One of the key challenges limiting the benefits of IBFD is interference. In this thesis, we also propose a scheduling technique to manage interference in wireless LANs with full-duplex capability. We focus primarily on scheduling UL and DL stations (STAs) that can be efficiently served simultaneously.

It is very important to apply system knowledge to reduce power and/or improve performance. Thus, we also explore energy and power management techniques for practical wireless communication systems. An important topic for practical communication systems is handling the interference due to power amplifier nonlinearities. Managing this type of interference is especially important in Orthogonal Frequency-Division Multiple Access (OFDMA) based communication systems. Although OFDMA is the modulation of choice due to its robustness to time-dispersive radio channels, low-complexity receivers, and simple combining of signals from multiple transmitters in broadcast networks, the transmitter design for OFDMA is more costly, as the Peak-to-Average Power Ratio (PAPR) of an OFDMA signal is relatively high, resulting in the need for highly linear RF power amplifiers (PAs). This problem is compounded when a large number of PAs is required, as in Massive MIMO, for example. In this thesis, we discuss the impact of PAs on cellular systems. We show the constraints that PAs introduce, and we take these constraints into consideration while searching for the optimum set of transmitter and receiver filters. Moreover, we highlight how Massive MIMO cellular networks can relax PA constraints, allowing low-cost PAs while maintaining high performance. The performance is evaluated through probability-of-error and signal-to-noise-ratio curves for different transmit powers and different numbers of transmit antennas.

Another promising topic for efficient practical communication systems is Associative Processors (APs). The AP is a good candidate for in-memory computation; however, it has been deemed too costly and energy hungry in the past. The advent of ultra-dense resistive memories is changing this paradigm, allowing for efficient in-memory associative processors. However, with high levels of integration, issues related to power density become the bottleneck. In this thesis, we show the potential use of approximate computing in wireless communication systems; specifically, we present a Fast Fourier Transform (FFT) implemented on an associative processor. Results confirm that approximate computing for in-memory associative processors is a viable approach to reduce power consumption while maintaining good performance. A promising approach to saving energy is reducing the bit width; however, reducing the bit width introduces errors that may affect performance.
In this thesis, our goal is to adjust the bit width based on the channel SNR, aiming to achieve good performance at reduced energy consumption. We present a mathematical model that analytically describes system performance under the noise introduced by the reduced bit width. Based on this model, we present an adaptive bit width adjustment algorithm that uses received SNR estimates to find the optimal bit width that meets the performance goals at reduced energy consumption. Simulation results show that the proposed algorithms can achieve up to 45% energy savings compared to wireless communication systems with a conventional FFT.
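
The idea of matching bit width to channel SNR can be sketched in a few lines of Python. The example below is not the analytical model or the algorithm from the thesis: it assumes the textbook uniform-quantizer approximation SQNR ≈ 6.02·b + 1.76 dB, treats quantization noise as independent additive noise, and uses a hypothetical 0.5 dB loss budget and hypothetical function names.

```python
import math

def sqnr_db(bits):
    """Textbook SQNR of a b-bit uniform quantizer (full-scale sinusoid)."""
    return 6.02 * bits + 1.76

def effective_snr_db(channel_snr_db, bits):
    """Combine channel noise and quantization noise as independent sources."""
    snr = 10 ** (channel_snr_db / 10)
    sqnr = 10 ** (sqnr_db(bits) / 10)
    return 10 * math.log10(1.0 / (1.0 / snr + 1.0 / sqnr))

def choose_bit_width(channel_snr_db, max_loss_db=0.5, min_bits=2, max_bits=16):
    """Smallest bit width whose quantization noise costs at most max_loss_db."""
    for bits in range(min_bits, max_bits + 1):
        loss = channel_snr_db - effective_snr_db(channel_snr_db, bits)
        if loss <= max_loss_db:
            return bits
    return max_bits  # fall back to full precision

for snr_db in (5, 15, 25):
    print(f"channel SNR {snr_db} dB -> {choose_bit_width(snr_db)} bits")
```

At low channel SNR the channel noise dominates, so a narrow word length costs almost nothing. At high SNR the quantization noise becomes visible and more bits are needed.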


PhD Defense: Scalable Runtime Support for Edge-to-Cloud Integration of Distributed Sensing Systems

Name: Brett Chien

Date: November 29, 2016

Time: 10:00AM

Location: EH 2210

Committee: Pai H. Chou (Chair)


While the Internet of Things (IoT) has drawn increasing attention from researchers and the public, building a complete system from the edge sensing units to the cloud services requires a massive amount of effort. Researchers whose main interest is the collected information often get lost among the various technologies involved, including distributed embedded sensing systems, bridge devices between the Internet and the local network, and backend data services.

This work takes a cross-system, script-based, and semantics-enhanced approach to address the lack of suitable runtime support. We propose a threaded-code runtime for edge sensing systems, a script-based wrapper for physical-to-cyber bridges, and scalable middleware for the backend services.
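
For readers unfamiliar with threaded code, the Python sketch below shows the general technique: a program is a flat list of references to small routines ("words"), and a tiny inner interpreter simply walks the list. This is a generic illustration, not the runtime developed in this work. The word names, the simulated sensor, and the handling of literals are all assumptions.

```python
def run(program, stack=None):
    """Inner interpreter: execute each word in order on a shared stack.

    Callables are words; anything else is treated as a literal and pushed.
    (Real threaded-code systems usually use an explicit LIT word instead.)
    """
    stack = [] if stack is None else stack
    for word in program:
        if callable(word):
            word(stack)
        else:
            stack.append(word)
    return stack

def colon(body):
    """Define a new word as a thread (list) of existing words."""
    def word(stack):
        run(body, stack)
    return word

def add(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

def div(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a / b)

def make_sensor(samples):
    """Hypothetical sensor word that pushes the next canned sample."""
    it = iter(samples)
    def read(stack):
        stack.append(next(it))
    return read

# A compound word built purely from references to other words.
average2 = colon([add, 2, div])

read_temp = make_sensor([20.0, 24.0])
print(run([read_temp, read_temp, average2]))
```

Because a program is only a list of references, new behavior can be sent to a node as a short script without reflashing its firmware, which is one reason the technique suits resource-constrained sensing devices.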

With the proposed runtime support, we are able to apply distributed sensing systems to real-world applications quickly and explore insights from the collected information. As a result, a building structure monitoring system has been installed, allowing civil engineering researchers to develop algorithms to prevent disaster events. Body area sensing systems for ECG monitoring, CO2 detection, and body movement tracking have also been developed, enabling baby screening and the detection of potential heart problems. The results show that, with the proposed runtime support, applications can be realized quickly and scalably.