PhD Defense: Towards Engineering Computer Vision Systems: From Web to FPGAs

Final Defense - Sajjad Taheri

Date: August 26, 2019

Time: 2:00 pm

Location: Donald Bren Hall 4011

Committee: Alex Nicolau (chair), Alex Veidenbaum (co-chair), Nikil Dutt

Title: Towards Engineering Computer Vision Systems: From Web to FPGAs
Computer vision has many applications that impact our daily lives, in areas such as automation, entertainment, and healthcare. However, computer vision is very challenging, partly because of the intrinsically difficult nature of the problem and partly because of the complexity and size of the visual data that need to be processed. Deploying computer vision in many practical use cases requires sophisticated algorithms and efficient implementations. In this dissertation, we consider two platforms that are suitable for computer vision processing yet have not been easily accessible to algorithm designers and developers: the Web and FPGA-based accelerators. Through the development of open-source software, we highlight the challenges associated with vision development on each platform and demonstrate opportunities to mitigate them.
The Web is the world’s most ubiquitous computing platform and hosts a plethora of visual content. For historical reasons, such as insufficient JavaScript performance and a lack of API support for acquiring and manipulating images, computer vision is not mainstream on the Web. We show that, in light of recent web developments such as vastly improved JavaScript performance and the addition of APIs such as WebRTC, efficient computer vision processing can be realized on web clients. Through novel engineering techniques, we translate a popular open-source computer vision library (OpenCV) from C++ to JavaScript and optimize its performance for the web environment. We demonstrate that hundreds of computer vision functions run in browsers with performance close to that of their original C++ versions.
Field-Programmable Gate Arrays (FPGAs) are a promising solution for mitigating the computational cost of vision algorithms through hardware pipelining and parallelism while providing excellent power efficiency. However, an efficient FPGA implementation of a vision algorithm requires hardware design expertise and a considerable number of engineering person-hours. We show how high-level graph-based specifications, such as OpenVX, can significantly improve FPGA design productivity. Because such abstractions exclude implementation details, different implementation configurations that satisfy various design constraints, such as performance and power consumption, can be explored systematically. They also enable a variety of local and global optimizations to be applied across algorithms.