Hung-Wei Tseng

Associate Professor, University of California, Riverside

Hung-Wei Tseng

I am currently an associate professor in the Department of Electrical and Computer Engineering and a cooperating faculty of the Department of Computer Science and Engineering at University of California, Riverside, where I am now leading the Extreme Scale & Computer Architecture Laboratory.

I am interested in designing architecture, programming language frameworks, and system infrastructures that allow applications and programmers to use modern heterogeneous hardware components more efficiently. My recent focus is on using AI/ML accelerators to improve the performance of non-AI/ML workloads (e.g., GPTPU, TCUDB).

Our research has been recognized by IEEE MICRO Top Picks From Computer Architecture Conferences in 2024 and 2020. We also received best paper nominations from IEEE/ACM International Symposium on Microarchitecture in 2021 and 2019 as well as Facebook Research Award, 2018. In addition, we also applied our knowledge in optimizing storage systems to wireless network system stacks. We developed the OpenUVR that enabled high-quality, untethered VR experience on commodity hardware components and won the outstanding paper award in RTAS 2021.

Important update -- due to the uncertainty of government funding situations, we're not recruiting students for the upcoming season.

If you're interested in joining my research group, you should apply to UCR's ECE or CSE programs. Please also fill out this form to express your interest in my group after you have applied. I won't interview or respond to students until we received your applications.

Address: Room 406, Winston Chung Hall, 900 University Ave., Riverside, CA 92521
Phone: (951) 827-1012 ext 29347
Email: htseng @ ucr.edu
CV: PDF

News

Hongrui "Jack" Zhang successfully presented our paper, RTSpMSpM: Harnessing Ray Tracing for Efficient Sparse Matrix Computations, at ISCA 2025 in Tokyo!

Top

Research Projects

Democractizing hardware accelerators

Beyond AI/ML accelerators, modern systems also integrate other types of accelerators for more application domains. Ray Tracing accelerator is one example type of hardware that becomes more popular modern systems to fulfill the demand of gaming and virtual/mixed realities. These accelerators complement the deficiency of AI/ML accelerators in accelerating algorithms with divergent control flows or irregular memory access patterns. Democractizing these accelerators will improve the performance of traditionally hard-to-parallelize problems that currently have to rely on slowly improved CPU architectures.

Hongrui Zhang, Yunan Zhang and Hung-Wei Tseng. RTSpMSpM: Harnessing Ray Tracing for Efficient Sparse Matrix Computations. In Proceedings of the 52nd Annual International Symposium on Computer Architecture , ISCA 2025. GitHub
Dongho Ha, Lufei Liu, Yuan Hsi Chou, Seokjin Go, Won Woo Ro, Hung-Wei Tseng, and Tor M. Aamodt. Generalizing Ray Tracing Accelerators for Tree Traversals on GPUs. In the 57th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2024.
From Application Specific to General Purpose (Again)

GPTPU Prototype

Accelerating non-AI/ML applications using AI/ML accelerators

The explosive demand on AI/ML workloads drive the emergence of AI/ML accelerators, including commercialized NVIDIA Tensor Cores and Google TPUs. These AI/ML accelerators are essentially matrix processors and are theoretically helpful to any application with matrix operations. This project bridges the missing system/architecture/programming language support in democratizing AI/ML accelerators. As matrix operations are conventionally inefficient, this project also revises the core algorithm in compute kernels to better utilize operators of AI/ML accelerators.

Dongho Ha, Yunan Zhang, Chen-Chien Kao, Woo Woo Ro, Christopher Hughes, Hung-Wei Tseng. M^3XU: Achieving High-Precision and Complex Matrix Multiplication with Low-Precision MXUs. In the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2024.
Kuan-Chieh Hsu and Hung-Wei Tseng. Simultaneous and Heterogenous Multithreading: Exploiting Simultaneous and Heterogeneous Parallelism in Accelerator-Rich Architectures. In IEEE Micro (Micro Toppicks from Computer Architecture Conferences), 2024. [Micro TopPicks]
Kuan-Chieh Hsu and Hung-Wei Tseng. Simultaneous and Heterogeneous Multithreading. In the 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023. GitHub
Dongho Ha, Woo Won Ro, Hung-Wei Tseng. TensorCV: Accelerating Inference-Adjacent Computation Using Tensor Processors. In 2023 ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED 2023) GitHub
Yunan Zhang, Po-An Tsai and Hung-Wei Tseng. SIMD^2: A Generalized Matrix Instruction Set for Accelerating Tensor Computation beyond GEMM. In 49th International Symposium on Computer Architecture, ISCA 2022, 2022. GitHub
Yu-Ching Hu, Yuliang Li and Hung-Wei Tseng. TCUDB: Accelerating Database with Tensor Processors. In the 2022 ACM SIGMOD/PODS International Conference on Management of Data, SIGMOD 2022, 2022. GitHub
Kuan-Chieh Hsu and Hung-Wei Tseng. Accelerating Applications using Edge Tensor Processing Units. In The International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2021. GitHub
Yu-Chia Liu and Hung-Wei Tseng. NDS: N-Dimensional Storage. In the 54th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2021. (Best Paper Nominee) GitHub

Innovative hardware accelerator architectures

ESCAL also focuses on designing hardware accelerators for important application domains and complement the missing problems that existing accelerators cannot tackle.

Yunan Zhang, Po-An Tsai and Hung-Wei Tseng. Sparsepipe: Sparse Inter-operator Dataflow Architecture with Cross-Iteration Reuse. In the 57th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2024.

Selected Publications

(Full listing)

Hongrui Zhang, Yunan Zhang and Hung-Wei Tseng. RTSpMSpM: Harnessing Ray Tracing for Efficient Sparse Matrix Computations. In Proceedings of the 52nd Annual International Symposium on Computer Architecture , ISCA 2025. GitHub
Yunan Zhang, Po-An Tsai and Hung-Wei Tseng. Sparsepipe: Sparse Inter-operator Dataflow Architecture with Cross-Iteration Reuse. In the 57th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2024.
Dongho Ha, Lufei Liu, Yuan Hsi Chou, Seokjin Go, Won Woo Ro, Hung-Wei Tseng, and Tor M. Aamodt. Generalizing Ray Tracing Accelerators for Tree Traversals on GPUs. In the 57th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2024.
Dongho Ha, Yunan Zhang, Chen-Chien Kao, Woo Woo Ro, Christopher Hughes, Hung-Wei Tseng. M^3XU: Achieving High-Precision and Complex Matrix Multiplication with Low-Precision MXUs. In the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2024.
Kuan-Chieh Hsu and Hung-Wei Tseng. Simultaneous and Heterogenous Multithreading: Exploiting Simultaneous and Heterogeneous Parallelism in Accelerator-Rich Architectures. In IEEE Micro (Micro Toppicks from Computer Architecture Conferences), 2024. [Micro TopPicks]
Kuan-Chieh Hsu and Hung-Wei Tseng. Simultaneous and Heterogeneous Multithreading. In the 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023. [arXiv] [GitHub]
Dongho Ha, Woo Won Ro, Hung-Wei Tseng. TensorCV: Accelerating Inference-Adjacent Computation Using Tensor Processors. In 2023 ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED 2023) [GitHub]
Yu-Chia Liu, Kuan-Chieh Hsu and Hung-Wei Tseng. Rethinking Programming Frameworks for In-Storage Processing. In 60th Design Automation Conference (DAC 2023)
Yunan Zhang, Po-An Tsai and Hung-Wei Tseng. SIMD^2: A Generalized Matrix Instruction Set for Accelerating Tensor Computation beyond GEMM. In 49th International Symposium on Computer Architecture, ISCA 2022, 2022. [arXiv] [GitHub]
Yu-Ching Hu, Yuliang Li and Hung-Wei Tseng. TCUDB: Accelerating Database with Tensor Processors. In the 2022 ACM SIGMOD/PODS International Conference on Management of Data, SIGMOD 2022, 2022. [arXiv] [GitHub]
Kuan-Chieh Hsu and Hung-Wei Tseng. Accelerating Applications using Edge Tensor Processing Units. In The International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2021. [arXiv] [GitHub]
Yu-Chia Liu and Hung-Wei Tseng. NDS: N-Dimensional Storage. In the 54th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2021. (Best Paper Nominee)
Alec Rohloff, Zackary Allen, Kung-Min Lin, Joshua Okrend, Chengyi Nie, Yu-Chia Liu, and Hung-Wei Tseng. OpenUVR: an Open-Source System Framework for Untethered Virtual Reality Applications. In 27th IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS 2021. 2021. (Outstanding Paper Award) [arXiv] [Github]
Abenezer Wudenhe and Hung-Wei Tseng. TPUPoint: Automatically Characterizing Hardware Accelerated Data Center Machine Learning Program Behavior. In the 2021 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2021. 2021.[Github]
Jinyoung Choi, Sergey Blagodurov and Hung-Wei Tseng. Dancing in the Dark: Profiling in the Age of Tiered Memory. In 35th IEEE International Parallel & Distributed Processing Symposium, IPDPS 2021. 2021. [Github]
Yu-Ching Hu, Murtuza Lokhandwala, Te I and Hung-Wei Tseng. Varifocal Storage: Dynamic Multi-Resolution Data Storage. In IEEE Micro (Micro Toppicks from Computer Architecture Conferences). 2020.
Yu-Ching Hu, Murtuza Lokhandwala, Te I and Hung-Wei Tseng. Dynamic Multi-Resolution Data Storage. In the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2019. 2019.
Kiran Kumar Matam, Gunjae Koo, Haipeng Zha, Hung-Wei Tseng and Murali Anavarum. GraphSSD: Graph Semantics Aware SSD. In the 46th International Symposium on Computer Architecture, ISCA 2019. 2019.
Te I, Murtuza Lokhandwala, Yu-Ching Hu, and Hung-Wei Tseng. Pensieve: a Machine Learning Assisted SSD Layer for Extending the Lifetime. In IEEE International Conference on Computer Design (ICCD 2018). October, 2018.
Hung-Wei Tseng, Qianchen Zhao, Yuxiao Zhou, Mark Gahagan and Steven Swanson. Morpheus: Exploring the Potential of Near-Data Processing for Creating Application Objects in Heterogeneous Computing. SIGOPS Operating Systems Review, volume 51(2):71 -* 83, August 2018.
Gunjae Koo, Kiran Kumar Matam, Te I, Hema Venkata Krishna Giri Narra, Jing Li, Steven Swanson, Murali Annavaram, and Hung-Wei Tseng. Summarizer: Trading Bandwidth with Computing Near Storage. In 50th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2017
Yanqin Jin, Hung-Wei Tseng, Steven Swanson and Yannis Papakonstantinou. KAML: A Flexible, High-Performance Key-Value SSD. In 23rd International Symposium on High Performance Computer Architecture (HPCA 2017). February 2017.
Yang Liu, Hung-Wei Tseng, Mark Gahagan, Jing Li, Yanqin Jin and Steven Swanson. Hippogriff: Efficiently Moving Data in Heterogeneous Computing Systems. In 34th IEEE International Conference on Computer Design (ICCD 2016). Oct. 2016.
Yang Liu, Hung-Wei Tseng and Steven Swanson. SPMario: Scale Up MapReduce with I/O-Oriented Scheduling for the GPU. In 34th IEEE International Conference on Computer Design (ICCD 2016). Oct. 2016.
Jing Li, Hung-Wei Tseng, Chunbin Lin, Steven Swanson, and Yannis Papakonstantinou. HippogriffDB: Balancing I/O and GPU Bandwidth in Big Data Analytics. Proceedings of VLDB Endowment, Volume 9(14), 2016.
Hung-Wei Tseng, Qianchen Zhao, Yuxiao Zhou, Mark Gahagan and Steven Swanson. Morpheus: Creating Application Objects Efficiently for Heterogeneous Computing. In 43rd International Symposium on Computer Architecture (ISCA 2016). June 2016.
Hung-Wei Tseng and Dean M. Tullsen. CDTT: Compiler-generated data-triggered threads. In 20th International Symposium on High Performance Computer Architecture (HPCA 2014). February 2014.
Hung-Wei Tseng and Dean M. Tullsen. Data-Triggered Multithreading for Near Data Processing. In 1st Workshop on Near-Data Processing (WoNDP). Dec 2013.
Leo Porter, Saturnino Garcia, Hung-Wei Tseng, and Daniel Zingaro. Evaluating Student Understanding of Core Concepts in Computer Architecture. In The 18th Annual Conference on Innovation and Technology in Computer Science Education (iTiCSE). July 2013.
Hung-Wei Tseng, Laura M. Grupp and Steven Swanson. Underpowering NAND Flash: Profits and Perils. In 50th Design Automation Conference (DAC 2013). June 2013.
Hung-Wei Tseng and Dean M. Tullsen, Software data-triggered threads. In ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA 2012). October 2012.
Hung-Wei Tseng and Dean M. Tullsen, Eliminating Redundant Computation and Exposing Parallelism through Data-Triggered Threads. IEEE Micro, Volume: 32(3) (Micro Toppicks from Computer Architecture Conferences). Page(s): 38 – 47, June 2012.
Hung-Wei Tseng, Laura M. Grupp and Steven Swanson. Understanding the Impact of Power Loss on Flash Memory. In 48th Design Automation Conference (DAC 2011). June 2011.
Hung-Wei Tseng and Dean M. Tullsen, Data-Triggered Threads: Eliminating Redundant Computation, In 17th International Symposium on High Performance Computer Architecture (HPCA 2011). February, 2011. (Nominated for Best Student Paper)
Hung-Wei Tseng, Han-Lin Li and Chia-Lin Yang, "An Energy-Efficient Virtual Memory System with Flash Memory as the Secondary Storage", In Proceedings of International Symposium on Low Power Electronics and Design 2006 (ISLPED 2006), Pages: 418 * 423 .
Shih-Hsien Yang, Hung-Wei Tseng, Eric Hsiao-Kuang Wu, Gen-Huey Chen, "Utilization based duty cycle tuning MAC protocol for wireless sensor network", In Proceedings of IEEE Global Telecommunications Conference (IEEE GLOBECOM'05), Volume 6, 28 Nov.-2 Dec. 2005, Page(s):3258 * 3262

Top

Open Source Projects

GPTPU: https://github.com/escalab/GPTPU
OpenUVR: https://github.com/escalab/OpenUVR
Tiered-Memory-Profiler: https://github.com/escalab/Tiered-Memory-Profiler
TPUPoint: https://github.com/escalab/TPUPoint
Software Data-Triggered Thread: https://github.com/hungweitseng/DTT

Top

Advising & Join

ESCAL Group Photo

If you're interested at joining my research group, you should apply to UCR's ECE or CSE" programs. Please also fill the following form to express your interests in my group. https://forms.gle/nhakt2qRwEsYrxGf7. I will not review and consider applicants who did not fill this form.

Graduate Students

I am currently advising the following top-notch graduate students:

Boram Jung
Hongrui Zhang
Andy Li

Undergraduate Students

I also work with the following talented undergraduate students:

Honghao Lin (UCSD)
Andrew Lu (UCSD)
Owen Lam (UCSD)
Asher James (UCSD)

Alumni

I have also advised these students who graduated

Jinyoung Choi (C.S., MS, UC Riverside, 2025. First employment: AMD)
Abenezer Wudenhe (C.S., PhD, UC Riverside, 2024.)
Yu-Ching Hu (C.S., PhD, UC Riverside, 2024. First employment: NetApp)
Kuan Chieh Hsu (C.S., PhD, UC Riverside, 2024. First employment: Brookhaven National Lab)
Yunan Zhang (E.E., PhD, UC Riverside, 2024. First employment: Google)
Dongho Ha (Visiting Scholar. Currently a PhD candidate at Yonsei University)
Yu-Chia Liu (C.S., PhD, UC Riverside, 2022. First employment: Pure Storage)
Teng-Hung Chen (C.S., M.S., First employment: A2 Labs)
Ziliang Zhang (E.E., M.S., Now a PhD student at UCR)
Zecao Lu (C.S., M.S., NC State University, 2019. Now at Didi Labs)
Xindi Li (C.S., M.S., NC State University, 2018. Now at Bloomberg)
Chao Huang (C.S., M.S., NC State University, 2018. Now at Amazon)
Zackary Allen (C.S., B.S., NC State University, 2018. Now at Red Hat)
Alec Rohloff (C.S., B.S., NC State University, 2018.)
Te I (C.S., M.S., NC State University, 2018. Now at Google)
Vaibhava Lakshmi (ECE, M.S., NC State University, 2018. Dell EMC)
Murtuza Taher Lokhandwala (ECE, M.S., NC State University, 2018. Apple)
Mahesh Bonagiri(ECE, M.S., NC State University, 2018. Nvidia)
Hao Zhang (Continuing as PhD student at NC State University)
Joshua Okrend (Now working at a Government Contractor)
Timotius Oentung (Continuing at NC State University)
Kung-Min "Leo" Lin (Now pursuing C.S., B.S., University of California, Berkeley)
Chengyi "Eric" Nie (Now pursuing E.E., Ph.D., Stony Brooks University)

For prospects

Developing awesome ideas and training researchers are my duties as a professor. I am always looking for new graduate students. If you are interested at working with me, please apply to either Department of Electrical and Computer Engineering (Preferred) or the Department of Computer Science and Engineering of University of California, Riverside and mention me as a potential advisor in the application system.

Top

Teaching

Youtube Channel

Prof. Usagi's Youtube Channel

Upcoming

CSE 142, Computer Architecture Software Perspective, UCSD
CSE 142L, Software Projects of Computer Architecture, UCSD

Present

Prior Courses

EE/CS 277, Data-Centric Computer Architectures, UCR, Spring 2025, Winter 2024, Spring 2022
EE 214 & PHYS 220, Quantum Computing, Winter 2025
CS 203, Advanced Computer Architecture, Winter 2025, Fall 2024, Spring 2024, Fall 2023, Spring 2023, Fall 2022 , Fall 2021, Fall 2020, Fall 2019
CSE 142, Computer Architecture Software Perspective. Summer 2024, Summer 2023,Summer 2022
CSE 142L, Software Projects of Computer Architecture. Summer 2024, Summer 2023,Summer 2022
EE260F, Quantum Computing and Computer Architecture, UCR: Spring 2021
EE/CS 120A, Logic Design, UCR: Spring, 2020
CS 202, Advanced Operating Systems, UCR: Winter 2022, Winter 2021, Winter 2020
EE 260, Trends in Computer System Design, UCR: Winter 2020
CSE 141, Introduction to Computer Architecture, UCSD: Summer Session II 2020, Summer Session II 2019, Summer Session I 2019, Summer Session I 2016, Spring 2016, Summer Session II 2014, Summer Session II 2014, Summer Session II 2012, Summer Session II 2012
CSC 456/CSC 591-010/CSC591-601, Computer Architecture and Multiprocessors, NC State: Spring 2019 (Syllabus), Spring 2018 (Syllabus)
CSC 501, Operating Systems Principles, NC State: Fall, 2018 (Syllabus), Fall, 2017(Syllabus), Fall 2016
CSC 236, Computer Organization and Assembly Language for Computer Scientists, NC State: Spring 2017 (Syllabus)

Top

Services

I have been actively serving both in the research community and my department.

Research community

General chair: HPCA 2025, NVMW 2024, NVMW 2023
Program chair: NVMW 2018
Technical Program Committe member: ISCA (2024, 2023, 2022), MICRO (2024, 2022), ASPLOS (2025), HPCA (2024, 2023, 2021, 2020), ICCD (2022--2018), IISWC (2022, 2021)
Associate editor: IEEE CAL, ACM TACO
Web chair: ASPLOS 2018, HPCA 2017, MICRO 2016
Registration chair: ISCA 2018
Finance chair: HPCA 2020
Poster session chair: ASPLOS 2024
Local arrangement chair: IISWC 2018

Department service

Academic-Industry Relationship Committee of UCR's computer engineering program -* If you work in industry and would like to visit/recurit/give talks/collabrate at UCR's CE/EE/CSE programs, please reach out to me!

Top

Prior Research Projects

Intelligent Data Storage

As parallel computer architectures significantly shrinking the execution time in compute kernels, the performance bottlenecks of applications shift to the rest of part of execution, including data movement, object deserialization/serialization as well as other software overheads in managing data storage. To address this new bottleneck, the best approach is to not move data and endow storage devices with new roles.

Morpheus is one of the very first research project that implements this concept in real systems. We utilize existing, commercially available hardware components to build the Morpheus-SSD. The Morpheus model not only speeds up a set of heterogeneous computing applications by 1.32x, but also allows these applications to better utilize emerging data transfer methods that can send data directly to the GPU via peer-to-peer to further achieve 1.39x speedup. Summarizer further provides mechanisms to dynamically adjust the workload between the host and intelligent SSDs, making more efficient use of all computing units in a system and boost the performance of big data analytics. This line of research also helps Hung-Wei's team receive Facebook research award, 2018.

Yu-Chia Liu, Kuan-Chieh Hsu and Hung-Wei Tseng. Rethinking Programming Frameworks for In-Storage Processing. In 60th Design Automation Conference (DAC 2023)
Yu-Chia Liu and Hung-Wei Tseng. NDS: N-Dimensional Storage. In the 54th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2021. (Best Paper Nominee) GitHub
Yu-Ching Hu, Murtuza Lokhandwala, Te I and Hung-Wei Tseng. Varifocal Storage: Dynamic Multi-Resolution Data Storage. In IEEE Micro (Micro Toppicks from Computer Architecture Conferences), 2020.
Yu-Ching Hu, Murtuza Lokhandwala, Te I and Hung-Wei Tseng. Dynamic Multi-Resolution Data Storage. In in the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2019 (Best Paper Honorable Mention)
Kiran Kumar Matam, Gunjae Koo, Haipeng Zha, Hung-Wei Tseng and Murali Anavarum. GraphSSD: Graph Semantics Aware SSD. In the 46th International Symposium on Computer Architecture, ISCA 2019.
Facebook research award, 2018
Hung-Wei Tseng, Qianchen Zhao, Yuxiao Zhou, Mark Gahagan and Steven Swanson. Morpheus: Exploring the Potential of Near-Data Processing for Creating Application Objects in Heterogeneous Computing. SIGOPS Operating Systems Review, volume 51(2):71 -* 83, August 2018.
Gunjae Koo, Kiran Kumar Matam, Te I, Hema Venkata Krishna Giri Narra, Jing Li, Steven Swanson, Murali Annavaram, and Hung-Wei Tseng. Summarizer: Trading Bandwidth with Computing Near Storage. In 50th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2017
Yanqin Jin, Hung-Wei Tseng, Steven Swanson and Yannis Papakonstantinou. KAML: A Flexible, High-Performance Key-Value SSD. In 23rd International Symposium on High Performance Computer Architecture (HPCA 2017). February 2017.
Hung-Wei Tseng, Qianchen Zhao, Yuxiao Zhao, Mark Gahagan and Steven Swanson. Morpheus: Creating Application Objects Efficiently for Heterogeneous Computing. In 43rd International Symposium on Computer Architecture (ISCA 2016). June 2016.

Building Efficient Heterogeneous Computers

As the discontinuation of Dannard scaling and Moore's Law, computers become heterogeneous. However, moving data among heterogeneous computing units and storage devices becomes an emerging bottleneck in these systems.

My research proposes the "Hippogriff" system that revisits the programming model of moving data in heterogeneous computer systems. Instead of using the conventional CPU-centric, programmer-specified methods, the Hippogriff system simplifies the application interface and provide a middle layer to efficiently handle the data movement. We also implemented peer-to-peer data transfer between the GPU and the SSD in the Hippogriff system.

The preliminary result demonstrates 46% performance gain by applying Hippogriff to a set of rodinia GPU applications. For highly optimized GPU MapReduce framework, Hippogriff still demonstrates up to 27% performance gain.

Yu-Chia Liu and Hung-Wei Tseng. NDS: N-Dimensional Storage. In the 54th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2021. [GitHub]
Abenezer Wudenhe and Hung-Wei Tseng. TPUPoint: Automatically Characterizing Hardware Accelerated Data Center Machine Learning Program Behavior. In the 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2021). [Github]
Jinyoung Choi, Sergey Blagodurov and Hung-Wei Tseng. Dancing in the Dark: Profiling in the Age of Tiered Memory. In 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2021. [Github]
Yu-Ching Hu, Murtuza Lokhandwala, Te I and Hung-Wei Tseng. Varifocal Storage: Dynamic Multi-Resolution Data Storage. In IEEE Micro (Micro Toppicks from Computer Architecture Conferences), 2020.
Yu-Ching Hu, Murtuza Lokhandwala, Te I and Hung-Wei Tseng. Dynamic Multi-Resolution Data Storage. In the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2019
Jing Li, Hung-Wei Tseng, Chunbin Lin, Steven Swanson, and Yannis Papakonstantinou. HippogriffDB: Balancing I/O and GPU Bandwidth in Big Data Analytics. Proceedings of VLDB Endowment, Volume 9(14), 2016.
Yang Liu, Hung-Wei Tseng, Mark Gahagan, Jing Li, Yanqin Jin and Steven Swanson. Hippogriff: Efficiently Moving Data in Heterogeneous Computing Systems. In 34th IEEE International Conference on Computer Design (ICCD 2016). Oct. 2016.
Yang Liu, Hung-Wei Tseng and Steven Swanson. SPMario: Scale Up MapReduce with I/O-Oriented Scheduling for the GPU. In 34th IEEE International Conference on Computer Design (ICCD 2016). Oct. 2016.
Hung-Wei Tseng, Yang Liu, Mark Gahagan, Jing Li, Yanqin Jin, and Steven Swanson. Gullfoss: Accelerating and Simplifying Data Movement among Heterogeneous Computing and Storage Resources . Department of Computer Science and Engineering, University of California, San Diego technical report technical report CS2015-1015, 2015.

High-Quality, Low-Latency Realtime Systems

With hardware accelerators improving the latency in computation, the system software stack that were traditionally underrated in designing applications becomes more critical. In ESCAL, we focus on those underrated bottlenecks to achieve significant performance improvement without using new hardware. The most recent example is the OpenUVR system, where we eliminate unnecessary memory copies and allow the VR system loop to complete within 14 ms latency with just modern desktop PC, existing WiFi network links, raspberry Pi 4b+ and an HDMI compatible head mount display.

Alec Rohloff, Zackary Allen, Kung-Min Lin, Joshua Okrend, Chengyi Nie, Yu-Chia Liu, and Hung-Wei Tseng. OpenUVR: an Open-Source System Framework for Untethered Virtual Reality Applications. In 27th IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS 2021, 2021. [arXiv] [Github]

Top