Current projects

I’m currently building an interdisciplinary research group to bridge the gap between advances in architecture and important new applications. If this or any of the projects below sound interesting to you, please shoot me an email or drop by my office during office hours. I’d love to chat with you!

Cross-layer system design

One of the goals of computer architecture is to separate computing systems into distinct layers. In the past, this allowed experts within each layer to optimize their layer without worrying about compatibility (e.g., compiler writers targeted an ISA). Now, due to the end of Dennard scaling and Moore’s Law’s inevitable slowdown, the interfaces between these layers are becoming a bottleneck.

My research takes an end-to-end approach to bridge the gap between the hardware and applications. I am currently building an interdisciplinary research group with architects as well as experts from important application domains (e.g., machine learning, computer vision, big data) and experts in computing systems. By breaking the shackles of our layered systems, performance gains of orders of magnitude become possible.

Below are a couple of our current projects. If these interest you, please shoot me an email or drop by my office during office hours. I’d love to chat with you!

Heterogeneous Memory

Computers are becoming increasingly heterogeneous (see Apple’s A11 as an example), and the next evolution in computing systems is happening in the memory system. Different memory technologies are proliferating, including conventional DRAM (DDR3, DDR4), high-bandwidth memories (HBM, HMC), nonvolatile memories (3D XPoint, PCM, ReRAM), and new memory paradigms like disaggregated memory. How do system designers choose which technology to integrate into their products?

They can’t. No one technology is optimal in all cases. Thus, we should design systems with a combination of different memory technologies. As computer architects, we have long dealt with a hybrid memory system; all processors combine SRAM caches and DRAM. However, the relationship between SRAM and DRAM (orders of magnitude difference in speed and cost) does not hold for emerging memory technologies (e.g., 3D XPoint is an integer factor slower than DRAM).

Thus, it is inappropriate to use previous memory management techniques with these emerging technologies. Multi-gigabyte hardware-managed caches have a low hit rate, and traditional OS paging incurs high-overhead page faults. In this project, we will solve these problems with a new hardware-software co-design for heterogeneous memory management.
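To make the argument above concrete, here is a minimal back-of-the-envelope sketch. It is an illustration and not a model used in this project. It assumes a simple two-tier design in which a miss first pays for the fast-tier lookup and then for the slow-tier access. The latencies are unitless, hypothetical ratios (100x for a large gap, 3x for a small one), not measurements of any real device, and the function names are made up for this example.

```python
def effective_latency(hit_rate, fast, slow):
    """Average access latency of a two-tier memory.

    Assumption: a hit costs `fast`; a miss costs `fast + slow`
    (the fast tier is checked first, then the slow tier is accessed).
    """
    return hit_rate * fast + (1.0 - hit_rate) * (fast + slow)


def break_even_hit_rate(fast, slow):
    """Hit rate at which the fast tier neither helps nor hurts
    compared with always going straight to the slow tier.

    Setting effective_latency(h, fast, slow) == slow and solving
    for h gives h = fast / slow under the assumption above.
    """
    return fast / slow


if __name__ == "__main__":
    # Hypothetical relative latencies, chosen only to contrast the two cases.
    cases = [("large gap (100x)", 1.0, 100.0), ("small gap (3x)", 1.0, 3.0)]
    for name, fast, slow in cases:
        h = break_even_hit_rate(fast, slow)
        print(f"{name}: break-even hit rate = {h:.2f}")
        for hit_rate in (0.2, 0.5, 0.9):
            latency = effective_latency(hit_rate, fast, slow)
            print(f"  hit rate {hit_rate:.1f}: latency {latency:.1f} "
                  f"(slow tier alone: {slow:.1f})")
```

Under these assumptions, a cache in front of a tier that is 100x slower pays off at almost any hit rate, while a cache in front of a tier that is only 3x slower needs a hit rate above roughly one third just to break even. That is one way to see why a low hit rate is a much bigger problem when the gap between tiers is small.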

In situ simulation

Not only is hardware technology changing rapidly, but the workloads that execute on this hardware are also evolving. A few examples include machine learning, augmented reality, big-data analytics, and intelligent personal assistants. These applications are end-to-end solutions, consisting of many interacting kernels of computation, and they cannot easily or accurately be represented as a single microbenchmark. Optimizing these applications requires changes across the entire hardware-software stack from new accelerators and emerging programmable processors to system integration and new programming interfaces. However, current architecture evaluation infrastructure is not easily adapted to studying end-to-end applications.

This project will develop in situ simulation to study applications in their native execution environment. We are currently extending gem5 to include in situ support for studying CPUs by leveraging ubiquitous hardware virtualization technology. Looking forward, this virtualization can be extended to other accelerators, programmable processors, and even to novel devices via fast emulation (e.g., with FPGAs).


Simulators are the Swiss Army knives of computer architects. I have found building new and extending existing infrastructure vital to my research. To enable others to build off of my work, I believe it is crucial to develop simulation infrastructure as open source code. Below are a few projects that I am part of.

Check out my GitHub page for more tools and to see what I’m currently working on.


I am currently one of the leaders in the gem5 community. gem5 is one of the most popular computer architecture simulators. It is completely open source. You can find the code at

Learning gem5

I am working on a book to help new and experienced users learn how to use gem5: Learning gem5. This book is also open source and you can find the code on GitHub:


I also created one of the first heterogeneous simulators, gem5-gpu, by combining gem5 and GPGPU-Sim. This project is no longer actively maintained, but you can find more information on the gem5-gpu wiki, and the code is available on GitHub:
