Scalable computations on sparse data, in which only a tiny fraction of the elements is populated, are pervasive in scientific computing, data analytics, and machine learning. Sparse data are typically represented by sparse matrices and graphs, which reduce storage and computation requirements (e.g., by storing only nonzero elements) through the use of index arrays such as B in A[B[i]]. This indirection impacts performance on modern architectures: whether the code is optimized by a human programmer, a compiler, or hardware, any optimization that must understand the memory access pattern for A requires the values of B, which are not known until runtime. Generating high-performance code for sparse computations is made still more challenging by modern architectures, where data movement cost is dominant. Interest in compiler technology for sparse tensors has surged alongside the intense interest in machine learning systems. In this talk, we approach this challenge from a compiler perspective and describe active areas of research and future requirements.
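To make the indirection pattern concrete, the sketch below shows a conventional compressed sparse row (CSR) matrix-vector multiply in C. It is an illustrative example, not code from the talk; the field names (rowptr, colidx, vals) are standard CSR conventions chosen here for clarity.

    /* Minimal CSR sparse matrix-vector multiply, y = M * x.
     * Illustrative sketch only; rowptr/colidx/vals are conventional
     * CSR field names, not taken from the talk. */
    #include <stddef.h>

    void spmv_csr(size_t nrows,
                  const size_t *rowptr,   /* nrows + 1 entries */
                  const size_t *colidx,   /* index array: the "B" in x[B[k]] */
                  const double *vals,     /* nonzero values */
                  const double *x,
                  double *y)
    {
        for (size_t i = 0; i < nrows; i++) {
            double sum = 0.0;
            for (size_t k = rowptr[i]; k < rowptr[i + 1]; k++) {
                /* x[colidx[k]] is the runtime-dependent indirect access:
                 * which cache lines of x are touched cannot be known
                 * until colidx's values are available at runtime. */
                sum += vals[k] * x[colidx[k]];
            }
            y[i] = sum;
        }
    }

The inner access x[colidx[k]] is exactly the A[B[i]] pattern from the abstract: an optimizer that wants to tile, prefetch, or vectorize the reads of x must reason about values that do not exist until the matrix is loaded.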
Mary Hall is the Director of the School of Computing at the University of Utah. Her research brings together compiler optimization and performance tuning for current and future high-performance architectures, applied to real-world applications. Professor Hall is an IEEE Fellow, an ACM Distinguished Scientist, and a member of the Computing Research Association Board of Directors. She actively participates in mentoring and outreach programs to encourage the participation of groups underrepresented in computer science.
Many of the implementation technology scaling trends the computing industry has historically relied on have started to taper off or are posing increasing design challenges. This has led to the proliferation of many-core processors and accelerators, advanced packaging technologies, and innovations in memory architecture that, in aggregate, enable increasingly sophisticated node organizations. These nodes, however, also incorporate complex memory organizations that necessitate careful data management to achieve high hardware utilization, making memory performance optimization particularly important. This talk will illustrate examples in memory access monitoring and memory-aware data structures that demonstrate the significant performance potential of treating memory as a first-class citizen. Motivated by these point examples, the talk will also highlight general areas for hardware-software collaborative research and innovation aimed at further improving memory performance in future computing systems.
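As one common instance of a memory-aware data-structure transformation (a generic example chosen here, not drawn from the talk itself), the C sketch below contrasts an array-of-structs layout with a struct-of-arrays layout for a field-wise sweep:

    /* AoS vs. SoA: a common memory-aware layout transformation.
     * Hypothetical illustration, not code from the talk. */
    #include <stddef.h>

    #define N 1000000

    /* AoS: each particle's fields are adjacent, so a loop that reads
     * only 'x' still drags y, z, and mass into cache with every line. */
    struct ParticleAoS { double x, y, z, mass; };

    /* SoA: each field is contiguous, so a field-wise sweep streams
     * exactly the bytes it needs at full memory bandwidth. */
    struct ParticlesSoA { double x[N], y[N], z[N], mass[N]; };

    double sum_x_aos(const struct ParticleAoS *p) {
        double s = 0.0;
        for (size_t i = 0; i < N; i++)
            s += p[i].x;      /* uses 8 of every 32 bytes fetched */
        return s;
    }

    double sum_x_soa(const struct ParticlesSoA *p) {
        double s = 0.0;
        for (size_t i = 0; i < N; i++)
            s += p->x[i];     /* uses every byte of each cache line */
        return s;
    }

Both functions compute the same sum, but the SoA version moves a quarter of the data through the memory hierarchy, illustrating why layout decisions of this kind matter when memory is treated as a first-class citizen.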
Nuwan Jayasena is a Research Fellow at Advanced Micro Devices. His recent research interests include memory systems, data movement optimization, accelerator architecture, near-memory and in-memory computing, and emerging application domains. Nuwan has an MS and a PhD in electrical engineering from Stanford University and a BS from the University of Southern California. He holds over 50 granted US patents and is a senior member of the IEEE. Prior to AMD, Nuwan was a processor architect at Stream Processors, Inc. and at Nvidia Corp.