Keynote Talk 1

Efficient DNN Training at Scale:
from Algorithms to Hardware

Gennady Pekhimenko (University of Toronto)

The recent popularity of deep neural networks (DNNs) has generated a lot of research interest in performing DNN-related computation efficiently. However, the focus of systems research is usually quite narrow: it is limited to inference (i.e., how to efficiently execute already trained models) and relies on image classification networks as the primary benchmarks for evaluation.
In this talk, we will demonstrate a holistic approach to DNN training acceleration and scalability, starting from algorithms, through software and hardware optimizations, to specialized development and optimization tools.
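
As a purely illustrative example of the kind of optimization tooling the talk touches on (this sketch is not from the talk itself), PyTorch's built-in profiler can break one training iteration into per-operator GPU time; the model and data below are hypothetical placeholders.

```python
# Minimal sketch: profiling a single training iteration with
# torch.profiler to see where GPU time goes (assumes a CUDA device).
import torch
import torch.nn as nn
from torch.profiler import profile, ProfilerActivity

# Placeholder model and batch -- stand-ins for a real training workload.
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10)).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
x = torch.randn(64, 1024, device="cuda")
y = torch.randint(0, 10, (64,), device="cuda")

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

# Rank operators by GPU time -- a typical first step when hunting
# for training bottlenecks.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```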

Gennady Pekhimenko joined the Computer Science Department at the University of Toronto as an Assistant Professor in Summer 2017, where he leads the EcoSystem research group. He is also a Faculty Member at the Vector Institute. His work is funded by Amazon, Facebook, Huawei, Xilinx, NVIDIA, IBM, NSERC, and CIFAR. From July 2016 until joining Toronto, he was a Researcher in the Systems Research Group at Microsoft Research in Redmond.

Gennady received his PhD from the Computer Science Department at Carnegie Mellon University, working with Professor Todd Mowry and Professor Onur Mutlu. His PhD work was funded by the NVIDIA Graduate, Microsoft Research, Qualcomm Innovation, and NSERC CGS-D Fellowships.


Keynote Talk 2

Training Large Language Models: Challenges and Opportunities

Mostofa Patwary (NVIDIA)

Language models with a large number of parameters, trained on massive datasets, can achieve state-of-the-art accuracy in various natural language processing applications, including summarization, automatic dialogue generation, translation, semantic search, and code autocompletion. However, training such models is challenging: they no longer fit in the memory of even the largest GPU and can require very long training times. Making their training a reality therefore requires innovations and breakthroughs across datasets, algorithms, software, and hardware. In this talk, I will present our efforts to train the Megatron-Turing Natural Language Generation model (MT-NLG), at 530 billion parameters the largest and most powerful monolithic transformer language model trained to date. I will also showcase several applications of MT-NLG and discuss the future research directions and numerous opportunities this model presents.
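
To make the memory challenge concrete, here is a rough back-of-envelope estimate (not from the talk itself) of why a 530-billion-parameter model cannot fit on a single GPU. The 16-bytes-per-parameter figure assumes standard mixed-precision training with Adam: fp16 weights and gradients plus fp32 master weights and two fp32 optimizer moments.

```python
# Back-of-envelope memory estimate for training a 530B-parameter model.
# Assumption: mixed-precision training with Adam, i.e. roughly
#   2 B (fp16 weights) + 2 B (fp16 gradients)
# + 4 B (fp32 master weights) + 8 B (two fp32 Adam moments)
# = 16 bytes per parameter, before counting activations.
PARAMS = 530e9
BYTES_PER_PARAM = 16

state_bytes = PARAMS * BYTES_PER_PARAM
print(f"Model + optimizer state: ~{state_bytes / 1e12:.1f} TB")  # ~8.5 TB

# Even an 80 GB GPU holds only a small slice of this, so the state
# must be sharded across on the order of a hundred GPUs or more via
# tensor, pipeline, and data parallelism.
GPU_MEM_BYTES = 80e9
print(f"GPUs just to hold the state: >= {state_bytes / GPU_MEM_BYTES:.0f}")
```

Activations add substantially more memory on top of this state, which is one reason techniques such as activation checkpointing become essential at this scale.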

Mostofa Patwary is a Principal Research Scientist in the Applied Deep Learning Research group at NVIDIA. His research areas include natural language processing, large-scale deep learning and machine learning, big data analytics, high-performance computing, parallel algorithms, and algorithm engineering.
Previously, he worked as a senior researcher at the Silicon Valley AI Lab at Baidu Research, at the Parallel Computing Lab at Intel Research, and at Northwestern University in Illinois.

Mostofa received his PhD from the Department of Informatics at the University of Bergen, Norway. During his PhD program, he also spent time as a research scholar at Purdue University, USA.

His bachelor's and master's degrees are from the Department of Computer Science and Engineering at Bangladesh University of Engineering and Technology (BUET), Bangladesh.
