Hardware Acceleration for Machine Learning (Spring 2019)

ECE 8893 B

Course Information

Instructor	Professor Tushar Krishna
Email	tushar@ece.gatech.edu
Office	KACB 2318
Office Hours	By Appointment

Lectures
Hours	Mon & Wed 6:00 – 7:20 pm
Room	Mason 3133

TA	Anand Samajdar
Email	anandsamajdar@gatech.edu
Office Hours	Thursday 11 am – 12 pm in VL C247

Prerequisite(s): ECE 4100 / ECE 6100 (Advanced Computer Architecture) or CS 4290 / CS 6290 (High Performance Computer Architecture). Simultaneous registration will be allowed.

Course Description

The recent resurgence of AI has been driven by synergistic advancements across big data sets, machine learning algorithms, and hardware. In particular, deep neural networks (DNNs) have demonstrated extremely promising results across image classification and speech recognition tasks, surpassing human accuracy. The high computational demands of DNNs, coupled with their pervasiveness across both cloud and IoT platforms, have led to a rise in specialized hardware accelerators for DNNs. Examples include Google’s TPU, Apple’s Neural Engine, Intel’s Nervana, ARM’s Project Trillium, and many more. In addition, GPU and FPGA architectures and libraries have also been evolving rapidly to accelerate DNNs.

Course Objectives: This course will present recent advances towards the goal of enabling efficient processing of DNNs. Specifically, it will provide an overview of DNNs, discuss various platforms and architectures that support DNNs, and highlight key trends in recent efficient processing techniques that reduce the computation and communication cost of DNNs either solely via hardware design changes or via joint hardware design and network algorithm changes. It will also summarize various development resources that can enable researchers and practitioners to quickly get started on DNN accelerator design, and highlight important benchmarking metrics and design considerations that should be used for evaluating the rapidly growing number of DNN hardware platforms being proposed in academia and industry.

Learning Outcomes: As part of this course, students will: understand the key design considerations for efficient DNN processing; understand tradeoffs between various hardware architectures and platforms; learn about micro-architectural knobs such as precision, data reuse, and parallelism to architect DNN accelerators given target area-power-performance metrics; evaluate the utility of various DNN dataflow techniques for efficient processing; and understand future trends and opportunities from ML algorithms down to emerging technologies.
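To make the three knobs named above (precision, data reuse, and parallelism) concrete, here is a minimal back-of-the-envelope sketch in Python. It is not taken from the course material: the `ConvLayer` class, its field names, and the example layer sizes are hypothetical, and the model assumes an unpadded convolution, one fetch per data element, and perfectly utilized processing elements (PEs).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvLayer:
    """Hypothetical description of one unpadded 2D convolution layer."""
    in_channels: int
    out_channels: int
    kernel: int        # square kernel, kernel x kernel
    out_height: int
    out_width: int
    stride: int = 1

    def macs(self) -> int:
        # One multiply-accumulate per (output pixel, output channel,
        # input channel, kernel position).
        return (self.out_height * self.out_width * self.out_channels
                * self.in_channels * self.kernel ** 2)

    def weight_count(self) -> int:
        return self.out_channels * self.in_channels * self.kernel ** 2

    def input_count(self) -> int:
        # Input size implied by the output size when there is no padding.
        in_height = (self.out_height - 1) * self.stride + self.kernel
        in_width = (self.out_width - 1) * self.stride + self.kernel
        return self.in_channels * in_height * in_width

    def output_count(self) -> int:
        return self.out_channels * self.out_height * self.out_width

    def total_elements(self) -> int:
        return self.weight_count() + self.input_count() + self.output_count()


def bytes_moved(layer: ConvLayer, bits_per_element: int) -> float:
    """Precision knob: minimum traffic if every element moves exactly once."""
    return layer.total_elements() * bits_per_element / 8


def macs_per_element(layer: ConvLayer) -> float:
    """Data reuse knob: upper bound on MACs performed per element fetched."""
    return layer.macs() / layer.total_elements()


def ideal_cycles(layer: ConvLayer, num_pes: int) -> int:
    """Parallelism knob: cycles if each PE does one MAC per cycle, no stalls."""
    return math.ceil(layer.macs() / num_pes)


# Assumed toy layer: 3 input channels, 8 filters of 3x3, 4x4 output map.
layer = ConvLayer(in_channels=3, out_channels=8, kernel=3,
                  out_height=4, out_width=4)
print("MACs:", layer.macs())
print("Bytes moved at 8-bit vs 32-bit:",
      bytes_moved(layer, 8), bytes_moved(layer, 32))
print("MACs per element fetched:", round(macs_per_element(layer), 2))
print("Ideal cycles on 64 PEs:", ideal_cycles(layer, 64))
```

These are idealized bounds only. A real accelerator's dataflow determines how much of the available reuse is actually captured on chip, which is the kind of tradeoff the outcomes above refer to.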

Course Structure and Content: The course will involve a mix of lectures interspersed with heavy paper reading and discussions. A semester-long, programming-heavy project will focus on developing a hardware accelerator for ML and (optionally) prototyping it on an FPGA. The material for this course will be derived from papers from recent computer architecture conferences (ISCA, MICRO, HPCA, ASPLOS) on hardware acceleration, papers from ML conferences (ICML, NIPS, ICLR) focusing on hardware-friendly optimizations, and blog articles from industry (Google, DeepMind, Baidu, Intel, ARM, Facebook).

Course Syllabus: ML-HW_CourseOutline_S19

Course Schedule: http://tinyurl.com/gt-hml2019

Slides

L01-Intro.pdf

Lab Assignments

Lab1A: Running MLP and CNN on CPUs
- Download Keras Files

FAQs

What is the required background? Do I need to know Machine Learning?
Some basic understanding of Machine Learning, especially Deep Neural Networks, will be useful. It is not an enforced requirement, but your time in the course will be better spent if you already know the algorithms at a high level and can focus on learning their HW implementation and optimizations (which is the focus of the course), rather than hearing about these algorithms for the first time.

Do I need to know how to use ML frameworks such as TensorFlow, PyTorch, etc.?
Not extensively, but having some experience running one of these frameworks will be useful.

What sort of coding background is needed? RTL? C++?
A bit of everything: scripting (e.g., Python), C++, Verilog, and some familiarity with CAD tools will be useful. That is the beauty and the challenge of hardware accelerator design.

Why is this a 2-3-3 and not a 3-0-3 course?
This is to emphasize the heavy lab and project focus of the course.

Honor Code:

Students are expected to abide by the Georgia Tech Academic Honor Code. Honest and ethical behavior is expected at all times. All incidents of suspected dishonesty will be reported to and handled by the office of student affairs. You must complete all assignments individually unless explicitly told otherwise. You may discuss assignments with classmates, but you may not copy any solution (or any part of a solution).