With the explosion of data created and uploaded across the Internet of Things, hand-held devices, PCs, and cloud and enterprise systems, there is a tremendous opportunity to apply machine learning and deep learning techniques to these terabytes of data and deliver breakthroughs in many domains. Deep learning in computer vision, speech recognition, video processing, and related areas has accelerated applications in manufacturing, robotics, business intelligence, autonomous driving, precision medicine, and digital surveillance, to name a few. Traditional machine learning algorithms such as Support Vector Machines, Principal Component Analysis, Alternating Least Squares, K-Means, and Decision Trees remain ubiquitous in product recommendations for online users, fraud detection, and financial services. There is a race to design parallel architectures that cover end-to-end workflows with low time-to-train while reaching state-of-the-art or better accuracies without overfitting, deliver low-latency inference, and maintain good TCO, performance per watt, and compute and memory efficiency. These neural network and mathematical algorithms urgently need architectural innovation in CPUs, GPUs, FPGAs, ASICs, memories, and on-chip interconnects to meet their latency and accuracy requirements. Mixed- and/or low-precision arithmetic, high-bandwidth stacked DRAM, systolic array processing, vector extensions in many-core and multi-core processors, special-purpose neural network instructions, and sparse and dense data structures are some of the ways in which GEMM operations, Winograd convolutions, ReLUs, fully connected layers, and other kernels are run optimally to achieve the expected accuracy, training, and inference requirements.
This workshop aims to bring together computer architecture, compiler, and AI/machine learning/deep learning researchers as well as domain experts to produce research at the confluence of these disciplines. It will be a venue for discussion and brainstorming on topics related to these areas.
The topics of interest include, but are not limited to:
Novel CPU, GPU, FPGA and ASIC architectures for AI and deep learning
Compiler and runtime system design for AI and data science
Evolution of demands of ML frameworks and workloads
Low precision and/or mixed precision arithmetic
Memory and storage technologies for AI (3D XPoint, HBM, stacked DRAM, and NVRAM)
Neural network instructions and special purpose units
AI workload design and development for accelerators
Hardware/software co-design for intelligent systems
End-to-end system flow optimizations
Conference date: September 10, 2017
Submission deadline:
Notification of acceptance:
Camera-ready deadline:
Registration deadline: