Tag: Google Cloud Platform
-
Distributed Cloud Computing for Machine Learning
The goal of this project is to understand how modern deep learning frameworks scale training across multiple devices. Rather than treating distributed training as a black box, this project focuses on the underlying principles that make large-scale learning possible, such as parallelism, communication, and synchronization. To explore these ideas in practice, I implement a custom…