Blake Hechtman
Cited by
Mesh-TensorFlow: Deep learning for supercomputers
N Shazeer, Y Cheng, N Parmar, D Tran, A Vaswani, P Koanantakool, ...
Advances in Neural Information Processing Systems 31, 2018
Scaling local self-attention for parameter efficient visual backbones
A Vaswani, P Ramachandran, A Srinivas, N Parmar, B Hechtman, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021
Heterogeneous-race-free memory models
DR Hower, BA Hechtman, BM Beckmann, BR Gaster, MD Hill, ...
Proceedings of the 19th International Conference on Architectural Support …, 2014
Scaling language models: Methods, analysis & insights from training gopher
JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ...
arXiv preprint arXiv:2112.11446, 2021
QuickRelease: A throughput-oriented approach to release consistency on GPUs
BA Hechtman, S Che, DR Hower, Y Tian, BM Beckmann, MD Hill, ...
2014 IEEE 20th International Symposium on High Performance Computer …, 2014
Evaluating cache coherent shared virtual memory for heterogeneous multicore chips
BA Hechtman, DJ Sorin
2013 IEEE International Symposium on Performance Analysis of Systems and …, 2013
GSPMD: general and scalable parallelization for ML computation graphs
Y Xu, HJ Lee, D Chen, B Hechtman, Y Huang, R Joshi, M Krikun, ...
arXiv preprint arXiv:2105.04663, 2021
Scale MLPerf-0.6 models on Google TPU-v3 Pods
S Kumar, V Bitorff, D Chen, C Chou, B Hechtman, HJ Lee, N Kumar, ...
arXiv preprint arXiv:1909.09756, 2019
Large-scale discrete Fourier transform on TPUs
T Lu, YF Chen, B Hechtman, T Wang, J Anderson
IEEE Access 9, 93422-93432, 2021
Automatic cross-replica sharding of weight update in data-parallel training
Y Xu, HJ Lee, D Chen, H Choi, B Hechtman, S Wang
arXiv preprint arXiv:2004.13336, 2020
Exploring the limits of concurrency in ML training on Google TPUs
S Kumar, Y Wang, C Young, J Bradbury, N Kumar, D Chen, A Swing
Proceedings of Machine Learning and Systems 3, 81-92, 2021
Hierarchical write-combining cache coherence
BA Hechtman, BM Beckmann
US Patent 9,396,112, 2016
Method for memory consistency among heterogeneous computer components
DR Hower, MD Hill, D Wood, SK Reinhardt, BR Gaster, BA Hechtman, ...
US Patent 9,361,118, 2016
Data remapping for heterogeneous processor
S Che, B Beckmann, B Hechtman
US Patent App. 14/055,221, 2015
A Flexible Approach to Autotuning Multi-Pass Machine Learning Compilers
PM Phothilimthana, A Sabne, N Sarda, KS Murthy, Y Zhou, ...
2021 30th International Conference on Parallel Architectures and Compilation …, 2021
Mechanisms to save user/kernel copy for cross device communications
BA Hechtman, S Che
US Patent 9,436,395, 2016
TREES: A CPU/GPU task-parallel runtime with explicit epoch synchronization
BA Hechtman, AD Hilton, DJ Sorin
arXiv preprint arXiv:1608.00571, 2016
Sequential consistency for heterogeneous-race-free
DR Hower, BM Beckmann, BR Gaster, BA Hechtman, MD Hill, ...
Memory Systems Performance and Correctness (MSPC), 2013
The limits of concurrency in cache coherence
BA Hechtman, DJ Sorin
Proceedings of the Workshop on Duplicating, Deconstructing and Debunking …, 2012
Unified scaling laws for routed language models
A Clark, D de Las Casas, A Guy, A Mensch, M Paganini, J Hoffmann, ...
International Conference on Machine Learning, 4057-4086, 2022