Blake Hechtman
Title · Cited by · Year
Mesh-TensorFlow: Deep learning for supercomputers
N Shazeer, Y Cheng, N Parmar, D Tran, A Vaswani, P Koanantakool, ...
Advances in neural information processing systems 31, 2018
Cited by 233 · 2018
Scaling local self-attention for parameter efficient visual backbones
A Vaswani, P Ramachandran, A Srinivas, N Parmar, B Hechtman, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021
Cited by 119 · 2021
Heterogeneous-race-free memory models
DR Hower, BA Hechtman, BM Beckmann, BR Gaster, MD Hill, ...
Proceedings of the 19th international conference on Architectural support …, 2014
Cited by 107 · 2014
Scaling language models: Methods, analysis & insights from training gopher
JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ...
arXiv preprint arXiv:2112.11446, 2021
Cited by 100 · 2021
QuickRelease: A throughput-oriented approach to release consistency on GPUs
BA Hechtman, S Che, DR Hower, Y Tian, BM Beckmann, MD Hill, ...
2014 IEEE 20th International Symposium on High Performance Computer …, 2014
Cited by 77 · 2014
Evaluating cache coherent shared virtual memory for heterogeneous multicore chips
BA Hechtman, DJ Sorin
2013 IEEE International Symposium on Performance Analysis of Systems and …, 2013
Cited by 28 · 2013
GSPMD: general and scalable parallelization for ML computation graphs
Y Xu, HJ Lee, D Chen, B Hechtman, Y Huang, R Joshi, M Krikun, ...
arXiv preprint arXiv:2105.04663, 2021
Cited by 25 · 2021
Scale MLPerf-0.6 models on Google TPU-v3 Pods
S Kumar, V Bitorff, D Chen, C Chou, B Hechtman, HJ Lee, N Kumar, ...
arXiv preprint arXiv:1909.09756, 2019
Cited by 24 · 2019
Large-scale discrete Fourier transform on TPUs
T Lu, YF Chen, B Hechtman, T Wang, J Anderson
IEEE Access 9, 93422-93432, 2021
Cited by 19 · 2021
Automatic cross-replica sharding of weight update in data-parallel training
Y Xu, HJ Lee, D Chen, H Choi, B Hechtman, S Wang
arXiv preprint arXiv:2004.13336, 2020
Cited by 15 · 2020
Exploring the limits of Concurrency in ML Training on Google TPUs
S Kumar, Y Wang, C Young, J Bradbury, N Kumar, D Chen, A Swing
Proceedings of Machine Learning and Systems 3, 81-92, 2021
Cited by 8 · 2021
Hierarchical write-combining cache coherence
BA Hechtman, BM Beckmann
US Patent 9,396,112, 2016
Cited by 8 · 2016
Method for memory consistency among heterogeneous computer components
DR Hower, MD Hill, D Wood, SK Reinhardt, BR Gaster, BA Hechtman, ...
US Patent 9,361,118, 2016
Cited by 8 · 2016
Data remapping for heterogeneous processor
S Che, B Beckmann, B Hechtman
US Patent App. 14/055,221, 2015
Cited by 5 · 2015
A Flexible Approach to Autotuning Multi-Pass Machine Learning Compilers
PM Phothilimthana, A Sabne, N Sarda, KS Murthy, Y Zhou, ...
2021 30th International Conference on Parallel Architectures and Compilation …, 2021
Cited by 3 · 2021
Mechanisms to save user/kernel copy for cross device communications
BA Hechtman, S Che
US Patent 9,436,395, 2016
Cited by 3 · 2016
TREES: A CPU/GPU task-parallel runtime with explicit epoch synchronization
BA Hechtman, AD Hilton, DJ Sorin
arXiv preprint arXiv:1608.00571, 2016
Cited by 3 · 2016
Sequential consistency for heterogeneous-race-free
DR Hower, BM Beckmann, BR Gaster, BA Hechtman, MD Hill, ...
Memory Systems Performance and Correctness (MSPC), 2013
Cited by 3 · 2013
The limits of concurrency in cache coherence
BA Hechtman, DJ Sorin
Proceedings of the Workshop on Duplicating, Deconstructing and Debunking …, 2012
Cited by 2 · 2012
Unified scaling laws for routed language models
A Clark, D de Las Casas, A Guy, A Mensch, M Paganini, J Hoffmann, ...
International Conference on Machine Learning, 4057-4086, 2022
Cited by 1 · 2022