Heterogeneous system coherence for integrated CPU-GPU systems J Power, A Basu, J Gu, S Puthoor, BM Beckmann, MD Hill, SK Reinhardt, ... Proceedings of the 46th Annual IEEE/ACM International Symposium on …, 2013 | 213 | 2013 |
Lost in abstraction: Pitfalls of analyzing GPUs at the intermediate language level A Gutierrez, BM Beckmann, A Dutu, J Gross, M LeBeane, J Kalamatianos, ... 2018 IEEE International Symposium on High Performance Computer Architecture …, 2018 | 100 | 2018 |
Adaptive GPU cache bypassing Y Tian, S Puthoor, JL Greathouse, BM Beckmann, DA Jiménez Proceedings of the 8th Workshop on General Purpose Processing using GPUs, 25-35, 2015 | 94 | 2015 |
Oversubscribed command queues in GPUs S Puthoor, X Tang, J Gross, BM Beckmann Proceedings of the 11th Workshop on General Purpose GPUs, 50-60, 2018 | 31 | 2018 |
Implementing directed acyclic graphs with the heterogeneous system architecture S Puthoor, AM Aji, S Che, M Daga, W Wu, BM Beckmann, G Rodgers Proceedings of the 9th Annual Workshop on General Purpose Processing using …, 2016 | 31 | 2016 |
A taxonomy of GPGPU performance scaling A Majumdar, G Wu, K Dev, JL Greathouse, I Paul, W Huang, ... 2015 IEEE International Symposium on Workload Characterization, 118-119, 2015 | 20 | 2015 |
Managing cache coherence using information in a page table A Basu, BM Beckmann, S Che, S Puthoor US Patent 10,019,377, 2018 | 15 | 2018 |
Optimizing GPU cache policies for MI workloads J Alsop, MD Sinclair, S Bharadwaj, A Dutu, A Gutierrez, O Kayiran, ... 2019 IEEE International Symposium on Workload Characterization (IISWC), 243-248, 2019 | 13 | 2019 |
Continuation analysis tasks for GPU task scheduling ST Tye, BL Sumner, BM Beckmann, S Puthoor US Patent 10,620,994, 2020 | 11 | 2020 |
Software assisted hardware cache coherence for heterogeneous processors A Basu, S Puthoor, S Che, BM Beckmann Proceedings of the Second International Symposium on Memory Systems, 279-288, 2016 | 11 | 2016 |
Conversation mirror/intercom SP Dykstra US Patent 6,690,803, 2004 | 11* | 2004 |
Predicting a context portion to move between a context buffer and registers based on context portions previously used by at least one other thread D Yudanov, S Blagodurov, A Basu, S Puthoor, JL Greathouse US Patent 10,019,283, 2018 | 10 | 2018 |
Hardware accelerated dynamic work creation on a graphics processing unit A Gutierrez, S Puthoor US Patent 10,963,299, 2021 | 9 | 2021 |
A Research Retrospective on AMD's Exascale Computing Journey GH Loh, MJ Schulte, M Ignatowski, V Adhinarayanan, S Aga, D Aguren, ... Proceedings of the 50th Annual International Symposium on Computer …, 2023 | 8 | 2023 |
Mechanism for mitigating information leak via cache side channels during speculative execution S Puthoor US Patent 11,231,931, 2022 | 7 | 2022 |
Compiler assisted coalescing S Puthoor, MH Lipasti Proceedings of the 27th International Conference on Parallel Architectures …, 2018 | 7 | 2018 |
A case for scoped persist barriers in GPUs D Gope, A Basu, S Puthoor, M Meswani Proceedings of the 11th Workshop on General Purpose GPUs, 2-12, 2018 | 7 | 2018 |
AMD gem5 APU simulator: Modeling GPUs Using the Machine ISA A Gutierrez, BM Beckmann, S Puthoor, MD Sinclair, T Ta, X Zhang Tutorial at International Symposium on Computer Architecture, 2018 | 6 | 2018 |
Dynamic wavefront creation for processing units using a hybrid compactor S Puthoor, BM Beckmann, D Yudanov US Patent 9,898,287, 2018 | 6 | 2018 |
Optimizing hyperplane sweep operations using asynchronous multi-grain GPU tasks AM Kaushik, AM Aji, MA Hassaan, N Chalmers, N Wolfe, S Moe, ... 2019 IEEE International Symposium on Workload Characterization (IISWC), 59-69, 2019 | 5 | 2019 |