Accelerating sparse DNN models without hardware-support via tile-wise sparsity
C Guo, BY Hsueh, J Leng, Y Qiu, Y Guan, Z Wang, X Jia, X Li, M Guo, ...
SC20: International Conference for High Performance Computing, Networking …, 2020
Cited by: 74; Year: 2020

Transkimmer: Transformer Learns to Layer-wise Skim
Y Guan, Z Li, J Leng, Z Lin, M Guo
Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022
Cited by: 28; Year: 2022

Block-skim: Efficient question answering for transformer
Y Guan, Z Li, Z Lin, Y Zhu, J Leng, M Guo
Proceedings of the AAAI Conference on Artificial Intelligence 36 (10), 10710 …, 2022
Cited by: 16; Year: 2022

How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention
Y Guan, J Leng, C Li, Q Chen, M Guo
Proceedings of the 28th International Conference on Computational …, 2020
Cited by: 16; Year: 2020

Co-Design of Binary Processing in Memory ReRAM Array and DNN Model Optimization Algorithm
Y Guan, T Ohsawa
IEICE Transactions on Electronics 103 (11), 685-692, 2020
Cited by: 5; Year: 2020

PAME: precision-aware multi-exit DNN serving for reducing latencies of batched inferences
S Zhang, W Cui, Q Chen, Z Zhang, Y Guan, J Leng, C Li, M Guo
Proceedings of the 36th ACM International Conference on Supercomputing, 1-12, 2022
Cited by: 4; Year: 2022

Co-Design of DNN Model Optimization for Binary ReRAM Array In-Memory Processing
Y Guan, T Ohsawa
2019 IEEE 11th International Memory Workshop (IMW), 1-4, 2019
Cited by: 4; Year: 2019

Amanda: Unified Instrumentation Framework for Deep Neural Networks
Y Guan, Y Qiu, J Leng, F Yang, S Yu, Y Liu, Y Feng, Y Zhu, L Zhou, ...
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Year: 2024

Accelerating Sparse DNNs Based on Tiled GEMM
C Guo, F Xue, J Leng, Y Qiu, Y Guan, W Cui, Q Chen, M Guo
arXiv preprint arXiv:2402.10876, 2024
Year: 2024