Mikhail Smelyanskiy

Cited by

	All	Since 2019
Citations	11808	8023
h-index	41	32
i10-index	92	68

Citations per year

Year	Citations
2009	51
2010	90
2011	178
2012	221
2013	278
2014	434
2015	474
2016	473
2017	545
2018	722
2019	990
2020	1316
2021	1526
2022	1765
2023	1915
2024	503

Public access (based on funding mandates)

13 articles available
1 article not available


Facebook

Verified email at intel.com

Deep learning HPC SW/HW co-design


Title	Authors	Venue	Cited by	Year
On large-batch training for deep learning: Generalization gap and sharp minima	NS Keskar, D Mudigere, J Nocedal, M Smelyanskiy, PTP Tang	arXiv preprint arXiv:1609.04836, 2016	3288	2016
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU	VW Lee, C Kim, J Chhugani, M Deisher, D Kim, AD Nguyen, N Satish, ...	Proceedings of the 37th annual international symposium on Computer …, 2010	1196	2010
Applied machine learning at Facebook: A datacenter infrastructure perspective	K Hazelwood, S Bird, D Brooks, S Chintala, U Diril, D Dzhulgakov, ...	2018 IEEE International Symposium on High Performance Computer Architecture …, 2018	694	2018
Deep learning recommendation model for personalization and recommendation systems	M Naumov, D Mudigere, HJM Shi, J Huang, N Sundaraman, J Park, ...	arXiv preprint arXiv:1906.00091, 2019	617	2019
Efficient sparse matrix-vector multiplication on x86-based many-core processors	X Liu, M Smelyanskiy, E Chow, P Dubey	Proceedings of the 27th international ACM conference on International …, 2013	329	2013
Glow: Graph lowering compiler techniques for neural networks	N Rotem, J Fix, S Abdulrasool, G Catron, S Deng, R Dzhabarov, N Gibson, ...	arXiv preprint arXiv:1805.00907, 2018	310	2018
A study of BFLOAT16 for deep learning training	D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...	arXiv preprint arXiv:1905.12322, 2019	293	2019
The architectural implications of Facebook's DNN-based personalized recommendation	U Gupta, CJ Wu, X Wang, M Naumov, B Reagen, D Brooks, B Cottel, ...	2020 IEEE International Symposium on High Performance Computer Architecture …, 2020	270	2020
Design and implementation of the Linpack benchmark for single and multi-node systems based on Intel® Xeon Phi coprocessor	A Heinecke, K Vaidyanathan, M Smelyanskiy, A Kobotov, R Dubtsov, ...	2013 IEEE 27th International Symposium on Parallel and Distributed …, 2013	215	2013
Exploring SIMD for molecular dynamics, using Intel® Xeon® processors and Intel® Xeon Phi coprocessors	SJ Pennycook, CJ Hughes, M Smelyanskiy, SA Jarvis	2013 IEEE 27th International symposium on parallel and distributed …, 2013	210	2013
Deep learning inference in Facebook data centers: Characterization, performance optimizations and hardware implications	J Park, M Naumov, P Basu, S Deng, A Kalaiah, D Khudia, J Law, P Malani, ...	arXiv preprint arXiv:1811.09886, 2018	199	2018
RecNMP: Accelerating personalized recommendation with near-memory processing	L Ke, U Gupta, BY Cho, D Brooks, V Chandra, U Diril, A Firoozshahian, ...	2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture …, 2020	187	2020
Petascale high order dynamic rupture earthquake simulations on heterogeneous supercomputers	A Heinecke, A Breuer, S Rettenberger, M Bader, AA Gabriel, C Pelties, ...	SC'14: Proceedings of the International Conference for High Performance …, 2014	167	2014
Anatomy of high-performance many-threaded matrix multiplication	TM Smith, R Van De Geijn, M Smelyanskiy, JR Hammond, FG Van Zee	2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014	163	2014
qHiPSTER: The quantum high performance software testing environment	M Smelyanskiy, NPD Sawaya, A Aspuru-Guzik	arXiv preprint arXiv:1601.07195, 2016	152	2016
Convergence of recognition, mining, and synthesis workloads and its implications	YK Chen, J Chhugani, P Dubey, CJ Hughes, D Kim, S Kumar, VW Lee, ...	Proceedings of the IEEE 96 (5), 790-807, 2008	149	2008
Practical optimization for hybrid quantum-classical algorithms	GG Guerreschi, M Smelyanskiy	arXiv preprint arXiv:1701.01450, 2017	148	2017
Can traditional programming bridge the ninja performance gap for parallel computing applications?	N Satish, C Kim, J Chhugani, H Saito, R Krishnaiyer, M Smelyanskiy, ...	ACM SIGARCH Computer Architecture News 40 (3), 440-451, 2012	146	2012
The BLIS framework: Experiments in portability	FG Van Zee, TM Smith, B Marker, TM Low, RAVD Geijn, FD Igual, ...	ACM Transactions on Mathematical Software (TOMS) 42 (2), 1-19, 2016	126	2016
Mapping high-fidelity volume rendering for medical imaging to CPU, GPU and many-core architectures	M Smelyanskiy, D Holmes, J Chhugani, A Larson, DM Carmean, ...	IEEE transactions on visualization and computer graphics 15 (6), 1563-1570, 2009	112	2009


Articles 1–20
