Deep graph infomax | P Veličković, W Fedus, WL Hamilton, P Liò, Y Bengio, RD Hjelm | arXiv preprint arXiv:1809.10341, 2018 | 1794 | 2018 |
PaLM: Scaling language modeling with pathways | A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... | arXiv preprint arXiv:2204.02311, 2022 | 1540 | 2022 |
Switch Transformers: Scaling to trillion parameter models with simple and efficient sparsity | W Fedus, B Zoph, N Shazeer | The Journal of Machine Learning Research 23 (1), 5232-5270, 2022 | 905 | 2022 |
Emergent abilities of large language models | J Wei, Y Tay, R Bommasani, C Raffel, B Zoph, S Borgeaud, D Yogatama, ... | arXiv preprint arXiv:2206.07682, 2022 | 641 | 2022 |
MaskGAN: Better Text Generation via Filling in the ______ | W Fedus, I Goodfellow, AM Dai | International Conference on Learning Representations (ICLR 2018), 2018 | 555 | 2018 |
In silico labeling: Predicting fluorescent labels in unlabeled images | E Christiansen, SJ Yang, DM Ando, A Javaherian, ... | Cell, 2018 | 524 | 2018 |
Scaling instruction-finetuned language models | HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, E Li, X Wang, ... | arXiv preprint arXiv:2210.11416, 2022 | 510 | 2022 |
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models | A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... | arXiv preprint arXiv:2206.04615, 2022 | 331 | 2022 |
Revisiting ResNets: Improved training and scaling strategies | I Bello, W Fedus, X Du, ED Cubuk, A Srinivas, TY Lin, J Shlens, B Zoph | Advances in Neural Information Processing Systems 34, 22614-22627, 2021 | 245 | 2021 |
Many Paths to Equilibrium: GANs Do Not Need to Decrease a Divergence At Every Step | W Fedus, M Rosca, B Lakshminarayanan, AM Dai, S Mohamed, ... | International Conference on Learning Representations (ICLR 2018), 2017 | 240 | 2017 |
The case for a directional dark matter detector and the status of current experimental efforts | S Ahlen, N Afshordi, JBR Battat, J Billard, N Bozorgnia, S Burgos, ... | International Journal of Modern Physics A 25 (01), 1-51, 2010 | 238 | 2010 |
Language GANs Falling Short | M Caccia, L Caccia, W Fedus, H Larochelle, J Pineau, L Charlin | International Conference on Learning Representations (ICLR 2020), 2018 | 200 | 2018 |
Revisiting fundamentals of experience replay | W Fedus, P Ramachandran, R Agarwal, Y Bengio, H Larochelle, ... | International Conference on Machine Learning, 3061-3071, 2020 | 187 | 2020 |
GLaM: Efficient scaling of language models with mixture-of-experts | N Du, Y Huang, AM Dai, S Tong, D Lepikhin, Y Xu, M Krikun, Y Zhou, ... | International Conference on Machine Learning, 5547-5569, 2022 | 176 | 2022 |
ChatGPT: Optimizing language models for dialogue | J Schulman, B Zoph, C Kim, J Hilton, J Menick, J Weng, JFC Uribe, ... | OpenAI blog, 2022 | 154 | 2022 |
First dark matter search results from a surface run of the 10-L DMTPC directional dark matter detector | S Ahlen, JBR Battat, T Caldwell, C Deaconu, D Dujmic, W Fedus, P Fisher, ... | Physics Letters B 695 (1-4), 124-129, 2011 | 105 | 2011 |
Hyperbolic discounting and learning over multiple horizons | W Fedus, C Gelada, Y Bengio, MG Bellemare, H Larochelle | Reinforcement Learning and Decision Making (RLDM 2019), 2019 | 93 | 2019 |
Do transformer modifications transfer across implementations and applications? | S Narang, HW Chung, Y Tay, W Fedus, T Fevry, M Matena, K Malkan, ... | arXiv preprint arXiv:2102.11972, 2021 | 87* | 2021 |
On bonus-based exploration methods in the Arcade Learning Environment | AA Taiga, W Fedus, MC Machado, A Courville, MG Bellemare | arXiv preprint arXiv:2109.11052, 2021 | 76* | 2021 |
Recall Traces: Backtracking Models for Efficient Reinforcement Learning | A Goyal, P Brakel, W Fedus, T Lillicrap, S Levine, H Larochelle, Y Bengio | International Conference on Learning Representations (ICLR 2019), 2018 | 64 | 2018 |