Ian Osband

Cited by

	All	Since 2019
Citations	8078	7235
h-index	26	25
i10-index	31	30

1600

800

400

1200

201520162017201820192020202120222023202427 74 223 474 755 1148 1366 1485 1521 958

Co-authors

Benjamin Van RoyStanford UniversityVerified email at stanford.edu
Zheng WenGoogle DeepMindVerified email at google.com
Vikranth DwaracherlaDeepMindVerified email at google.com
Xiuyuan LuGoogle DeepMindVerified email at google.com
Daniel RussoColumbia UniversityVerified email at gsb.columbia.edu
Alexander PritzelDeepmindVerified email at google.com
Morteza IbrahimiStanford UniversityVerified email at stanford.edu
Brendan O'DonoghueStanford University, Google DeepMindVerified email at alumni.stanford.edu
Mohammad Gheshlaghi AzarCohereVerified email at google.com
Todd HesterWaymoVerified email at waymo.com
Bilal PiotGoogle DeepmindVerified email at google.com
Olivier PietquinCohere | ex Google DeepMind (On leave - Professor at University of Lille)Verified email at univ-lille.fr
Tom SchaulSenior Staff Scientist, DeepMindVerified email at nyu.edu
Rémi MunosGoogle DeepMindVerified email at inria.fr
Marc LanctotResearch Scientist, Google DeepMindVerified email at google.com

Ian Osband

OpenAI

Verified email at openai.com - Homepage

Reinforcement Learning


Title Sort by citations Sort by year Sort by title	Cited by Cited by	Year
Deep exploration via bootstrapped DQN I Osband, C Blundell, A Pritzel, B Van Roy Advances in neural information processing systems 29, 2016	1468	2016
Deep q-learning from demonstrations T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ... Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018	1227	2018
A tutorial on thompson sampling DJ Russo, B Van Roy, A Kazerouni, I Osband, Z Wen Foundations and Trends® in Machine Learning 11 (1), 1-96, 2018	1133	2018
Minimax regret bounds for reinforcement learning MG Azar, I Osband, R Munos International conference on machine learning, 263-272, 2017	825	2017
Randomized prior functions for deep reinforcement learning I Osband, J Aslanides, A Cassirer Advances in Neural Information Processing Systems 31, 2018	416	2018
Deep Exploration via Randomized Value Functions I Osband https://searchworks.stanford.edu/view/11891201, 2016	332	2016
Generalization and exploration via randomized value functions I Osband, B Van Roy, Z Wen International Conference on Machine Learning, 2377-2386, 2016	331	2016
Why is posterior sampling better than optimism for reinforcement learning? I Osband, B Van Roy International conference on machine learning, 2701-2710, 2017	269	2017
The uncertainty bellman equation and exploration B O’Donoghue, I Osband, R Munos, V Mnih International conference on machine learning, 3836-3845, 2018	220	2018
Model-based reinforcement learning and the eluder dimension I Osband, B Van Roy Advances in Neural Information Processing Systems 27, 2014	195	2014
Behaviour suite for reinforcement learning I Osband, Y Doron, M Hessel, J Aslanides, E Sezener, A Saraiva, ... arXiv preprint arXiv:1908.03568, 2019	180	2019
Learning from demonstrations for real world reinforcement learning T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, A Sendonaris, ... arXiv preprint arXiv:1704.03732, 2017	179	2017
Risk versus Uncertainty in Deep Learning: Bayes, Bootstrap and the Dangers of Dropout I Osband http://bayesiandeeplearning.org/papers/BDL_4.pdf, 0	166*
Deep learning for time series modeling E Busseti, I Osband, S Wong Technical report, Stanford University, 1-5, 2012	140	2012
Near-optimal reinforcement learning in factored mdps I Osband, B Van Roy Advances in Neural Information Processing Systems 27, 2014	121	2014
On lower bounds for regret in reinforcement learning I Osband, B Van Roy arXiv preprint arXiv:1608.02732, 2016	113	2016
Bootstrapped thompson sampling and deep exploration I Osband, B Van Roy arXiv preprint arXiv:1507.00300, 2015	101	2015
(More) efficient reinforcement learning via posterior sampling I Osband, D Russo, B Van Roy Advances in Neural Information Processing Systems 26, 2013	101	2013
Epistemic neural networks I Osband, Z Wen, SM Asghari, V Dwaracherla, M Ibrahimi, X Lu, ... Advances in Neural Information Processing Systems 36, 2024	90	2024
Meta-learning of sequential strategies PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ... arXiv preprint arXiv:1905.03030, 2019	90	2019

The system can't perform the operation now. Try again later.

Articles 1–20

Citations per year

Duplicate citations

Merged citations

Add co-authorsCo-authors

Follow

Cited by

Co-authors