Marc Lanctot
Research Scientist, Google DeepMind
Verified email at google.com
Title | Cited by | Year
Mastering the game of Go with deep neural networks and tree search
D Silver, A Huang, CJ Maddison, A Guez, L Sifre, G Van Den Driessche, ...
Nature 529 (7587), 484-489, 2016
Cited by 5452 (2016)
Dueling Network Architectures for Deep Reinforcement Learning
Z Wang, T Schaul, M Hessel, H van Hasselt, M Lanctot, N de Freitas
arXiv preprint arXiv:1511.06581, 2016
Cited by 669 (2016)
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
D Silver, T Hubert, J Schrittwieser, I Antonoglou, M Lai, A Guez, M Lanctot, ...
Science 362 (6419), 1140-1144, 2018
Cited by 568* (2018)
Deep Q-learning from Demonstrations
T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ...
Association for the Advancement of Artificial Intelligence (AAAI), 2018
Cited by 196 (2018)
Multi-agent Reinforcement Learning in Sequential Social Dilemmas
JZ Leibo, V Zambaldi, M Lanctot, J Marecki, T Graepel
AAMAS, 2017
Cited by 161 (2017)
Monte Carlo sampling for regret minimization in extensive games
M Lanctot, K Waugh, M Zinkevich, M Bowling
Advances in Neural Information Processing Systems, 1078-1086, 2009
Cited by 147 (2009)
A unified game-theoretic approach to multiagent reinforcement learning
M Lanctot, V Zambaldi, A Gruslys, A Lazaridou, K Tuyls, J Pérolat, D Silver, ...
arXiv preprint arXiv:1711.00832, 2017
Cited by 88 (2017)
Adversarial planning through strategy simulation
F Sailer, M Buro, M Lanctot
2007 IEEE Symposium on Computational Intelligence and Games, 80-87, 2007
Cited by 84 (2007)
Real-Time Monte-Carlo Tree Search in Ms Pac-Man
T Pepels, MHM Winands, M Lanctot
IEEE Transactions on Computational Intelligence and AI in Games, 2014
Cited by 64 (2014)
Memory-efficient backpropagation through time
A Gruslys, R Munos, I Danihelka, M Lanctot, A Graves
Advances In Neural Information Processing Systems, 4125-4133, 2016
Cited by 61 (2016)
Fictitious Self-Play in Extensive-Form Games
J Heinrich, M Lanctot, D Silver
International Conference on Machine Learning, 2015
Cited by 58 (2015)
No-Regret Learning in Extensive-Form Games with Imperfect Recall
M Lanctot, R Gibson, N Burch, M Zinkevich, M Bowling
International Conference on Machine Learning, 2012
Cited by 58 (2012)
Value-Decomposition Networks For Cooperative Multi-Agent Learning
P Sunehag, G Lever, A Gruslys, WM Czarnecki, V Zambaldi, M Jaderberg, ...
arXiv preprint arXiv:1706.05296, 2017
Cited by 57* (2017)
Convolution by evolution: Differentiable pattern producing networks
C Fernando, D Banarse, M Reynolds, F Besse, D Pfau, M Jaderberg, ...
Proceedings of the Genetic and Evolutionary Computation Conference 2016, 109-116, 2016
Cited by 57 (2016)
Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization
M Johanson, N Bard, M Lanctot, R Gibson, M Bowling
Proceedings of the 11th International Conference on Autonomous Agents and …, 2012
Cited by 48 (2012)
Computing approximate Nash equilibria and robust best-responses using sampling
M Ponsen, S De Jong, M Lanctot
Journal of Artificial Intelligence Research, 575-605, 2011
Cited by 36 (2011)
Monte Carlo tree search with heuristic evaluations using implicit minimax backups
M Lanctot, MHM Winands, T Pepels, NR Sturtevant
Computational Intelligence and Games (CIG), 2014 IEEE Conference on, 1-8, 2014
Cited by 31 (2014)
Online Monte Carlo Counterfactual Regret Minimization for Search in Imperfect Information Games
V Lisý, M Lanctot, M Bowling
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2015
Cited by 23* (2015)
Monte Carlo Sampling and Regret Minimization For Equilibrium Computation and Decision-Making in Large Extensive Form Games
M Lanctot
University of Alberta, Edmonton, Canada, 2013
Cited by 22 (2013)
Efficient monte carlo counterfactual regret minimization in games with many player actions
N Burch, M Lanctot, D Szafron, RG Gibson
Advances in Neural Information Processing Systems, 1880-1888, 2012
Cited by 22 (2012)