Odalric-Ambrym Maillard
Inria Lille - Nord Europe
Verified email at inria.fr
Title
Cited by
Year
Kullback–Leibler upper confidence bounds for optimal sequential allocation
O Cappé, A Garivier, OA Maillard, R Munos, G Stoltz
Annals of Statistics 41 (3), 1516-1541, 2013
Cited by: 303; Year: 2013
A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences
OA Maillard, R Munos, G Stoltz
Proceedings of the 24th annual Conference On Learning Theory, 497-514, 2011
Cited by: 124; Year: 2011
Compressed least-squares regression
OA Maillard, R Munos
Cited by: 121; Year: 2009
Concentration inequalities for sampling without replacement
R Bardenet, OA Maillard
Bernoulli 21 (3), 1361-1385, 2015
Cited by: 113; Year: 2015
LSTD with random projections
M Ghavamzadeh, A Lazaric, OA Maillard, R Munos
Cited by: 65; Year: 2010
Latent Bandits.
OA Maillard, S Mannor
International Conference on Machine Learning, 136-144, 2014
Cited by: 60; Year: 2014
Linear regression with random projections
O Maillard, R Munos
Journal of Machine Learning Research 13, 2735-2772, 2012
Cited by: 49; Year: 2012
Robust risk-averse stochastic multi-armed bandits
OA Maillard
International Conference on Algorithmic Learning Theory, 218-233, 2013
Cited by: 42; Year: 2013
Finite-sample analysis of Bellman residual minimization
OA Maillard, R Munos, A Lazaric, M Ghavamzadeh
Proceedings of 2nd Asian Conference on Machine Learning, 299-314, 2010
Cited by: 40; Year: 2010
The non-stationary stochastic multi-armed bandit problem
R Allesiardo, R Féraud, OA Maillard
International Journal of Data Science and Analytics 3 (4), 267-283, 2017
Cited by: 37; Year: 2017
Sub-sampling for multi-armed bandits
A Baransi, OA Maillard, S Mannor
Joint European Conference on Machine Learning and Knowledge Discovery in …, 2014
Cited by: 37; Year: 2014
"How hard is my MDP?" "The distribution-norm to the rescue"
OA Maillard, TA Mann, S Mannor
Advances in Neural Information Processing Systems 27, 1835-1843, 2014
Cited by: 37; Year: 2014
Selecting the state-representation in reinforcement learning
OA Maillard, R Munos, D Ryabko
arXiv preprint arXiv:1302.2552, 2013
Cited by: 36; Year: 2013
Adaptive Bandits: Towards the best history-dependent strategy
OA Maillard, R Munos
Cited by: 30*; Year: 2011
Variance-aware regret bounds for undiscounted reinforcement learning in MDPs
MS Talebi, OA Maillard
Algorithmic Learning Theory, 770-805, 2018
Cited by: 29; Year: 2018
Online learning in adversarial Lipschitz environments
OA Maillard, R Munos
Joint European Conference on Machine Learning and Knowledge Discovery in …, 2010
Cited by: 29; Year: 2010
Hybrid collaborative filtering with autoencoders
F Strub, J Mary, R Gaudel
arXiv preprint arXiv:1603.00806, 2016
Cited by: 28; Year: 2016
Optimal regret bounds for selecting the state representation in reinforcement learning
OA Maillard, P Nguyen, R Ortner, D Ryabko
International Conference on Machine Learning, 543-551, 2013
Cited by: 26; Year: 2013
Selecting near-optimal approximate state representations in reinforcement learning
R Ortner, OA Maillard, D Ryabko
International Conference on Algorithmic Learning Theory, 140-154, 2014
Cited by: 22; Year: 2014
Streaming kernel regression with provably adaptive mean, variance, and regularization
A Durand, OA Maillard, J Pineau
The Journal of Machine Learning Research 19 (1), 650-683, 2018
Cited by: 21; Year: 2018