Mohammad Ghavamzadeh
Verified email at google.com - Homepage
Title
Cited by
Year
Natural actor–critic algorithms
S Bhatnagar, RS Sutton, M Ghavamzadeh, M Lee
Automatica 45 (11), 2471-2482, 2009
Cited by: 482, Year: 2009
Bayesian reinforcement learning: A survey
M Ghavamzadeh, S Mannor, J Pineau, A Tamar
arXiv preprint arXiv:1609.04436, 2016
Cited by: 244, Year: 2016
Best arm identification: A unified approach to fixed budget and fixed confidence
V Gabillon, M Ghavamzadeh, A Lazaric
NIPS-Twenty-Sixth Annual Conference on Neural Information Processing Systems, 2012
Cited by: 204, Year: 2012
Incremental natural actor-critic algorithms
S Bhatnagar, M Ghavamzadeh, M Lee, RS Sutton
Advances in neural information processing systems 20, 105-112, 2007
Cited by: 174, Year: 2007
High-confidence off-policy evaluation
P Thomas, G Theocharous, M Ghavamzadeh
Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015
Cited by: 166, Year: 2015
Regularized policy iteration
AM Farahmand, M Ghavamzadeh, C Szepesvári, S Mannor
NIPS, 441-448, 2008
Cited by: 158, Year: 2008
Hierarchical multi-agent reinforcement learning
R Makar, S Mahadevan, M Ghavamzadeh
Proceedings of the fifth international conference on Autonomous agents, 246-253, 2001
Cited by: 154, Year: 2001
Supervised actor-critic reinforcement learning
MT Rosenstein, AG Barto, J Si, A Barto, W Powell
Learning and Approximate Dynamic Programming: Scaling Up to the Real World …, 2004
Cited by: 150, Year: 2004
A Lyapunov-based approach to safe reinforcement learning
Y Chow, O Nachum, E Duenez-Guzman, M Ghavamzadeh
arXiv preprint arXiv:1805.07708, 2018
Cited by: 148, Year: 2018
Hierarchical multi-agent reinforcement learning
M Ghavamzadeh, S Mahadevan, R Makar
Autonomous Agents and Multi-Agent Systems 13 (2), 197-229, 2006
Cited by: 143, Year: 2006
Risk-constrained reinforcement learning with percentile risk criteria
Y Chow, M Ghavamzadeh, L Janson, M Pavone
The Journal of Machine Learning Research 18 (1), 6070-6120, 2017
Cited by: 127, Year: 2017
High confidence policy improvement
P Thomas, G Theocharous, M Ghavamzadeh
International Conference on Machine Learning, 2380-2388, 2015
Cited by: 124, Year: 2015
Finite-Sample Analysis of Proximal Gradient TD Algorithms
B Liu, J Liu, M Ghavamzadeh, S Mahadevan, M Petrik
UAI, 504-513, 2015
Cited by: 115, Year: 2015
Bayesian multi-task reinforcement learning
A Lazaric, M Ghavamzadeh
ICML-27th International Conference on Machine Learning, 599-606, 2010
Cited by: 104, Year: 2010
More robust doubly robust off-policy evaluation
M Farajtabar, Y Chow, M Ghavamzadeh
International Conference on Machine Learning, 1447-1456, 2018
Cited by: 100, Year: 2018
Ad recommendation systems for life-time value optimization
G Theocharous, PS Thomas, M Ghavamzadeh
Proceedings of the 24th International Conference on World Wide Web, 1305-1310, 2015
Cited by: 98, Year: 2015
Speedy Q-learning
MG Azar, R Munos, M Ghavamzadeh, HJ Kappen
Spain, Granada: NIPS, 2011
Cited by: 98, Year: 2011
Multi-bandit best arm identification
V Gabillon, M Ghavamzadeh, A Lazaric, S Bubeck
Cited by: 93, Year: 2011
Algorithms for CVaR optimization in MDPs
Y Chow, M Ghavamzadeh
arXiv preprint arXiv:1406.3339, 2014
Cited by: 91, Year: 2014
Finite-sample analysis of least-squares policy iteration
A Lazaric, M Ghavamzadeh, R Munos
Journal of Machine Learning Research 13, 3041-3074, 2012
Cited by: 89, Year: 2012