Almost Surely Safe Alignment of Large Language Models at Inference-Time X Ji, SS Ramesh, M Zimmer, I Bogunovic, J Wang, HB Ammar arXiv preprint arXiv:2502.01208, 2025 | | 2025 |
Mixture of Attentions For Speculative Decoding M Zimmer, M Gritta, G Lampouras, HB Ammar, J Wang arXiv preprint arXiv:2410.03804, 2024 | | 2024 |
ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning CE Mower, Y Wan, H Yu, A Grosnit, J Gonzalez-Billandon, M Zimmer, ... arXiv preprint arXiv:2406.19741, 2024 | 8 | 2024 |
A survey on interpretable reinforcement learning C Glanois, P Weng, M Zimmer, D Li, T Yang, J Hao, W Liu Machine Learning, 1-44, 2024 | 107 | 2024 |
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control Z Xiong, R Vuorio, J Beck, M Zimmer, K Shao, S Whiteson arXiv preprint arXiv:2402.06570, 2024 | 1 | 2024 |
Pangu-Agent: A fine-tunable generalist agent with structured reasoning F Christianos, G Papoudakis, M Zimmer, T Coste, Z Wu, J Chen, ... arXiv preprint arXiv:2312.14878, 2023 | 17 | 2023 |
End-to-end meta-bayesian optimisation with transformer neural processes A Maraval, M Zimmer, A Grosnit, H Bou Ammar Advances in Neural Information Processing Systems 36, 11246-11260, 2023 | 17 | 2023 |
Automatic Unit Test Data Generation and Actor-Critic Reinforcement Learning for Code Synthesis PJ Gorinski, M Zimmer, G Lampouras, DGX Deik, I Iacobacci arXiv preprint arXiv:2310.13669, 2023 | 2 | 2023 |
Lightweight Structural Choices Operator for Technology Mapping A Grosnit, M Zimmer, R Tutunov, X Li, L Chen, F Yang, M Yuan, ... 2023 60th ACM/IEEE Design Automation Conference (DAC), 1-6, 2023 | 2 | 2023 |
Neuro-symbolic hierarchical rule induction C Glanois, Z Jiang, X Feng, P Weng, M Zimmer, D Li, W Liu, J Hao International Conference on Machine Learning, 7583-7615, 2022 | 33 | 2022 |
Sample-efficient optimisation with probabilistic transformer surrogates A Maraval, M Zimmer, A Grosnit, R Tutunov, J Wang, HB Ammar arXiv preprint arXiv:2205.13902, 2022 | 1 | 2022 |
Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning C Glanois, M Zimmer, U Siddique, P Weng Proceedings of Machine Learning Research 139, 2021 | | 2021 |
Learning fair policies in decentralized cooperative multi-agent reinforcement learning M Zimmer, C Glanois, U Siddique, P Weng International Conference on Machine Learning, 12967-12978, 2021 | 69 | 2021 |
Hyperparameter auto-tuning in self-supervised robotic learning J Huang, J Rojas, M Zimmer, H Wu, Y Guan, P Weng IEEE Robotics and Automation Letters 6 (2), 3537-3544, 2021 | 11 | 2021 |
Differentiable logic machines M Zimmer, X Feng, C Glanois, Z Jiang, J Zhang, P Weng, D Li, J Hao, ... arXiv preprint arXiv:2102.11529, 2021 | 27 | 2021 |
Learning fair policies in multi-objective (deep) reinforcement learning with average and discounted rewards U Siddique, P Weng, M Zimmer International Conference on Machine Learning, 8905-8915, 2020 | 106 | 2020 |
Invariant transform experience replay: Data augmentation for deep reinforcement learning Y Lin, J Huang, M Zimmer, Y Guan, J Rojas, P Weng IEEE Robotics and Automation Letters 5 (4), 6615-6622, 2020 | 50 | 2020 |
Towards More Sample Efficiency in Reinforcement Learning with Data Augmentation Y Lin, J Huang, M Zimmer, J Rojas, P Weng Robot Learning Workshop, NeurIPS 2019, 2019 | 5 | 2019 |
An efficient reinforcement learning algorithm for learning deterministic policies in continuous domains M Zimmer, P Weng Proceedings of the First International Conference on Distributed Artificial …, 2019 | | 2019 |
Exploiting the sign of the advantage function to learn deterministic policies in continuous domains M Zimmer, P Weng International Joint Conference on Artificial Intelligence, 2019 | 12 | 2019 |