Zephyr: Direct distillation of lm alignment L Tunstall, E Beeching, N Lambert, N Rajani, K Rasul, Y Belkada, ... arXiv preprint arXiv:2310.16944, 2023 | 421 | 2023 |
Open llm leaderboard E Beeching, C Fourrier, N Habib, S Han, N Lambert, N Rajani, ... Hugging Face, 2023 | 258 | 2023 |
Trl: Transformer reinforcement learning L von Werra, Y Belkada, L Tunstall, E Beeching, T Thrush, N Lambert, ... GitHub. Available online at: https://github.com/lvwerra/trl, 2020 | 178 | 2020 |
The alignment handbook L Tunstall, E Beeching, N Lambert, N Rajani, S Huang, K Rasul, AM Rush, ... | 44 | 2023 |
Learning to plan with uncertain topological maps E Beeching, J Dibangoye, O Simonin, C Wolf European Conference on Computer Vision, 473-490, 2020 | 44 | 2020 |
Deep reinforcement learning on a budget: 3d control and reasoning without a supercomputer E Beeching, J Dibangoye, O Simonin, C Wolf 2020 25th International Conference on Pattern Recognition (ICPR), 158-165, 2021 | 32 | 2021 |
Egomap: Projective mapping and structured egocentric memory for deep RL E Beeching, J Dibangoye, O Simonin, C Wolf Joint European conference on machine learning and knowledge discovery in …, 2020 | 26 | 2020 |
Creating a Coding Assistant with StarCoder. Hugging Face Blog (2023) L Tunstall, N Lambert, N Rajani, E Beeching, T Le Scao, L von Werra, ... | 19 | 2023 |
Stackllama: An rl fine-tuned llama model for stack exchange question and answering, 2023 E Beeching, Y Belkada, K Rasul, L Tunstall, L von Werra, N Rajani, ... URL https://huggingface.co/blog/stackllama 1 (4.1), 4.1, 2023 | 19 | 2023 |
Godot reinforcement learning agents E Beeching, J Dibangoye, O Simonin, C Wolf arXiv preprint arXiv:2112.03636, 2021 | 17 | 2021 |
Creating a coding assistant with starcoder L Tunstall, N Lambert, N Rajani, E Beeching, T Le Scao, L von Werra, ... Hugging Face Blog, 2023 | 15 | 2023 |
Zephyr: Direct distillation of lm alignment, 2023 L Tunstall, E Beeching, N Lambert, N Rajani, K Rasul, Y Belkada, ... URL https://arxiv.org/abs/2310.16944 6, 2023 | 13 | 2023 |
Open llm leaderboard (2023-2024) E Beeching, C Fourrier, N Habib, S Han, N Lambert, N Rajani, ... | 12 | 2023 |
No robots N Rajani, L Tunstall, E Beeching, N Lambert, AM Rush, T Wolf | 12 | 2023 |
Trl: Transformer reinforcement learning (2020) L von Werra, Y Belkada, L Tunstall, E Beeching, T Thrush, N Lambert, ... URL https://github.com/huggingface/trl, 0 | 12 | |
Open LLM Leaderboard. 2023 E Beeching, C Fourrier, N Habib, S Han, N Lambert, N Rajani, ... URL https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard, 0 | 12 | |
Graph augmented deep reinforcement learning in the gamerland3d environment E Beeching, M Peter, P Marcotte, J Dibangoye, O Simonin, J Romoff, ... arXiv preprint arXiv:2112.11731, 2021 | 9 | 2021 |
Can foundation models label data like humans? N Rajani, N Lambert, S Han, J Wang, O Nitski, E Beeching, L Tunstall Hugging Face Blog, 2023 | 7 | 2023 |
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Q Gallouédec, E Beeching, C Romac, E Dellandréa arXiv preprint arXiv:2402.09844, 2024 | 6 | 2024 |
Numinamath: The largest public dataset in ai4maths with 860k pairs of competition math problems and solutions J Li, E Beeching, L Tunstall, B Lipkin, R Soletskyi, S Huang, K Rasul, L Yu, ... Hugging Face repository, 2024 | 2 | 2024 |