WEI SHEN
Verified email at m.fudan.edu.cn - Homepage
Title
Cited by
Year
Secrets of RLHF in Large Language Models Part I: PPO
R Zheng, S Dou, S Gao, Y Hua, W Shen, B Wang, Y Liu, S Jin, Q Liu, ...
NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following (best …, 2023
53* · 2023
Secrets of RLHF in Large Language Models Part II: Reward Modeling
B Wang, R Zheng, L Chen, Y Liu, S Dou, C Huang, W Shen, S Jin, E Zhou, ...
arXiv preprint arXiv:2401.06080, 2024
14* · 2024
LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment
S Dou, E Zhou, Y Liu, S Gao, J Zhao, W Shen, Y Zhou, Z Xi, X Wang, ...
arXiv preprint arXiv:2312.09979, 2023
13* · 2023
Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback
W Shen, R Zheng, W Zhan, J Zhao, S Dou, T Gui, Q Zhang, X Huang
The 2023 Conference on Empirical Methods in Natural Language Processing, 2023
9 · 2023
Human-Instruction-Free LLM Self-Alignment with Limited Samples
H Guo, Y Yao, W Shen, J Wei, X Zhang, Z Wang, Y Liu
arXiv preprint arXiv:2401.06785, 2024
4 · 2024
Improving Generalization of Alignment with Human Preferences through Group Invariant Learning
R Zheng, W Shen, Y Hua, W Lai, S Dou, Y Zhou, Z Xi, X Wang, H Huang, ...
Twelfth International Conference on Learning Representations (ICLR 2024 …, 2023
2 · 2023
Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards
W Shen, X Zhang, Y Yao, R Zheng, H Guo, Y Liu
arXiv preprint arXiv:2403.07708, 2024
1 · 2024
Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation
X Zhang, JF Ton, W Shen, H Wang, Y Liu
arXiv preprint arXiv:2403.05171, 2024
1 · 2024
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Z Xi, W Chen, B Hong, S Jin, R Zheng, W He, Y Ding, S Liu, X Guo, ...
arXiv preprint arXiv:2402.05808, 2024
1 · 2024
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
S Dou, Y Liu, H Jia, L Xiong, E Zhou, J Shan, C Huang, W Shen, X Fan, ...
arXiv preprint arXiv:2402.01391, 2024
1 · 2024
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
S Gao, Q Ge, W Shen, S Dou, J Ye, X Wang, R Zheng, Y Zou, Z Chen, ...
arXiv preprint arXiv:2401.11458, 2024
1 · 2024