Yuanzhi Li
Assistant Professor at CMU
Verified email at andrew.cmu.edu - Homepage
Title
Cited by
Year
LoRA: Low-rank adaptation of large language models
EJ Hu, Y Shen, P Wallis, Z Allen-Zhu, Y Li, S Wang, L Wang, W Chen
arXiv preprint arXiv:2106.09685, 2021
4360 · 2021
Sparks of artificial general intelligence: Early experiments with GPT-4
S Bubeck, V Chandrasekaran, R Eldan, J Gehrke, E Horvitz, E Kamar, ...
arXiv preprint arXiv:2303.12712, 2023
2212 · 2023
A convergence theory for deep learning via over-parameterization
Z Allen-Zhu, Y Li, Z Song
International conference on machine learning, 242-252, 2019
1502 · 2019
Learning and generalization in overparameterized neural networks, going beyond two layers
Z Allen-Zhu, Y Li, Y Liang
Advances in neural information processing systems 32, 2019
821 · 2019
Convergence analysis of two-layer neural networks with ReLU activation
Y Li, Y Yuan
Advances in neural information processing systems 30, 2017
732 · 2017
Learning overparameterized neural networks via stochastic gradient descent on structured data
Y Li, Y Liang
Advances in neural information processing systems 31, 2018
690 · 2018
A theoretical analysis of NDCG type ranking measures
Y Wang, L Wang, Y Li, D He, TY Liu
Conference on learning theory, 25-54, 2013
677 · 2013
A latent variable model approach to PMI-based word embeddings
S Arora, Y Li, Y Liang, T Ma, A Risteski
Transactions of the Association for Computational Linguistics 4, 385-399, 2016
655 · 2016
Towards understanding ensemble, knowledge distillation and self-distillation in deep learning
Z Allen-Zhu, Y Li
arXiv preprint arXiv:2012.09816, 2020
341 · 2020
Algorithmic regularization in over-parameterized matrix sensing and neural networks with quadratic activations
Y Li, T Ma, H Zhang
Conference on Learning Theory, 2-47, 2018
330 · 2018
An alternative view: When does SGD escape local minima?
B Kleinberg, Y Li, Y Yuan
International conference on machine learning, 2698-2707, 2018
329 · 2018
Towards explaining the regularization effect of initial large learning rate in training neural networks
Y Li, C Wei, T Ma
Advances in neural information processing systems 32, 2019
306 · 2019
Linear algebraic structure of word senses, with applications to polysemy
S Arora, Y Li, Y Liang, T Ma, A Risteski
Transactions of the Association for Computational Linguistics 6, 483-495, 2018
251 · 2018
Algorithmic framework for model-based deep reinforcement learning with theoretical guarantees
Y Luo, H Xu, Y Li, Y Tian, T Darrell, T Ma
arXiv preprint arXiv:1807.03858, 2018
241 · 2018
Textbooks are all you need
S Gunasekar, Y Zhang, J Aneja, CCT Mendes, A Del Giorno, S Gopi, ...
arXiv preprint arXiv:2306.11644, 2023
219 · 2023
What can ResNet learn efficiently, going beyond kernels?
Z Allen-Zhu, Y Li
Advances in Neural Information Processing Systems 32, 2019
211 · 2019
Gradient descent on neural networks typically occurs at the edge of stability
J Cohen, S Kaur, Y Li, JZ Kolter, A Talwalkar
International Conference on Learning Representations, 2020
197 · 2020
On the convergence rate of training recurrent neural networks
Z Allen-Zhu, Y Li, Z Song
Advances in neural information processing systems 32, 2019
192 · 2019
Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions
S Chen, S Chewi, J Li, Y Li, A Salim, AR Zhang
arXiv preprint arXiv:2209.11215, 2022
164 · 2022
Neon2: Finding local minima via first-order oracles
Z Allen-Zhu, Y Li
Advances in Neural Information Processing Systems 31, 2018
151 · 2018
Articles 1–20