Big bird: Transformers for longer sequences M Zaheer, G Guruganesh, A Dubey, J Ainslie, C Alberti, S Ontanon, ... arXiv preprint arXiv:2007.14062, 2020 | 49 | 2020 |
Encoding long and structured data in transformers J Ainslie, S Ontanon, C Alberti, P Pham, A Ravula, S Sanghai arXiv preprint arXiv:2004.08483, 2020 | 8 | 2020 |
ETC: Encoding long and structured data in transformers J Ainslie, S Ontanon, C Alberti, P Pham, A Ravula, S Sanghai arXiv preprint arXiv:2004.08483, 2020 | 2 | 2020 |
ETC: Encoding Long and Structured Inputs in Transformers A Ravula, C Alberti, J Ainslie, L Yang, PM Pham, Q Wang, S Ontanon, ... | 2 | 2020 |
RealFormer: Transformer Likes Residual Attention R He, A Ravula, B Kanagal, J Ainslie arXiv e-prints, arXiv:2012.11747, 2020 | 1 | 2020 |
ETC: Encoding Long and Structured Inputs in Transformers J Ainslie, S Ontanon, C Alberti, V Cvicek, Z Fisher, P Pham, A Ravula, ... Proceedings of the 2020 Conference on Empirical Methods in Natural Language …, 2020 | 1 | 2020 |
Informer: Transformer Likes Informed Attention R He, A Ravula, B Kanagal, J Ainslie arXiv preprint arXiv:2012.11747, 2020 | | 2020 |
Big Bird: Transformers for Longer Sequences M Zaheer, G Guruganesh, A Dubey, J Ainslie, C Alberti, S Ontanon, ... | | |