Bowen Shi
Facebook AI Research
Verified email at meta.com
Title
Cited by
Year
Learning audio-visual speech representation by masked multimodal cluster prediction
B Shi, WN Hsu, K Lakhotia, A Mohamed
arXiv preprint arXiv:2201.02184, 2022
Cited by: 303, Year: 2022
Scaling speech technology to 1,000+ languages
V Pratap, A Tjandra, B Shi, P Tomasello, A Babu, S Kundu, A Elkahky, ...
Journal of Machine Learning Research 25 (97), 1-52, 2024
Cited by: 255, Year: 2024
Voicebox: Text-guided multilingual universal speech generation at scale
M Le, A Vyas, B Shi, B Karrer, L Sari, R Moritz, M Williamson, V Manohar, ...
Advances in neural information processing systems 36, 2024
Cited by: 214, Year: 2024
Scaling autoregressive multi-modal models: Pretraining and instruction tuning
L Yu, B Shi, R Pasunuru, B Muller, O Golovneva, T Wang, A Babu, B Tang, ...
arXiv preprint arXiv:2309.02591, 2023
Cited by: 122, Year: 2023
Robust self-supervised audio-visual speech recognition
B Shi, WN Hsu, A Mohamed
arXiv preprint arXiv:2201.01763, 2022
Cited by: 120, Year: 2022
Comparative layer-wise analysis of self-supervised speech models
A Pasad, B Shi, K Livescu
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 100, Year: 2023
American Sign Language fingerspelling recognition in the wild
B Shi, AM Del Rio, J Keane, J Michaux, D Brentari, G Shakhnarovich, ...
2018 IEEE Spoken Language Technology Workshop (SLT), 145-152, 2018
Cited by: 92, Year: 2018
Offloading guidelines for augmented reality applications on wearable devices
B Shi, J Yang, Z Huang, P Hui
Proceedings of the 23rd ACM international conference on Multimedia, 1271-1274, 2015
Cited by: 89, Year: 2015
Fingerspelling recognition in the wild with iterative visual attention
B Shi, AM Del Rio, J Keane, D Brentari, G Shakhnarovich, K Livescu
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019
Cited by: 84, Year: 2019
Few-shot acoustic event detection via meta learning
B Shi, M Sun, KC Puvvada, CC Kao, S Matsoukas, C Wang
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 72, Year: 2020
Audiobox: Unified audio generation with natural language prompts
A Vyas, B Shi, M Le, A Tjandra, YC Wu, B Guo, J Zhang, X Zhang, ...
arXiv preprint arXiv:2312.15821, 2023
Cited by: 71, Year: 2023
Open-domain sign language translation learned from online video
B Shi, D Brentari, G Shakhnarovich, K Livescu
arXiv preprint arXiv:2205.12870, 2022
Cited by: 44, Year: 2022
A cross-task analysis of text span representations
S Toshniwal, H Shi, B Shi, L Gao, K Livescu, K Gimpel
arXiv preprint arXiv:2006.03866, 2020
Cited by: 43, Year: 2020
Expresso: A benchmark and analysis of discrete expressive speech resynthesis
TA Nguyen, WN Hsu, A d'Avirro, B Shi, I Gat, M Fazel-Zarani, T Remez, ...
arXiv preprint arXiv:2308.05725, 2023
Cited by: 40, Year: 2023
u-HuBERT: Unified mixed-modal speech pretraining and zero-shot transfer to unlabeled modality
WN Hsu, B Shi
Advances in Neural Information Processing Systems 35, 21157-21170, 2022
Cited by: 34, Year: 2022
Fingerspelling detection in American Sign Language
B Shi, D Brentari, G Shakhnarovich, K Livescu
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021
Cited by: 32, Year: 2021
Movie Gen: A cast of media foundation models
A Polyak, A Zohar, A Brown, A Tjandra, A Sinha, A Lee, A Vyas, B Shi, ...
arXiv preprint arXiv:2410.13720, 2024
Cited by: 31, Year: 2024
MuAViC: A multilingual audio-visual corpus for robust speech recognition and robust speech-to-text translation
M Anwar, B Shi, V Goswami, WN Hsu, J Pino, C Wang
arXiv preprint arXiv:2303.00628, 2023
Cited by: 30, Year: 2023
Semi-supervised acoustic event detection based on tri-training
B Shi, M Sun, CC Kao, V Rozgic, S Matsoukas, C Wang
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 24, Year: 2019
Multitask training with unlabeled data for end-to-end sign language fingerspelling recognition
B Shi, K Livescu
2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2017
Cited by: 22, Year: 2017