Pedro Ortiz Suarez
PhD Student, Inria, Sorbonne Université
Verified email at inria.fr
Title
Cited by
Year
CamemBERT: a Tasty French Language Model
L Martin, B Muller, PJ Ortiz Suárez, Y Dupont, L Romary, ...
Proceedings of the 58th Annual Meeting of the Association for Computational …, 2020
Cited by 354 · 2020
Asynchronous Pipeline for Processing Huge Corpora on Medium to Low Resource Infrastructures
PJ Ortiz Suárez, B Sagot, L Romary
7th Workshop on the Challenges in the Management of Large Corpora, 2019
Cited by 120* · 2019
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
PJ Ortiz Suárez, L Romary, B Sagot
Proceedings of the 58th Annual Meeting of the Association for Computational …, 2020
Cited by 66* · 2020
Quality at a glance: An audit of web-crawled multilingual datasets
I Caswell, J Kreutzer, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, ...
arXiv preprint arXiv:2103.12028, 2021
Cited by 24* · 2021
Building a user-generated content north-african arabizi treebank: Tackling hell
D Seddah, F Essaidi, A Fethi, M Futeral, B Muller, PJ Ortiz Suárez, ...
Proceedings of the 58th Annual Meeting of the Association for Computational …, 2020
Cited by 18 · 2020
Establishing a New State-of-the-Art for French Named Entity Recognition
PJ Ortiz Suárez, Y Dupont, B Muller, L Romary, B Sagot
Proceedings of The 12th Language Resources and Evaluation Conference, 4631–4638, 2020
Cited by 9* · 2020
Les modèles de langue contextuels Camembert pour le français: impact de la taille et de l'hétérogénéité des données d'entrainement
L Martin, B Muller, PJ Ortiz Suárez, Y Dupont, L Romary, E Clergerie, ...
Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP …, 2020
Cited by 4 · 2020
Ungoliant: An optimized pipeline for the generation of a very large-scale multilingual web corpus
J Abadji, PJ Ortiz Suárez, L Romary, B Sagot
Proceedings of the Workshop on Challenges in the Management of Large Corpora …, 2021
Cited by 1 · 2021
SinNer@CLEF-HIPE2020: Sinful Adaptation of SotA models for Named Entity Recognition in Historical French and German Newspapers
PJ Ortiz Suárez, Y Dupont, G Lejeune, T Tian
CLEF 2020 Working Notes 2696, 2020
Cited by 1* · 2020
How OCR Performance can Impact on the Automatic Extraction of Dictionary Content Structures
M Khemakhem, I Galleron, G Williams, L Romary, PJ Ortiz Suárez
Cited by 1 · 2019
Towards a Cleaner Document-Oriented Multilingual Crawled Corpus
J Abadji, P Ortiz Suarez, L Romary, B Sagot
arXiv preprint arXiv:2201.06642, 2022
2022
Expanding the content model of annotationBlock
A Bartz, J Janes, L Romary, P Gambette, R Bawden, PJO Suárez, B Sagot, ...
Next Gen TEI, 2021-TEI Conference and Members’ Meeting, 2021
2021
A dataset for automatic detection of places in (early) modern French texts
S Gabay, P Ortiz Suarez
NASSCFL 2021-50th Annual North American Society for Seventeenth-Century …, 2021
2021
French Contextualized Word-Embeddings with a sip of CaBeRnet: a New French Balanced Reference Corpus
M Popa-Fabre, PJ Ortiz Suárez, B Sagot, ÉV de la Clergerie
Proceedings of the 8th Workshop on Challenges in the Management of Large …, 2020
2020
Preparing the Dictionnaire Universel for Automatic Enrichment
PJ Ortiz Suárez, L Romary, B Sagot
10th International Conference on Historical Lexicography and Lexicology (ICHLL), 2019
2019