Language models on a diet: cost-efficient development of encoders for closely-related languages via additional pretraining

Nikola Ljubešić, Vít Suchomel, Peter Rupnik, Taja Kuzman, Rik van Noord, 2024

Gos 2: a new reference corpus of spoken Slovenian

Darinka Verdonik, Kaja Dobrovoljc, Tomaž Erjavec, Nikola Ljubešić, 2024

Do Language Models Care about Text Quality? Evaluating Web-Crawled Corpora across 11 Languages

Rik van Noord, Taja Kuzman, Peter Rupnik, Nikola Ljubešić, Miquel Esplà-Gomis, Gema Ramírez-Sánchez, Antonio Toral, 2024

Geographic Adaptation of Pretrained Language Models

Valentin Hofmann, Goran Glavaš, Nikola Ljubešić, Janet B. Pierrehumbert, Hinrich Schütze, 2024

Can cross-domain term extraction benefit from cross-lingual transfer and nested term labeling?

Hanh Thi Hong Tran, Matej Martinc, Andraz Repar, Nikola Ljubešić, Antoine Doucet and Senja Pollak, 2024