Use this URL to cite this publication: https://hdl.handle.net/20.500.12259/34331
Language identification for Lithuanian, Russian and Azeri languages
Type of publication
Article in peer-reviewed Lithuanian conference proceedings (P1f)
Author(s)
Author | Affiliation | |||
---|---|---|---|---|
LT | ||||
Title [en]
Language identification for Lithuanian, Russian and Azeri languages
Is part of
Informacinės technologijos : 19-oji tarpuniversitetinė tarptautinė magistrantų ir doktorantų konferencija "Informacinė visuomenė ir universitetinės studijos" (IVUS 2014) : konferencijos pranešimų medžiaga. Kaunas : Technologija, 2014, 19
Date Issued
2014
Publisher
Kaunas : Technologija
Extent
p. 167-171
Abstract (en)
Language identification is an important part of natural language processing: most techniques are language-sensitive, so in multi-language systems the language should be identified before any further processing steps. Techniques and tools for the more widely used languages are well defined and available in commercial and open-source tools, but this is not the case for less widely used languages. In this work we investigate techniques for the Lithuanian, Russian and Azeri (Azerbaijani) languages. Corpora for these and similar languages (Latvian, Ukrainian, Belarusian, Turkish and Turkmen (Turkman)) were collected and prepared. Selected approaches were trained. Results were evaluated using precision, recall and F-score.
Type of document
type::text::journal::journal article::research article
Language
English (en)
Coverage Spatial
Lithuania (LT)
ISSN (of the container)
2029-249X
Other Identifier(s)
VDU02-000017044
Access Rights
Open Access