Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12259/36090
Type of publication: Straipsnis konferencijos medžiagoje kitose duomenų bazėse / Article in conference proceedings in other databases (P1c)
Field of Science: Informatika / Informatics (N009)
Author(s): Bumbulienė, Ieva; Mandravickaitė, Justina; Boizou, Loic; Krilavičius, Tomas
Title: An overview of Lithuanian internet media n-gram corpus
Is part of: CEUR workshop proceedings [electronic resource]: SYSTEM 2017: proceedings of the symposium for Young Scientists in Technology, Engineering and Mathematics, Kaunas, Lithuania, April 28, 2017. Aachen : CEUR-WS, 2017, Vol. 1853
Extent: p. 24-28
Date: 2017
Keywords: Internet media; Lithuanian Internet; N-gram corpus
Abstract: This paper describes the construction and properties of an open, 70-million-word n-gram corpus of Lithuanian Internet media. Copyright limitations often restrict the availability of resources based on contemporary media, whereas n-gram corpora (e.g., the Google N-gram viewer/corpus) avoid this problem. Lithuanian is an under-resourced language, so the n-gram corpus of Lithuanian media is designed to contribute to publicly available, ready-to-use lexical resources. We report the corpus construction procedure, preprocessing, corpus statistics, and possible areas of application.
Internet: https://eltalpykla.vdu.lt/handle/1/36090
http://ceur-ws.org/Vol-1853/p05.pdf
Affiliation(s): Baltijos pažangių technologijų institutas, Vilnius
Baltijos pažangiųjų technologijų institutas
Informatikos fakultetas
Taikomosios informatikos katedra
Vilniaus universitetas
Vytauto Didžiojo universitetas
Appears in Collections: 3. Konferencijų medžiaga / Conference materials
Universiteto mokslo publikacijos / University Research Publications

Files in This Item:
marc.xml (8.11 kB, XML): MARC21 XML metadata

Page view(s): 224 (checked on Mar 5, 2020)
Download(s): 108 (checked on Mar 5, 2020)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.