Use this URL to cite this publication: https://hdl.handle.net/20.500.12259/59509
Publication type
type::text::periodical::journal::contribution to journal::journal article::research article
Type of publication (PDB)
Article in Clarivate Analytics Web of Science and/or Scopus DB conference proceedings (P1a)
Title
Hybrid approach for automatic identification of Multi-Word Expressions in Lithuanian
Is part of
Human language technologies - the Baltic perspective : proceedings of the 7th international conference, Baltic HLT 2016, Riga / editors Inguna Skadiņa, Roberts Rozis. Amsterdam : IOS Press, 2016
Extent
p. 153-159
Publisher
Amsterdam : IOS Press, 2016
Publisher (trusted)
IOS Press
Date Issued
2016
Series/Report no.
(Frontiers in artificial intelligence and applications, Vol. 289, ISSN 0922-6389)
Description
Book ISBN 978-1-61499-701-6 (online)
ISBN (of the container)
9781614997009
ISSN (of the container)
0922-6389
DOI
https://doi.org/10.3233/978-1-61499-701-6-153
WOS
WOS:000390307700020
Other Identifier(s)
VDU02-000020029
Abstract
Identification of Multi-Word Expressions (MWEs) is one of the most challenging problems in Computational Linguistics and Natural Language Processing. A number of techniques are used to solve this problem in different languages, mostly English. However, not all techniques and approaches can be directly transferred to Lithuanian. Hence, in this paper we experiment with automatic identification of bi-gram MWEs for Lithuanian, which is considered to be under-resourced in terms of lexical resources and the availability or accuracy of special lexical tools (e.g., POS-taggers, parsers). We use a raw corpus and a combination of lexical association measures and supervised machine learning, an approach that was shown to perform well for English and some other languages. Using this approach, we have reached 70.4% precision for identification of typical MWEs, 77.1% precision for non-typical MWEs, and 60.0% and 81.6% precision for typical adjective + noun and noun + noun MWEs, respectively.
Bibliographic Details
17
Coverage Spatial
NL
Language
English (en)