Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12259/50111
Type of publication: Article in Clarivate Analytics Web of Science and/or Scopus DB conference proceedings (P1a)
Field of Science: Informatics (N009)
Author(s): Daudaravičius, Vidas
Title: The influence of collocation segmentation and top 10 items to keyword assignment performance
Is part of: Computational linguistics and intelligent text processing : 11th international conference, CICLing 2010, Iasi, Romania, March 21-27, 2010 : proceedings / editor Gelbukh A. Berlin : Springer, 2010
Extent: p. 648-660
Date: 2010
Series/Report no.: (Lecture Notes in Computer Science. Vol. 6008 0302-9743)
Keywords: Collocation segmentation;Top 10 items;Multilinguality;Keyword assignment;Stop-word list
ISBN: 9783642121159
Abstract: Automatic document annotation from a controlled conceptual thesaurus is useful for establishing precise links between similar documents. This study presents a language-independent document annotation system based on features derived from a novel collocation segmentation method. Using the multilingual conceptual thesaurus EuroVoc, we evaluate filtered and unfiltered versions of the method, comparing them against other language-independent methods based on single words and bigrams. Testing our new method against the manually tagged multilingual corpus Acquis Communautaire 3.0 (AC), using all descriptors found there, we attain improvements in keyword assignment precision from 18 to 29 percent and in F-measure from 17.2 to 27.6 for 5 keywords assigned to a document. Further filtering out of the top 10 frequent items improves precision by 4 percent, and collocation segmentation improves precision by 9 percent, on average over the 21 languages tested.
Internet: https://hdl.handle.net/20.500.12259/50111
Affiliation(s): Informatikos fakultetas
Sistemų analizės katedra
Vytauto Didžiojo universitetas
Appears in Collections: University Research Publications
