Text documents clustering

Ciganaitė, Greta; Mackutė-Varoneckienė, Aušra; Krilavičius, Tomas

Use this URL to cite this publication: https://hdl.handle.net/20.500.12259/34329

Type of publication

Straipsnis recenzuojamoje Lietuvos konferencijos medžiagoje / Article in peer-reviewed Lithuanian conference proceedings (P1f)

Author(s)

Author	Affiliation	Country
Ciganaitė, Greta	Informatikos fakultetas / Faculty of Informatics	LT
Mackutė-Varoneckienė, Aušra	Taikomosios informatikos katedra / Department of Applied Informatics	LT
Krilavičius, Tomas	Taikomosios informatikos katedra / Department of Applied Informatics	LT	Baltijos pažangių technologijų institutas, Vilnius	LT

Title

Text documents clustering

Is part of

Informacinės technologijos : 19-oji tarpuniversitetinė tarptautinė magistrantų ir doktorantų konferencija "Informacinė visuomenė ir universitetinės studijos" (IVUS 2014) : konferencijos pranešimų medžiaga. Kaunas : Technologija, 19 (2014)

Date Issued

2014

Publisher

Kaunas : Technologija

Extent

p. 90-93

URI

https://eltalpykla.vdu.lt/1/34329
https://hdl.handle.net/20.500.12259/34329

Abstract (en)

Large amounts of textual information are generated every day, and existing techniques can hardly cope with such an information flow. Users nevertheless expect fast and exact information management and retrieval tools. Clustering is a well-known technique for grouping similar data, thereby making it more manageable and usable. Text clustering is an adaptation of clustering to a very specific kind of data: documents. However, it is not directly transferable to any language; the specifics of a language influence performance considerably, as results for English and other well-investigated languages show. In this paper we apply different distances and clustering approaches to Lithuanian data, discuss the results, and provide recommendations for clustering documents in Lithuanian.

Type of document

type::text::journal::journal article::research article

Language

Anglų / English (en)

Coverage Spatial

Lietuva / Lithuania (LT)

File(s)

ISSN2029-249X_2014.PG_90-93.pdf (360.39 KB)

ISSN (of the container)

2029-249X

Other Identifier(s)

VDU02-000017071

Access Rights

Atviroji prieiga / Open Access

Taikomosios informatikos katedra / Department of Applied Informatics

Informatikos fakultetas / Faculty of Informatics

Vytauto Didžiojo universitetas / Vytautas Magnus University