Annotation of cybersecurity terminology: methodology, problems and results

Rackevičienė, Sigita; Utka, Andrius; Bielinskienė, Agnė; Rokas, Aivaras

Use this url to cite publication: https://hdl.handle.net/20.500.12259/145417

Type of publication

Conference theses in non-peer-reviewed publication (T2)

Author(s)

Author	Affiliation
Rackevičienė, Sigita

Title

Annotation of cybersecurity terminology: methodology, problems and results

[en]

Is part of

Moksliniai, administraciniai ir edukaciniai terminologijos lygmenys = Scientific, administrative and educational dimensions of terminology: 4th international scientific conference on terminology, 21–22 October 2021, Vilnius: abstracts. Vilnius: Lietuvių kalbos institutas, 2021

Date Issued

2021

Publisher

Vilnius: Lietuvių kalbos institutas, 2021

Extent

p. 28-29

URI

http://lki.lt/wp-content/uploads/2021/10/Terminologijos-konferencijos-tezes_2021.pdf
https://hdl.handle.net/20.500.12259/145417

Field of Science

Abstract (en)

Currently, most terminology extraction projects rely on deep learning systems, whose development depends on large amounts of text and training data. The training data are obtained by manually annotating the terminology used in domain-specific texts, a task usually performed by terminology researchers in cooperation with domain experts. This presentation describes the monolingual and bilingual terminology annotation methodology used to annotate terms of the cybersecurity (CS) domain, the problems encountered during annotation, and the initial results. Dedicated software, QuickTag, was developed for the annotation work. It provides a toolkit for annotating terms and appellations in monolingual texts and bilingual parallel texts, and its functionality allows various types of metadata to be added about lexical units used in coherent texts. First, the main annotation function allows terms and appellations to be tagged with predefined tags indicating their conceptual characteristics: terms of the CS domain, terms related to the CS domain, and appellations of the CS domain. Appellations can additionally be tagged with their semantic class according to the nature of the referent (documents, institutions, software, etc.). Second, QuickTag allows metadata to be added about certain usage- and formation-related features of the tagged lexical units; e.g. an annotator can indicate the full form of a tagged abbreviated term, or specify the term's formation type or origin. [...]
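
The annotation scheme outlined in the abstract can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the abstract does not describe QuickTag's internal data model or file format, so every name here (`TermTag`, `SemanticClass`, `Annotation`, `annotate`) is hypothetical, and the example sentence is invented for illustration. It shows the three conceptual tags, the optional semantic class for appellations, and the optional usage- and formation-related metadata.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TermTag(Enum):
    """The three conceptual tags named in the abstract."""
    CS_TERM = "term of the CS domain"
    CS_RELATED_TERM = "term related to the CS domain"
    CS_APPELLATION = "appellation of the CS domain"


class SemanticClass(Enum):
    """Appellation classes listed in the abstract (the list there ends with 'etc.')."""
    DOCUMENT = "document"
    INSTITUTION = "institution"
    SOFTWARE = "software"


@dataclass
class Annotation:
    """Hypothetical record for one tagged lexical unit; not QuickTag's actual format."""
    start: int                       # character offset where the unit begins
    end: int                         # character offset just past the unit
    text: str                        # surface form as it appears in the text
    tag: TermTag
    semantic_class: Optional[SemanticClass] = None  # appellations only
    full_form: Optional[str] = None                 # for abbreviated terms
    formation_type: Optional[str] = None
    origin: Optional[str] = None

    def __post_init__(self) -> None:
        # The abstract attaches semantic classes to appellations only.
        if self.semantic_class is not None and self.tag is not TermTag.CS_APPELLATION:
            raise ValueError("semantic classes apply only to appellations")


def annotate(sentence: str, surface: str, tag: TermTag, **metadata) -> Annotation:
    """Tag the first occurrence of `surface` in `sentence` (assumed helper)."""
    start = sentence.find(surface)
    if start == -1:
        raise ValueError(f"{surface!r} does not occur in the sentence")
    return Annotation(start, start + len(surface), surface, tag, **metadata)


# Invented example sentence, used only to exercise the sketch.
sentence = "The DDoS attack was reported to ENISA."
abbreviation = annotate(sentence, "DDoS", TermTag.CS_TERM,
                        full_form="distributed denial-of-service")
institution = annotate(sentence, "ENISA", TermTag.CS_APPELLATION,
                       semantic_class=SemanticClass.INSTITUTION)
```

Storing character offsets alongside the surface form is one common way to keep annotations tied to running text; whether QuickTag does this is not stated in the abstract.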

Type of document

type::text::conference output::conference proceedings::conference paper

Language

English (en)

Coverage Spatial

Lithuania (LT)

Other Identifier(s)

VDU02-000068457

Project(s)

Bilingual automatic term recognition (Dvikalbis automatinis terminų atpažinimas)

European network for Web-centred linguistic data science

Kompiuterinės lingvistikos centras / Centre of Computational Linguistics