Use this URL to cite this researcher: https://hdl.handle.net/20.500.12259/154979
Doctor of Sciences (2000)
Professor (2014)
- The CLARIN-LT consortium is one of the leading Lithuanian language research and digital data storage infrastructures. This chapter presents outreach activities and initiatives carried out by or in cooperation with the CLARIN-LT consortium and highlights their most significant outcomes. We first highlight some of the resources stored in the CLARIN-LT repository and present their usage statistics. Next, we show a use case of scientific outreach, followed by a success story involving the cooperation of large-scale national projects and CLARIN-LT in the development of IT services for Lithuanian. Finally, we demonstrate an example of CLARIN content integration in university classes. The initiatives we overview here, although they have different aims and audiences, share one common feature: they all found a home at the CLARIN-LT repository. The presented use cases and success stories, carried out by or in cooperation with the CLARIN-LT consortium during the relatively short period since its establishment in 2015, show that the infrastructure is gaining recognition and is increasingly being used by scientific, educational, public, and private communities.
34; Scopus© Citations 4
- Dabartinės lietuvių kalbos tarties žodynas [elektroninis išteklius] (Publication) [Dictionary of the contemporary Lithuanian pronunciation]
The dictionary contains almost 150,000 stressed and transcribed words. The introductory parts briefly discuss the general features of the sound units of Lithuanian (vowels, consonants, and diphthongs) and the peculiarities of the pronunciation of some sounds in connected speech. The dictionary will be useful for everyone who is learning the correct pronunciation of Standard Lithuanian.
23
- Comparison of phonemic and graphemic word to sub-word unit mappings for Lithuanian phone-level speech transcription (Publication)
Conventional large-vocabulary automatic speech recognition (ASR) systems require a mapping from words into sub-word units to generalize over words that were absent from the training data and to enable robust estimation of acoustic model parameters. This paper surveys the research done during the last 15 years on word to sub-word mappings for Lithuanian ASR systems. It also compares various phoneme-based and grapheme-based mappings across a broad range of acoustic modelling techniques, including monophone and triphone based hidden Markov models (HMM), speaker adaptively trained HMMs, subspace Gaussian mixture models (SGMM), a feed-forward time delay neural network (TDNN), and a state-of-the-art low frame rate bidirectional long short-term memory (LFR BLSTM) recurrent deep neural network. Experimental comparisons are based on a 50-hour speech corpus. The paper shows that the best phone-based mapping significantly outperforms a grapheme-based mapping. It also shows that the lowest phone error rate of an ASR system is achieved by the phoneme-based lexicon that explicitly models syllable stress and represents diphthongs as single phonetic units.
80; WOS© Citations 1; WOS© IF 3.312; WOS© AIF 2.433; Scopus© SNIP 0.822
- Corpus-based hidden Markov modelling of the fundamental frequency of Lithuanian (Publication) [Lietuvių kalbos pagrindinio tono kaitos prognozė, naudojant paslėptųjų Markovo modelių metodiką]
This paper presents a corpus-driven approach to building a computational model of fundamental frequency, or F_0, for the Lithuanian language. The model was obtained by training the HMM-based speech synthesis system HTS on six hours of speech from multiple speakers. Several gender-specific models, using different parameters and different contextual factors, were investigated. The models were evaluated by synthesizing F_0 contours and comparing them to the original F_0 contours using two criteria: root mean square error (RMSE) and voicing classification error. The HMM-based models showed an improvement in RMSE over the mean-based model, which predicted the F_0 of a vowel on the basis of its average normalized pitch.
104; Scopus© Citations 2; WOS© Citations 3; WOS© IF 1.056; WOS© AIF 1.768; Scopus© SNIP 1.006
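The two evaluation criteria named in the abstract above (RMSE and voicing classification error between original and synthesized F_0 contours) can be illustrated with a minimal sketch. It is not the authors' evaluation code. The function name `f0_errors`, the convention that a value of 0.0 marks an unvoiced frame, the choice to compute RMSE only over frames voiced in both contours, and the sample values are all assumptions made for illustration.

```python
import math


def f0_errors(reference, synthesized):
    """Compare two frame-aligned F_0 contours given in Hz.

    Assumption: a value of 0.0 marks an unvoiced frame.
    Returns (rmse, voicing_error).
    """
    if len(reference) != len(synthesized):
        raise ValueError("contours must be frame-aligned and of equal length")
    if not reference:
        raise ValueError("contours must not be empty")

    # Voicing classification error: share of frames whose
    # voiced/unvoiced decision differs between the two contours.
    mismatches = sum((r > 0) != (s > 0) for r, s in zip(reference, synthesized))
    voicing_error = mismatches / len(reference)

    # RMSE over the frames that are voiced in both contours (assumed convention).
    squared = [(r - s) ** 2 for r, s in zip(reference, synthesized) if r > 0 and s > 0]
    rmse = math.sqrt(sum(squared) / len(squared)) if squared else float("nan")
    return rmse, voicing_error


# Hypothetical five-frame contours, invented for this example.
reference = [0.0, 120.0, 125.0, 130.0, 0.0]
synthesized = [0.0, 118.0, 127.0, 0.0, 0.0]
rmse, voicing_error = f0_errors(reference, synthesized)
print(f"RMSE: {rmse:.2f} Hz, voicing error: {voicing_error:.2f}")
```

Keeping the two measures separate means a voicing mistake is counted once, as a classification error, and does not inflate the RMSE with a difference against a zero-valued frame.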
- This paper presents a recently developed medical-pharmaceutical information system with a voice user interface. It is the first computerized system oriented towards healthcare services and industry in which Lithuanian voice commands are the primary means of control. Another essential property of the system is its hybrid nature: two different recognizers, an adapted commercial Spanish speech recognizer available from Microsoft and a locally developed HMM speech recognizer based on Lithuanian acoustic models, operate in parallel. The recognition hypotheses produced by the two recognizers are combined using logical rules obtained with decision rule induction algorithms such as Ripper. Together, these measures made it possible to achieve speaker-independent voice command recognition accuracy high enough for practical deployment. The best recognition accuracy achieved was 98.9% for 1000 Lithuanian voice commands. The paper also discusses optimization issues related to the development of the system.
99; Scopus© Citations 9; WOS© Citations 9; WOS© IF 0.561; WOS© AIF 1.798; Scopus© SNIP 0.64
- Medical – pharmaceutical information system with recognition of Lithuanian voice commands (Publication, research article)
[…]; Rudžionis, Vytautas; […]; Ratkevičius, Kastytis; Rudžionis, Algimantas; Bartišiūtė, Gintarė
Human language technologies - the Baltic perspective : proceedings of the 6th international conference, Baltic HLT 2014. Amsterdam : IOS Press, 2014, p. 40-45
This paper presents a Lithuanian voice recognition system for medical-pharmaceutical terms. The system consists of two separate speech recognition modules working in parallel. One is a proprietary CD-HMM Lithuanian speech recognizer. The other is a Spanish speech recognizer adapted to recognize Lithuanian voice commands. The outputs of both recognizers are combined by a decision-making block that yields the final decision. The decision-making block was derived automatically by an induction algorithm that learns a set of symbolic rules. The investigations showed that the two recognizers produce uncorrelated outputs and can complement each other. They also showed that the Lithuanian speech recognizer achieves higher accuracy (over 96 percent in speaker-independent mode), but that adding the adapted foreign-language recognizer raises this baseline accuracy further (over 98 percent in speaker-independent mode for 1000 voice commands). The voice recognition system is being embedded into several medical information systems that will be used by healthcare practitioners.
111; Scopus© Citations 2; WOS© Citations 1; Scopus© SNIP 0.573
- research article
NODALIDA 2013 : proceedings of the 19th Nordic conference of computational linguistics, May 22–24, 2013, Oslo University, Norway / eds. Stephan Oepen, Kristin Hagen, Janne Bondi Johannessen. Linköping : Linköping University Electronic Press, 2013, p. 353-363
This paper presents our research in preparation for compiling a Lithuanian intonation corpus. The main objective was to discover characteristic patterns of Lithuanian intonation by clustering the pitch contours of intermediate intonation phrases. The paper covers the set of procedures used to extend an ordinary speech corpus to make it suitable for intonation analysis. The process of intonation analysis included pitch extraction, pitch normalization, estimation of the representative frequency of a syllable, measurement of inter-phrase similarity, k-means phrase clustering, and visualisation of the clustering results. These computational procedures were applied to 23 hours of read speech containing 41417 phrases. The clustering results revealed some interesting intonation patterns of Lithuanian that could be related to well-known linguistic-prosodic phenomena. Language independence is an important feature of the computational procedures covered by this paper: given speech waveforms and knowledge of phone and phrase boundaries, these procedures can be used to analyse the intonation of other languages.
30 102
- Comparative analysis of adapted foreign language and native Lithuanian speech recognizers for voice user interface (Publication)
This paper presents research results obtained while building a speaker-independent hybrid speech recognizer. This recognizer will be integrated as a phrase recognizer in a medical-pharmaceutical information system. The hybrid speech recognizer consists of two recognition components: an adapted commercial Microsoft Spanish speech recognizer and a locally developed recognizer based on hidden Markov models that implements Lithuanian acoustic models.
The efficiency of both recognition components was evaluated on multiple speaker-independent speech recognition tasks. The Lithuanian recognizer was more accurate on average, reaching a 0.6% phrase error rate for user requests in the medical-pharmaceutical domain. The adapted commercial Spanish speech recognizer was able to improve the accuracy of the Lithuanian recognizer in the worst recognition scenarios. These results confirmed the hypothesis behind the hybrid recognition approach: recognition errors from different recognizers built using different techniques are not strongly correlated. This fact could be exploited to improve overall speech recognition accuracy.
89; WOS© IF 0.445; WOS© AIF 1.812; Scopus© SNIP 0.65
- Intonuoto garsyno kūrimo principai (Publication) [Principles of development of the intonational annotated spoken corpus]
Language manifests itself in both spoken and written forms. Spoken and written forms differ in many linguistic respects and in the methods and tools by which they are acquired and analyzed. Intonation is one of the most important phenomena of spoken language. It comprises the segmentation of speech into meaningful units, emphasis of key words, fluctuation in speech tempo, and expression of emotions. Intonation is poorly represented by word orthography and thus poses many problems for intonation researchers. A specially prepared intonational speech corpus is a prerequisite for any serious intonation research. The process of compiling a speech corpus can be divided into a few steps: a) acquisition and recording of prosody-rich utterances of spoken language; b) description of the content of these utterances and utterance markup with tags that describe prosodic features; c) automatic assignment of timings to prosodic features (as a result of phone-level annotation of the intonational speech corpus). Every step in this process requires certain procedures to be observed and certain requirements to be met.
Linguists around the world have built more than one intonational corpus, and some common methodologies have been developed, which allows intonational features of different languages to be compared. This paper describes and discusses the process of building an intonationally annotated speech corpus of Lithuanian: tagging and labeling linguistic and extra-linguistic phenomena, cliticization, markup of phrase and sentence boundaries, determining and labeling logical stress, and markup of fundamental frequency.
16 54
- The Lithuanian morphemics database consists of research material comprising about 310 thousand words of written and spoken language. The material is composed of text excerpts in different styles, as varied in subject matter and as similar in size as possible, taken from scientific, journalistic, and fiction works, along with somewhat smaller samples of administrative language. Fragments of an experimental spoken-language database were also included. All texts in the database are morphologically annotated (i.e., parts of speech and their characteristic grammatical tags are identified), and all words are segmented into morphemes. In the version of the database available online, boundaries between morphemes are marked with hyphens. For every word in the database the following is given: the lemma, i.e., the headword (dictionary) form; grammatical information, i.e., part of speech, gender, number, tense, person, etc.; and frequency. This is a novel study of Lithuanian, because until now attention has been paid either to grammatical relations between words or to word formation, while morphotactics (the regularities of morpheme arrangement within a word) has not been examined in detail.