Use this URL to cite this publication: https://hdl.handle.net/20.500.12259/41902
Review of statistical modeling of highly inflected Lithuanian using very large vocabulary
Type of publication
Article in conference proceedings in Scopus database (P1a2)
Title [en]
Review of statistical modeling of highly inflected Lithuanian using very large vocabulary
Is part of
Interspeech'2005 - Eurospeech : Proceedings of the 9th European conference on speech communication and technology, Lisbon, Portugal, September 4-8, 2005. Lisbon, Portugal : ISCA, 2005, [no. 9]
Date Issued
2005
Publisher
Lisbon, Portugal : ISCA
Extent
p. 1321-1324
Abstract (en)
This paper presents state-of-the-art language modeling (LM) of Lithuanian, a highly inflected, free-word-order language. Perplexities and word error rates (WER) of standard n-gram, class-based, cache-based, topic mixture, and morphological LMs were estimated and compared for a vocabulary of more than 1 million words. WER estimates were obtained by solving a speaker-dependent ASR task in which the LMs were used to rescore acoustic hypotheses. LM perplexity appeared to be uncorrelated with WER. Cache-based language models yielded the greatest perplexity improvement, while class-based language models achieved the greatest, though insignificant, WER improvement over the baseline 3-gram.
Type of document
type::text::journal::journal article::research article
Language
English (en)
Coverage Spatial
Portugal (PT)
ISSN (of the container)
1018-4074
Other Identifier(s)
VDU02-000006650