Vytautas Magnus University Research Management System (VDU CRIS)





Use this URL to cite the project: https://hdl.handle.net/20.500.12259/155414
  • Item type: Dataset
    Lietuvių kalbos mokinių tekstynas
    [Mokinių tekstynas, MOKATE]
    dataset[2021][K5][H004,N009][1274]
    Kaunas : Vytauto Didžiojo universitetas, 2021

    The Learner Corpus consists of four collections of texts written by non-native speakers, compiled in cooperation with centres of Baltic studies. Each part of the corpus illustrates a different level of proficiency in Lithuanian. Learners of a foreign language need to know not only how native speakers use the language, but also which categories non-native speakers find harder or easier to acquire. Such a resource is also valuable for language teachers, because corpora allow language teaching to be grounded in data: for example, typical non-native usage can be discussed with students in class or used as material for preparing exercises of various kinds and for compiling basic word lists. Users will be able to analyse and browse the Learner Corpus online or download some of the data (e.g. concordance lines, word frequency lists) to their own computers. DOI: https://kalbu.vdu.lt/mokymosi-priemones/mokiniu-tekstynas/

  • Item type: Dataset
    Pedagogic corpus of Lithuanian
    [Mokomasis lietuvių kalbos tekstynas]
    dataset[2022][S007,H004,N009]
    Aleksandravičiūtė, Gabrielė
    Virbickienė, Gabrielė
    Vytauto Didžiojo universitetas / Vytautas Magnus University, 2022-08-29

    The Pedagogic Corpus of Lithuanian is a monolingual specialized corpus prepared for learning and teaching Lithuanian in a foreign language classroom. It includes authentic Lithuanian texts, selected by criteria such as learner-relevant communicative function and genre. Both spoken and written language are represented. The corpus contains 669,000 tokens: 111,000 tokens from texts and spoken language for levels A1-A2 and 558,000 tokens from texts and spoken language for levels B1-B2 (according to the Common European Framework of Reference for Languages). The spoken component constitutes approximately 7.5% of the corpus. The written subpart (620,000 tokens) includes levelled texts from coursebooks and unlevelled texts from other sources. These texts can be classified into 29 text types (dialogues, narratives, information, etc.) and 4 groups according to communicative aim: informational texts, educational texts, advertising and fiction.

    The corpus offers two types of search: simple and advanced (see "Search Tips"). Simple Search finds instances of a search item (word form, lemma, two words) in the whole corpus or in a particular part of it (spoken or written texts). After selecting the written subcorpus, you can further select the text type (coursebook or non-coursebook texts) and/or the genre of the written texts. Advanced Search offers all the features of Simple Search plus additional options: since the Pedagogic Corpus is morphologically annotated, you can also search by grammatical features (e.g. part of speech, case, number, verb form).

    Truncated wordlists are available at https://kalbu.vdu.lt/mokymosi-priemones/mokomasis-tekstynas/: lists of lemmas and word forms (for the whole corpus, for the spoken and written components, and for each level) and lists for particular parts of speech in the whole corpus. The lists can be downloaded as .xlsx files.

    REFERENCE: Kovalevskaitė, Jolanta and Rimkutė, Erika. "Pedagogic Corpus of Lithuanian: A New Resource for Learning and Teaching Lithuanian as a Foreign Language." Sustainable Multilingualism, vol. 17, no. 1, 2020, pp. 197-230. https://doi.org/10.2478/sm-2020-0019
