
Information Retrieval

Description

The course covers the theory of information modelling and text-based retrieval, together with its applications. It discusses classical and modern knowledge modelling techniques and their use in information retrieval, as well as the latest trends, including new data mining methods and applications. The course begins with basic notions and concepts and ends with current challenges. The knowledge gained allows students to critically evaluate existing search engine models, use them systematically and purposefully, and, after analysis, create their own working prototypes.

Aim of the course

Learn how to build information retrieval engines. Acquire knowledge of information modelling processes and principles, and learn to apply them in the implementation of existing information retrieval systems.

Prerequisites

Basic knowledge of statistics, set theory, logic, algebra, and software development.

Course content

1. Introduction to advanced information retrieval; 2. Advanced methods of text analysis for information retrieval; 3. Distributed indexing of big data; 4. Extensions of the Boolean (logical) and vector space IR models; 5. Probabilistic IR model; 6. Latent semantic and other IR models; 7. User behavior analysis and advanced methods of quality evaluation; 8. Web search: crawlers, processing, HITS, PageRank; 9. Clustering: flat, hierarchical, big data clustering, feature construction and selection; 10. Classification in IR: Naïve Bayes, SVM; 11. Focused search; 12. The impact of large language models on future search engines.

Assessment Criteria

1. Students demonstrate knowledge of IR models and language technologies and can choose the most suitable methods to solve a given problem. 2. Students demonstrate knowledge of different IR interfaces and user types and can design relevant systems. 3. Students know where to find the latest IR research results, can assess their maturity, and know how to combine, extend, and apply them. 4. Students demonstrate knowledge of different IR models and data/information analysis methods and can apply them to multimodal content. 5. Students demonstrate knowledge of formal IR models and their evaluation and can choose the right methods for IR systems. 6. Students demonstrate knowledge of distributed data processing methods and algorithms and can apply them in practice. 7. Students demonstrate knowledge of multiplatform and cloud computing services and can apply them to the development of IR solutions.