Module Catalogue

Information Retrieval and Data Mining (COMP0084)

Key information

Faculty
Faculty of Engineering Sciences
Teaching department
Computer Science
Credit value
15
Restrictions
Module delivery for UG Masters (FHEQ Level 7) available on MEng Computer Science; MEng Mathematical Computation. Module delivery for PGT (FHEQ Level 7) available on MSc Artificial Intelligence and Data Engineering; MSc Artificial Intelligence for Biomedicine and Healthcare; MSc Artificial Intelligence for Sustainable Development; MSc Computational Statistics and Machine Learning; MSc Data Science and Machine Learning; MSc Machine Learning; MSc Software Systems Engineering; MSc Data Science; MSc Scientific and Data Intensive Computing; MRes Artificial Intelligence Enabled Healthcare.

Alternative credit options

There are no alternative credit options available for this module.

Description

Aims:

The module provides an entry-level study of information retrieval and data mining techniques. It covers how to find relevant information and then extract meaningful patterns from it. While the basic theories and mathematical models of information retrieval and data mining are covered, the module focuses primarily on practical algorithms for textual document indexing, relevance ranking, web usage mining, and text analytics, as well as their performance evaluation.

Intended learning outcomes:

On successful completion of the module, a student will be able to:

  1. Understand the common algorithms and techniques for information retrieval (document indexing and retrieval, query processing, etc.).
  2. Understand the quantitative evaluation methods for IR systems and data mining techniques.
  3. Understand the popular probabilistic retrieval methods and ranking principles.
  4. Understand the techniques and algorithms used in practical retrieval and data mining systems, such as those in web search engines and recommender systems, including the recently popular topic of deep learning.
  5. Understand basic algorithms that can be used to make predictions from data.

Indicative content:

The following are indicative of the topics the module will typically cover:

Overview of the fields:

  • Study some basic concepts of information retrieval and data mining, such as the concept of relevance, association rules, and knowledge discovery. Understand the conceptual models of an information retrieval and knowledge discovery system.

Indexing and Text Processing:

  • Introduce various indexing and text processing techniques for textual information items, such as inverted indices, tokenization, stemming and stop-word removal. Techniques used for text compression, such as the Lempel-Ziv algorithm and Huffman coding, will also be covered.
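
As an illustration of the indexing topics listed above, the following is a minimal Python sketch of an inverted index with tokenization and stop-word removal. It is not taken from the module materials: the stop-word list, the function names, and the sample documents are all hypothetical, and stemming is deliberately omitted to keep the example short.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each non-stop-word term to the sorted list of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            if term not in STOP_WORDS:
                index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def boolean_and(index, query):
    """Return the ids of documents that contain every non-stop-word query term."""
    terms = [t for t in tokenize(query) if t not in STOP_WORDS]
    if not terms:
        return []
    postings = [set(index.get(t, [])) for t in terms]
    return sorted(set.intersection(*postings))


# Made-up sample collection.
docs = {
    1: "The retrieval of relevant documents",
    2: "Data mining and pattern discovery",
    3: "Mining relevant patterns to aid retrieval",
}
index = build_inverted_index(docs)
```

Because no stemmer is applied, "pattern" and "patterns" are indexed as separate terms here; a stemming step would normally merge them.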

Retrieval Methods:

  • Study popular retrieval models (Boolean, vector space, binary independence, language modelling) and the probability ranking principle. Other commonly used techniques such as relevance feedback, pseudo-relevance feedback, and query expansion will also be covered.

Measurements:

  • Online and offline techniques for evaluating retrieval quality, including commonly used metrics such as average precision and NDCG. The Cranfield paradigm and the TREC conferences, as well as more recent techniques such as interleaving, will be discussed.
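
The two metrics named above can be sketched in a few lines of Python. This is an illustrative example rather than module material. It assumes binary relevance labels for average precision, with every relevant document appearing in the ranking, and the linear-gain form of DCG with a log2 rank discount; other formulations, such as an exponential gain of 2^g - 1, are also in common use. The function names are hypothetical.

```python
import math


def average_precision(ranked_relevance):
    """Average precision for one query, given binary relevance labels in rank order.

    Assumes every relevant document appears somewhere in the ranking.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def dcg(gains):
    """Discounted cumulative gain with linear gain and a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG of the ranking divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0
```

For example, a ranking with relevance labels `[1, 0, 1, 0]` has precision 1/1 at the first relevant document and 2/3 at the second, so its average precision is their mean.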

Data Mining:

  • Study basic techniques, algorithms, and systems of data mining and analytics, including frequent pattern, correlation, and association analysis, and basic machine learning algorithms such as linear regression and logistic regression. Basic personalisation and usage mining techniques will also be discussed.

Emerging Areas:

  • Study emerging areas such as learning to rank, deep learning, word embeddings and topic modelling.

Requisites:

To be eligible to select this module as an optional or elective, a student must: (1) be registered on a programme and year of study for which it is formally available; (2) have an understanding of probability and statistics; and (3) have proficiency in Java or Python programming (as demonstrated by at least one programming project in the past).

Module deliveries for 2024/25 academic year

Intended teaching term: Term 2, Postgraduate (FHEQ Level 7)

Teaching and assessment

Mode of study
In person
Methods of assessment
100% Coursework
Mark scheme
Numeric Marks

Other information

Number of students on module in previous year
92
Module leader
Professor Ingemar Cox
Who to contact for more information
cs.pgt-students@ucl.ac.uk

Intended teaching term: Term 2, Undergraduate (FHEQ Level 7)

Teaching and assessment

Mode of study
In person
Methods of assessment
100% Coursework
Mark scheme
Numeric Marks

Other information

Number of students on module in previous year
20
Module leader
Professor Ingemar Cox
Who to contact for more information
cs.pgt-students@ucl.ac.uk

Last updated

This module description was last updated on 8th April 2024.
