PhD Project Themes

Developing a Data Mining Method for Data Diffusion Detection

Big Data Analytics refers to the analysis and mining of large and complex datasets, which requires specially designed, efficient algorithms. This project is mainly concerned with developing an analytics methodology for detecting diffusion in spatio-temporal data. Loosely speaking, the aim is a method that detects how events diffuse over time, that is, the directions in which they spread. A possible case study is detecting how crime, or certain patterns of crime, spreads geographically over a given period of time. Alternative case studies may be proposed by the applicant and will be considered.
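As a rough illustration of what "direction of spread" could mean in practice, the sketch below groups events into fixed time windows and measures how the centroid of the events moves from one window to the next. This is only a naive baseline under stated assumptions, not the methodology the project aims to develop: the function name `centroid_drift`, the `(t, x, y)` event format and the planar coordinates are all hypothetical choices made for the example.

```python
import math
from collections import defaultdict


def centroid_drift(events, window):
    """Estimate the direction of spread between consecutive time windows.

    events: iterable of (t, x, y) tuples (hypothetical format: timestamp
            plus planar coordinates).
    window: length of each time window, in the same unit as t.

    Returns a list of (window_index, dx, dy, bearing_degrees), one entry
    per pair of consecutive non-empty windows. The bearing is measured
    clockwise from the +y axis (0 = north, 90 = east).
    """
    # Accumulate coordinate sums and counts per window in one pass.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for t, x, y in events:
        acc = sums[int(t // window)]
        acc[0] += x
        acc[1] += y
        acc[2] += 1

    # One centroid per non-empty window, in chronological order.
    centroids = [(w, sx / n, sy / n) for w, (sx, sy, n) in sorted(sums.items())]

    drift = []
    for (_, x0, y0), (w1, x1, y1) in zip(centroids, centroids[1:]):
        dx, dy = x1 - x0, y1 - y0
        bearing = math.degrees(math.atan2(dx, dy)) % 360
        drift.append((w1, dx, dy, bearing))
    return drift


if __name__ == "__main__":
    # Toy data: events move east between the first and second window.
    toy = [(0, 0, 0), (1, 2, 0), (10, 5, 0), (11, 7, 0)]
    print(centroid_drift(toy, window=10))
```

A single centroid hides a great deal (several hotspots, spread in more than one direction at once), which is part of why a dedicated diffusion detection method is needed.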

For enquiries please contact me here.

You can find relevant publications for the project here.

Find out how to apply here.

Real-time Parallel Big Data Analysis

The Real-time Parallel Big Data Analysis project is concerned with the development of scalable Big Data Analytics techniques for fast data streams. There is significant activity going on in Big Data Analytics throughout the University, as Big Data techniques underpin large areas of the University’s research activity in much the same way that the research platforms underpin experimental research.

This project is intended to be a groundwork contribution to the field of Big Data Analytics. In the digital universe, large quantities of data are generated in real time and hence need to be analysed on the fly, which imposes serious computational constraints in terms of memory and processing time. The research area of Data Stream Mining (DSM) deals with these constraints by developing fast and adaptive data mining techniques that cope with the speed at which data is generated and with its variability (by adapting the analysis techniques to changes in the data). However, DSM is still in its infancy. Scalability is addressed only through the use of single-pass algorithms rather than multiple iterations over the same data. Yet there are many applications, some of which are outlined below, where these DSM algorithms fail to scale well because of the velocity of the incoming data. This project aims to remove this bottleneck through the development of parallel and adaptive real-time data analytics algorithms and techniques.
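To make the single-pass idea concrete, here is a minimal sketch using Welford's online algorithm for the running mean and variance. It is a generic textbook example and not one of the project's algorithms; the class name `RunningStats` is an assumption made for illustration. Each value is seen once and then discarded, so memory use stays constant however long the stream runs.

```python
class RunningStats:
    """Single-pass (online) mean and sample variance via Welford's algorithm."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new value into the statistics; x is not stored."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance (n - 1 denominator); 0.0 until two values are seen."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


if __name__ == "__main__":
    stats = RunningStats()
    for value in [2, 4, 4, 4, 5, 5, 7, 9]:
        stats.update(value)
    print(stats.mean, stats.variance)
```

The update is cheap, but it is still sequential: every value passes through one object on one core. That is the kind of bottleneck parallel stream analytics would have to address, for example by keeping partial statistics per partition and merging them.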

For enquiries please contact me here.

You can find relevant publications for the project here.

Find out how to apply here.

Automatic Classification of Data Streams with Sparse Class Labels

Data Stream Mining has become a hot topic and is concerned with the analysis of data that arrives in real time and at high speed. Two general challenges in Data Stream Mining are that (1) the data stream is infinite, so storing the data and learning offline is not possible, and (2) the pattern in the data may change over time (known as concept drift). Challenge (1) is typically met through algorithms that need only one pass through the data; challenge (2) is typically met through frequent feedback about the current pattern, which reveals changes of pattern encoded in the stream.
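The feedback-based handling of concept drift can be sketched as follows. The detector below monitors a model's running error rate and signals drift when it rises well above the best level seen so far. It is a simplified sketch in the spirit of error-rate monitors such as the Drift Detection Method (DDM), not a faithful implementation of any published detector. The class name, the `min_samples` and `threshold` defaults and the toy stream are assumptions for illustration. Note that it needs to know after every prediction whether the model was wrong, which is exactly the feedback that is scarce when class labels are sparse.

```python
import math


class ErrorRateDriftDetector:
    """Signal drift when the running error rate rises far above its minimum."""

    def __init__(self, min_samples=30, threshold=3.0):
        self.min_samples = min_samples  # warm-up before any decision is made
        self.threshold = threshold      # allowed rise, in standard deviations
        self.reset()

    def reset(self):
        """Forget all statistics, e.g. after a drift has been signalled."""
        self.n = 0
        self.errors = 0
        self.best = math.inf  # lowest p + s observed so far
        self.p_min = 0.0
        self.s_min = 0.0

    def update(self, mistake):
        """Feed one feedback item; mistake is True if the prediction was wrong.

        Returns True when drift is signalled (the detector then resets).
        """
        self.n += 1
        self.errors += int(mistake)
        p = self.errors / self.n             # running error rate
        s = math.sqrt(p * (1 - p) / self.n)  # its standard deviation
        if self.n < self.min_samples:
            return False
        if p + s < self.best:
            self.best = p + s
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + self.threshold * self.s_min:
            self.reset()
            return True
        return False


if __name__ == "__main__":
    # Toy feedback stream: 10% errors for 200 items, then every prediction fails.
    stream = [i % 10 == 9 for i in range(200)] + [True] * 100
    detector = ErrorRateDriftDetector()
    print([i for i, m in enumerate(stream) if detector.update(m)])
```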

This project will explore predictive data stream mining applications where this change of patterns encoded in the stream is less explicit and sudden, and thus challenging to detect, e.g. prediction of traffic congestion or Twitter news feeds. In order to address this challenge, this PhD project will develop new unsupervised or semi-supervised Data Stream Mining algorithms and workflows and evaluate them on concrete case studies.

For enquiries please contact me here.

You can find relevant publications for the project here.

Find out how to apply here.
