My PhD student Mohamed Khalil El Mahrsi has been studying trajectories of mobile objects as obtained from GPS devices. He first worked on on-the-fly trajectory approximation, in which one discards incoming positions in order to avoid saturating the device link and/or the trajectory database. He then moved to trajectory analysis, in a data mining sense. In this context, we considered constrained trajectories, that is, movements on a road network. This gives a sort of grammar to trajectories, as their structure is more or less imposed by the underlying network. Inspired by this analogy, we followed a general text mining approach coupled with graph mining algorithms (what a mix ;-). This gave interesting results in terms of recovering movement patterns that could be used for, e.g., city planning. All of this is covered by the following publications:
The thesis defense took place on the 30th of September. Khalil gave an excellent talk in front of a large jury made up of:
and myself.
The summary of the thesis follows:
In this thesis, we explore two problems related to managing and mining moving object trajectories.
First, we study the problem of sampling trajectory data streams. Modern location-aware devices are capable of capturing and transmitting their position at very high rates. Storing the entirety of the trajectories provided by such devices can entail severe storage and processing overheads. Therefore, adapted sampling techniques are necessary in order to discard unneeded positions and reduce the size of the trajectories while still preserving their key spatiotemporal features. In streaming environments, this process needs to be conducted "on-the-fly" since the data are transient and arrive continuously. To this end, we introduce a new sampling algorithm called Spatiotemporal Stream Sampling (STSS). This algorithm is computationally efficient and guarantees an upper bound for the approximation error introduced during the sampling process. Experimental results show that STSS achieves good performance and can compete with more sophisticated and costly approaches.
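To make the idea of on-the-fly sampling with a bounded error concrete, here is a minimal sketch of a generic prediction-based filter. It is an illustration only and not the STSS algorithm itself: the function name `sample_stream`, the `(t, x, y)` point format and the linear extrapolation rule are all assumptions chosen for this example, and timestamps are assumed to be strictly increasing.

```python
from math import hypot


def sample_stream(points, max_error):
    """Yield a subset of (t, x, y) points read from a stream.

    Hypothetical illustration, not STSS: a point is kept only when it lies
    more than max_error away from the position obtained by linearly
    extrapolating the last two kept points. Timestamps must be strictly
    increasing.
    """
    kept = []  # only the last two kept points are remembered
    for t, x, y in points:
        if len(kept) < 2:
            # The first two points are always kept to estimate a velocity.
            kept.append((t, x, y))
            yield (t, x, y)
            continue
        (t0, x0, y0), (t1, x1, y1) = kept
        dt = t1 - t0
        vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
        # Position predicted at time t from the last kept point and velocity.
        px = x1 + vx * (t - t1)
        py = y1 + vy * (t - t1)
        if hypot(x - px, y - py) > max_error:
            kept = [kept[1], (t, x, y)]
            yield (t, x, y)


# An object moving in a straight line, then suddenly deviating.
stream = [(0, 0, 0), (1, 1, 0), (2, 2, 0), (3, 3, 0), (4, 4, 5)]
sampled = list(sample_stream(stream, max_error=0.5))
```

In this sketch, every discarded point is within `max_error` of the position extrapolated from the last two kept points, and only a constant amount of state is stored, which is what makes it usable on a stream.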
The second problem we study is clustering trajectory data in road network environments. Most prior work assumed that moving objects can move freely in a Euclidean space and did not consider the presence of an underlying road network and its influence on evaluating the similarity between trajectories. We present three approaches to clustering such data: the first approach discovers clusters of trajectories that traveled along the same parts of the road network; the second approach is segment-oriented and aims to group together road segments based on the trajectories that they have in common; the third approach combines both aspects and simultaneously clusters trajectories and road segments. Through extensive case studies, we show how these approaches can be used to reveal useful knowledge about flow dynamics and characterize traffic in road networks. We also provide experimental results in which we evaluate the performance of our proposals.
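As a rough illustration of how trajectories on a road network can be compared in a text mining style, the sketch below treats each trajectory as a "document" whose "words" are road segment identifiers, weights segments with tf-idf, and links trajectories whose cosine similarity is positive. The function names, the exact weighting and the threshold are assumptions made for this example and are not necessarily the ones used in the thesis. The resulting weighted graph is the kind of object a graph clustering algorithm could then be run on.

```python
from collections import Counter
from math import log, sqrt


def tfidf_vectors(trajectories):
    """Map each trajectory (a list of road segment ids) to tf-idf weights."""
    n = len(trajectories)
    # Number of trajectories in which each segment appears at least once.
    doc_freq = Counter(seg for traj in trajectories for seg in set(traj))
    vectors = []
    for traj in trajectories:
        counts = Counter(traj)
        vectors.append(
            {seg: c * log(n / doc_freq[seg]) for seg, c in counts.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(seg, 0.0) for seg, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_graph(trajectories, threshold=0.0):
    """Return {(i, j): similarity} for pairs whose similarity exceeds threshold."""
    vectors = tfidf_vectors(trajectories)
    edges = {}
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            s = cosine(vectors[i], vectors[j])
            if s > threshold:
                edges[(i, j)] = s
    return edges


# Three toy trajectories given as sequences of hypothetical segment ids.
trajs = [["a", "b", "c"], ["a", "b", "d"], ["e", "f"]]
edges = similarity_graph(trajs)
```

Here the first two trajectories share segments `a` and `b` and end up connected, while the third shares nothing with the others and stays isolated.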
Mohamed Khalil El Mahrsi, Analyse et fouille de données de trajectoires d’objets mobiles
The thesis is available on TEL here. It's very well written and well worth reading, in my opinion (shared by the referees and by the jury as a whole).