2012 is nearing its end, at least from a publication point of view. I still have at least one pending paper (to be published in the Co-Clustering workshop of IEEE ICDM 2012 in December), but most of my 2012 papers are already published and available on my publication page. This is a good time for a summary of my recent production.
Most of my work published in 2012 has been dedicated to clustering. I've been working with PhD student Matthieu Durut on cloud implementations of online versions of the k-means algorithm and with my former PhD student Brieuc Conan-Guez on clustering of dissimilarity data. For the latter work, we leveraged our earlier studies on graph clustering to use multilevel refinement tricks in the dissimilarity context. I hope to keep working in this direction in the coming months.
An example of multilevel refinement: the two-class top-level partition is modified using sub-clusters at different levels of the clustering hierarchy.
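For readers unfamiliar with the online k-means mentioned above, here is a minimal single-machine sketch of the classical sequential (MacQueen-style) update, in which each incoming point moves its closest prototype a little. It is only an illustration of the general idea: it is not the cloud implementation developed with Matthieu, and the function name `online_kmeans` and its parameters are my own assumptions.

```python
import random


def online_kmeans(points, k, seed=0):
    """Sequential k-means sketch: a single pass, one point at a time.

    Hypothetical helper for illustration only (not the cloud version).
    """
    rng = random.Random(seed)
    # Initialise the prototypes with k points drawn from the data.
    centers = [list(p) for p in rng.sample(points, k)]
    counts = [1] * k  # each prototype starts as the average of one point
    for x in points:
        # Winner: the prototype closest to x (squared Euclidean distance).
        j = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centers[c])),
        )
        counts[j] += 1
        # Running-average update: move the winner towards x by 1/count.
        step = 1.0 / counts[j]
        centers[j] = [c + step * (a - c) for a, c in zip(x, centers[j])]
    return centers


data = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.4), (10.0, 10.0), (9.5, 9.8), (9.9, 9.6)]
prototypes = online_kmeans(data, k=2)
```

Because each update is a convex combination of a prototype and a data point, the prototypes always stay inside the convex hull of the data. The result depends on the initial draw and on the order of the points, which is one reason distributed versions are not trivial.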
I've been advising Mohamed Khalil El Mahrsi since early 2011. We work on trajectory clustering with a focus on objects (e.g., cars) moving on a constrained network. We had a series of publications in 2012:
We are now working on co-clustering of this type of data.
Road occupancy is color coded using a standard heat colormap (red for saturated roads, grey for unused ones). The width of each road is also proportional to its occupancy.
The display uses a similar colormap but with local weighting, that is, red corresponds here to the roads most used by the trajectories of this cluster.
I've been working on functional data for quite a long time (since 2000, roughly). My most recent paper on the subject is with my PhD student Romain Guigourès and his adviser at Orange Labs, Marc Boullé. It's an application of Marc's MODL method to unsupervised learning, more precisely to clustering functional data. The method is highly efficient and provides density estimates in a completely parameter-free way. A drawback of MODL is that it can find subtle patterns which are not always easy to interpret. Romain has been working on ways to circumvent this problem and to enable exploratory analysis based on MODL's proven density estimation quality.
We have also applied MODL to graph clustering. We consider in particular the case of temporal graphs: we have a fixed set of actors (the vertices of the graph) that interact at certain times. Each (directed) interaction is timestamped (we allow as many interactions as needed between two given actors, that is, we work with directed multigraphs). Using MODL, we build a block model for temporal graphs which clusters actors as sources of interactions and actors as receivers of interactions, and simultaneously segments the timeline into intervals. This work will be presented at the Co-Clustering workshop of ICDM 2012 (stay tuned for the English paper!).
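To make the data structure concrete, here is a small sketch of a temporal directed multigraph stored as a list of (source, receiver, timestamp) triples, together with the table of interaction counts induced by a given block structure. It does not implement MODL or its selection criterion: the two clusterings and the time cut points are simply given by hand, and the name `block_counts` and the toy data are assumptions made for this example.

```python
from bisect import bisect_right
from collections import Counter


def block_counts(interactions, source_cluster, receiver_cluster, time_cuts):
    """Count interactions per (source cluster, receiver cluster, interval).

    `time_cuts` must be sorted; c cut points define c + 1 time intervals.
    Hypothetical helper: it only summarises a given block structure.
    """
    counts = Counter()
    for src, dst, t in interactions:
        # Interval index = number of cut points that are <= t.
        interval = bisect_right(time_cuts, t)
        counts[(source_cluster[src], receiver_cluster[dst], interval)] += 1
    return counts


# Toy multigraph: actor "a" contacts "b" twice, hence two separate triples.
interactions = [
    ("a", "b", 1.0),
    ("a", "b", 2.0),
    ("b", "c", 2.5),
    ("c", "a", 7.0),
    ("a", "c", 8.0),
]
# An actor may belong to different clusters as a source and as a receiver.
source_cluster = {"a": 0, "b": 0, "c": 1}
receiver_cluster = {"a": 0, "b": 1, "c": 1}
time_cuts = [5.0]  # two intervals: before 5.0, and from 5.0 onwards

counts = block_counts(interactions, source_cluster, receiver_cluster, time_cuts)
```

A block model of the kind described above searches over the two clusterings and the cut points at the same time; this sketch only shows the three-way table that a single candidate structure produces.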
I've not published anything in information visualization this year (even if the work with Romain includes nice and rather original visual representations of our results), but in February I co-organized the Dagstuhl seminar on Information Visualization, Visual Data Mining and Machine Learning. This was a unique event which gathered specialists of information visualization and of machine learning in the perfect Dagstuhl environment. A summary of our work is available as a Dagstuhl report and the seminar was presented in Informatik-Spektrum.
I've also published in 2012 a chapter on information propagation in social networks. This work is rather unique in that it is based on polls and interviews conducted in France, with a longitudinal study on a very large sample of 4,500 people. These data enabled us to design a local propagation model that was then implemented on random graphs to study its global properties.
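The general approach of running a local rule on a random graph and then observing what happens globally can be sketched in a few lines. The code below is not the model from the chapter, which was designed from the survey data: as a stand-in, it uses the textbook independent cascade rule on an Erdős–Rényi random graph, and all names and parameter values are assumptions chosen for illustration.

```python
import random


def random_graph(n, p, rng):
    """Undirected Erdős–Rényi graph G(n, p) as a dict of neighbour sets."""
    neighbours = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                neighbours[u].add(v)
                neighbours[v].add(u)
    return neighbours


def cascade(neighbours, seeds, transmit_prob, rng):
    """Independent cascade (stand-in local rule, not the chapter's model).

    Each newly informed node gets one chance to inform each of its
    uninformed neighbours, succeeding with probability `transmit_prob`.
    """
    informed = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in sorted(neighbours[u]):
                if v not in informed and rng.random() < transmit_prob:
                    informed.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return informed


rng = random.Random(42)
graph = random_graph(200, 0.03, rng)
reached = cascade(graph, seeds=[0], transmit_prob=0.3, rng=rng)
fraction_informed = len(reached) / len(graph)  # a simple global property
```

Repeating the simulation over many graphs and seeds gives an estimate of how the fraction of informed people depends on the local transmission probability.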
Finally, I had the opportunity to participate in a survey article on neural networks for complex data, in which I described, among other things, my work on functional data and on dissimilarity data.
Published
27 October 2012