Slides of my recent talks are available below.

Talk given on February 28th at the Machine Learning Group of UCL. Slides and abstract:

Graphs are commonly used to represent interactions between entities. In numerous situations, those interactions can be repeated and happen at specific times. This leads naturally to the concept of temporal graphs. In general, those graphs are represented as a time series of static graphs using a crude time quantization technique: the data analyst chooses a time scale and disregards temporal information at a finer scale than the chosen one. This is done by aggregating all interactions that happen during each time period of this scale into a static graph. For instance, one can produce daily interaction graphs. While this approach can produce interesting results, it cannot adapt to more complex situations where a single time scale cannot capture the full temporal dynamics of the interactions. It is also blind to changes happening at a finer scale than the chosen one and is highly dependent on this choice. In this presentation, I will describe a generative model that does not use a time series representation of a temporal graph but rather addresses the temporal structure directly. The model is based on the principle of the stochastic block model, in which the interactions between two entities (vertices in the graph) depend only on their hidden classes. I will extend this principle from the static setting to a temporal one by describing interactions between two classes of vertices via a non-homogeneous Poisson point process (NHPPP). The complexity of those NHPPPs will be controlled by enforcing a piecewise constant structure on the intensity functions, with globally shared intervals. As a consequence, the estimation of the model will provide both a clustering structure for the vertices and time intervals in which all the NHPPPs have constant but distinct intensities. This latter structure could be used to produce a time series of graphs with stationary interaction structure, leading to an automated local time scale analysis.
I will conclude the presentation with examples of results obtained on real-world data and mention extensions to interactions with attached content.
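To make the generative mechanism concrete, here is a minimal simulation sketch of the model described above: vertices carry hidden classes, and each vertex pair interacts according to a Poisson process whose piecewise constant intensity depends only on the pair's classes and on globally shared time intervals. All numbers (class count, cut points, intensity values) are illustrative assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: 2 hidden classes, 3 globally shared time intervals on [0, 30).
boundaries = np.array([0.0, 10.0, 20.0, 30.0])  # shared interval cut points
# intensities[k, l, i]: interaction rate between classes k and l on interval i
intensities = np.array([
    [[2.0, 0.2, 1.0], [0.1, 0.5, 0.1]],
    [[0.1, 0.5, 0.1], [0.3, 3.0, 0.3]],
])

n_vertices = 20
classes = rng.integers(0, 2, size=n_vertices)  # hidden class of each vertex

def simulate_pair(k, l):
    """Interaction times for one pair: piecewise-constant-intensity NHPPP."""
    times = []
    for i in range(len(boundaries) - 1):
        a, b = boundaries[i], boundaries[i + 1]
        # On each interval the process is homogeneous: draw a Poisson count,
        # then place the events uniformly within the interval.
        n = rng.poisson(intensities[k, l, i] * (b - a))
        times.extend(rng.uniform(a, b, size=n))
    return np.sort(times)

events = {(u, v): simulate_pair(classes[u], classes[v])
          for u in range(n_vertices) for v in range(u + 1, n_vertices)}
```

Estimating the model amounts to inverting this simulation: recovering the classes, the shared cut points, and the block intensities from the observed event times alone.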

Plenary talk given on December 1st at the 3rd Big Data Conference of Linnaeus University. Slides and abstract:

Making sense of medium to large data sets remains a very difficult challenge, especially when both the number of objects and the number of variables are large. The classical way of exploring such data sets remains a combination of clustering methods and low-dimensional visual representations. Clustering methods are used to group similar objects, while low-dimensional visual representations enable the analyst to make sense of the relationships between clusters. However, truly high-dimensional data sets cannot be represented faithfully in low dimension, a fact that strongly limits the practical usefulness of this standard analysis methodology on modern data sets. A potential solution is offered by the co-clustering framework, in which both objects and variables are summarized. The main advantage of clustering variables rather than trying to build a low-dimensional representation is that the former scales easily to complex data with high intrinsic dimension. However, most co-clustering methods cannot handle large data sets or mixed data sets (with numerical and categorical variables). In this talk, I will present a general principle based on grid modeling which can be used, in particular, to circumvent the limitations of co-clustering and thus to explore medium to large scale data sets. I will first present the general idea of generative modeling, then introduce our non-parametric generative model. I will give examples of how the general idea can be adapted to different settings. The last part of the talk will focus on the co-clustering case.
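As a small illustration of the co-clustering idea, the sketch below simulates a data matrix with a latent grid structure (row clusters crossed with column clusters) and then summarizes each block by its mean. The cluster counts, block means, and Gaussian noise model are illustrative assumptions, not the grid model presented in the talk.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative latent block structure: 200 objects in 3 row clusters,
# 50 variables in 2 column clusters.
row_classes = rng.integers(0, 3, size=200)
col_classes = rng.integers(0, 2, size=50)
block_means = np.array([[0.0, 2.0],
                        [1.5, -1.0],
                        [-2.0, 0.5]])  # one mean per (row cluster, column cluster)

# Each entry is Gaussian around the mean of the block its row and column fall in.
X = rng.normal(block_means[row_classes][:, col_classes], 1.0)

# The co-clustering summary: one cell per block, i.e. a 3 x 2 grid
# summarizing the full 200 x 50 matrix.
summary = np.array([[X[row_classes == k][:, col_classes == l].mean()
                     for l in range(2)] for k in range(3)])
```

The appeal sketched here is that the summary grid stays small and interpretable regardless of the intrinsic dimension of the data, which is what a low-dimensional embedding cannot guarantee.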