Ritabrata Dutta (Aalto University): Sequential Mixture Models on Model Space: Retrieval of Experiments

Sequential learning with underlying mixture models has recently become a topic of interest in machine learning, driven by the application of Dirichlet process mixture (DPM) models to online learning. Existing methods for fitting DPMs to big datasets or sequentially arriving datasets suffer from their dependence on the ordering of the observations. The particle learning approach to mixture models addresses this issue through resampling and the sharing of information between multiple particles. Here we apply sequential learning to retrieving relevant experiments given a query experiment, motivated by the public databases of datasets in molecular biology and other experimental sciences, and by the need of scientists to relate to earlier work at the level of actual measurement data. We formulate retrieval as a “supermodelling” problem: sequentially learning a model of the set of posterior distributions, represented as sets of MCMC samples. In our previous work, we showed that this approach can successfully retrieve relevant experiments in rather difficult biological problems. In this paper we extend that work by assuming batch-specific sparsity (a batch being the MCMC samples from one experiment) and by modeling the existing MCMC samples with an underlying Hierarchical Dirichlet Process (HDP). We extend the particle learning approach to the HDP and compare it with other online HDP schemes. We show that our HDP-based particle learning method works well in the experiment retrieval scenario.

This is joint work with Samuel Kaski.