In approximate Bayesian computation (ABC), the similarity, or “distance”, between two data sets is usually measured by the Euclidean distance between summary statistics. Both the distance function and the summary statistics are critical for the success of the algorithm. At present, researchers choose these two quantities subjectively. The goal of our work is to automate this choice. An automated choice makes ABC easier for non-specialists to use and protects researchers against bad choices. Our approach is to take the discriminability (classifiability) between the observed data set and the artificially generated one as the similarity measure: the harder the two data sets are to tell apart, the more similar they are. By doing this, we reduce the problem of choosing an appropriate distance function and summary statistics to a classification problem, for which we can leverage existing solutions. We show the applicability of our approach on both simulated and real data.
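To make the idea concrete, here is a minimal sketch of classifiability used as the discrepancy in a plain ABC rejection sampler. Everything specific in it is an assumption chosen for illustration and is not taken from the talk: the toy Gaussian simulator, the uniform prior, the nearest-mean classifier, the 5-fold cross-validation, and the acceptance threshold of 0.55. The function names (`simulate`, `classifiability`, `abc_rejection`) are hypothetical.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy simulator (assumed): n draws from a unit-variance Gaussian with mean theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def classifiability(x_obs, x_sim, n_folds=5):
    """Cross-validated accuracy of a nearest-mean classifier separating x_obs from x_sim.

    Values near 0.5 mean the two data sets are hard to tell apart (similar);
    values near 1.0 mean they are easy to tell apart (dissimilar).
    """
    correct = 0
    for fold in range(n_folds):
        # Hold out every n_folds-th point of each data set; train on the rest.
        train_obs = [x for i, x in enumerate(x_obs) if i % n_folds != fold]
        train_sim = [x for i, x in enumerate(x_sim) if i % n_folds != fold]
        held_out = [(x, 0) for i, x in enumerate(x_obs) if i % n_folds == fold]
        held_out += [(x, 1) for i, x in enumerate(x_sim) if i % n_folds == fold]
        mean_obs = statistics.fmean(train_obs)
        mean_sim = statistics.fmean(train_sim)
        for x, label in held_out:
            # Nearest-mean rule: assign the label of the closer class mean.
            predicted = 0 if abs(x - mean_obs) <= abs(x - mean_sim) else 1
            correct += predicted == label
    return correct / (len(x_obs) + len(x_sim))


def abc_rejection(x_obs, n_draws, threshold, rng):
    """Keep prior draws whose simulated data a classifier cannot separate from x_obs."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # assumed uniform prior on the mean
        x_sim = simulate(theta, len(x_obs), rng)
        if classifiability(x_obs, x_sim) <= threshold:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
x_obs = simulate(2.0, 200, rng)  # "observed" data with true mean 2.0
posterior_sample = abc_rejection(x_obs, n_draws=500, threshold=0.55, rng=rng)
print(len(posterior_sample), statistics.fmean(posterior_sample))
```

No summary statistics or distance function are chosen by hand here; the classifier's held-out accuracy plays both roles. In practice the nearest-mean rule would presumably be replaced by a stronger off-the-shelf classifier, since this one can only detect differences in the mean.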

This is joint work with Ritabrata Dutta, Samuel Kaski, and Jukka Corander.

April 26, 2014 at 14:22

[…] University of Helsinki pointed out to me at the end of my talk, there are similarities between the classification method he exposed at MCMSki 4 in Chamonix and our use of random forests. Before my talk, I attended the […]