Statistique - Probabilités - Optimisation et Contrôle

Igor Melnykov (University of Minnesota Duluth) "Finite mixture models in semi-supervised clustering with hard constraints"

Salle René Baire (IMB)

Finite mixtures of probability distributions provide a powerful and flexible mechanism for modeling real-life data that cannot be easily approximated by elementary distributions. Among other applications, mixture models have found use in representing groups of observations in cluster analysis. Unsupervised clustering takes place without any restrictions on the membership of points in the partition, while semi-supervised clustering occurs in the presence of partial information about that membership. We consider a scenario where the partition is determined subject to two specific types of hard constraints imposed on the solution. Under positive constraints, certain points are joined together so that they must belong to the same cluster. Under negative constraints, certain points are prevented from belonging to the same cluster. We work on an approach that accommodates both positive and negative constraints in the setting of model-based clustering, and we consider the changes that must be made to the implementation of the EM algorithm, compared with the unsupervised case, when finite mixtures are employed.
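
To make the effect of a positive constraint on the E-step concrete, here is a minimal illustrative sketch. It is not the method presented in the talk. It assumes a univariate Gaussian mixture and assumes that each block of positively constrained points carries a single latent label with prior weight pi_k, so the posterior of a block is proportional to pi_k times the product of the component densities over the block. Negative constraints are not handled. The function names, parameters, and data are hypothetical.

```python
import math


def log_normal_pdf(x, mean, sd):
    """Log density of a univariate normal distribution."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def block_responsibilities(xs, blocks, weights, means, sds):
    """E-step in which every block of points shares one cluster label.

    blocks is a list of lists of indices into xs. Points listed together
    are positively constrained; an unconstrained point is a singleton
    block. Returns one posterior probability vector per block.
    Assumption: one latent label per block, with prior weights[k].
    """
    result = []
    for block in blocks:
        # log( pi_k * product over the block of f_k(x_i) ) for each k
        logs = [
            math.log(w) + sum(log_normal_pdf(xs[i], m, s) for i in block)
            for w, m, s in zip(weights, means, sds)
        ]
        top = max(logs)  # log-sum-exp shift for numerical stability
        unnorm = [math.exp(v - top) for v in logs]
        total = sum(unnorm)
        result.append([u / total for u in unnorm])
    return result


# Hypothetical data: the first three points are tied together.
xs = [-0.2, 0.1, 2.4, 5.1, 4.8]
blocks = [[0, 1, 2], [3], [4]]
post = block_responsibilities(xs, blocks, [0.5, 0.5], [0.0, 5.0], [1.0, 1.0])
```

With singleton blocks only, this reduces to the usual unsupervised E-step. In this toy data, the point 2.4 lies about midway between the two component means, yet it inherits the posterior of its block, which is dominated by the two points near 0.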