.. _bayesianwithpythonrst:

===================================
2A.ml - Bayesian models with Python
===================================

.. only:: html

    **Links:** :download:`notebook `, :downloadlink:`html `,
    :download:`python `, :downloadlink:`slides `,
    :githublink:`GitHub|_doc/notebooks/2a/bayesian_with_python.ipynb|*`

Mixture models. Bayesian statistics. *bayespy*, *scikit-learn*.

.. code:: ipython3

    from jyquickhelper import add_notebook_menu
    add_notebook_menu()

.. contents::
    :local:

You can read `Probabilistic Programming and Bayesian Methods for Hackers `__.
Results may differ between the examples below: the dataset is the same,
but each module optimises with different default parameters. We try
different Python modules on the same Bayesian problem, a Gaussian
mixture, built from the following data.

.. code:: ipython3

    %matplotlib inline

.. code:: ipython3

    import matplotlib.pyplot as plt

.. code:: ipython3

    import numpy as np

    y0 = np.random.multivariate_normal([0, 0], [[2, 0], [0, 0.1]], size=50)
    y1 = np.random.multivariate_normal([0, 0], [[0.1, 0], [0, 2]], size=50)
    y2 = np.random.multivariate_normal([5, 2], [[2, -1.5], [-1.5, 2]], size=50)
    y3 = np.random.multivariate_normal([-2, -2], [[0.5, 0], [0, 0.5]], size=50)
    y = np.vstack([y0, y1, y2, y3])
    X = y

    fig, ax = plt.subplots(1, 1, figsize=(10, 7))
    ax.plot(y[:, 0], y[:, 1], "o");

.. image:: bayesian_with_python_6_0.png
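
Because the sample is drawn at random, every run produces a slightly
different cloud and slightly different fits below. A minimal sketch to
make a run reproducible, not part of the original notebook and assuming
a fixed seed is acceptable (the value 0 is arbitrary); it would have to
be executed before the sampling cell above:

.. code:: ipython3

    # hypothetical reproducibility helper: freeze numpy's global seed
    # before drawing y0..y3 so that every library below sees the same data
    np.random.seed(0)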

bayespy .. raw:: html


The module `bayespy `__ makes it possible to build and estimate simple
Bayesian models. I simply replicate the example on the `Gaussian
mixture model `__. We define the model:

.. code:: ipython3

    from bayespy import __version__ as v1
    from numpy import __version__ as v2
    v1, v2

.. parsed-literal::

    ('0.5.22', '1.22.1')

If this does not work with numpy 1.14, you should try version 1.13
(most likely, this `exception `__ is the problem).

.. code:: ipython3

    N = 200  # number of data vectors
    D = 2    # dimension
    K = 10   # maximum number of clusters

.. code:: ipython3

    from bayespy.nodes import Dirichlet, Categorical, Gaussian, Wishart, Mixture

    alpha = Dirichlet(1e-5*np.ones(K), name='alpha')
    Z = Categorical(alpha, plates=(N,), name='z')
    mu = Gaussian(np.zeros(D), 1e-5*np.identity(D), plates=(K,), name='mu')
    sigma = Wishart(D, 1e-5*np.identity(D), plates=(K,), name='Lambda')
    Y = Mixture(Z, Gaussian, mu, sigma, name='Y')

.. code:: ipython3

    Z.initialize_from_random()

.. code:: ipython3

    from bayespy.inference import VB
    Q = VB(Y, mu, sigma, Z, alpha)

.. code:: ipython3

    Y.observe(y)

.. code:: ipython3

    import time
    if not hasattr(time, 'clock'):
        # bayespy still uses time.clock, which was removed in Python 3.8
        setattr(time, 'clock', time.perf_counter)

.. code:: ipython3

    Q.update(repeat=1000)

.. parsed-literal::

    Iteration 1: loglike=-1.497701e+03 (0.012 seconds)
    Iteration 2: loglike=-1.359319e+03 (0.009 seconds)
    Iteration 3: loglike=-1.345008e+03 (0.007 seconds)
    Iteration 4: loglike=-1.333128e+03 (0.007 seconds)
    Iteration 5: loglike=-1.321888e+03 (0.008 seconds)
    Iteration 6: loglike=-1.308002e+03 (0.007 seconds)
    Iteration 7: loglike=-1.290391e+03 (0.007 seconds)
    Iteration 8: loglike=-1.275647e+03 (0.006 seconds)
    Iteration 9: loglike=-1.264646e+03 (0.008 seconds)
    Iteration 10: loglike=-1.251051e+03 (0.006 seconds)
    Iteration 11: loglike=-1.229172e+03 (0.006 seconds)
    Iteration 12: loglike=-1.182531e+03 (0.006 seconds)
    Iteration 13: loglike=-1.162046e+03 (0.007 seconds)
    Iteration 14: loglike=-1.129390e+03 (0.008 seconds)
    Iteration 15: loglike=-1.114487e+03 (0.007 seconds)
    Iteration 16: loglike=-1.089347e+03 (0.008 seconds)
    Iteration 17: loglike=-1.086340e+03 (0.007 seconds)
    Iteration 18: loglike=-1.083643e+03 (0.008 seconds)
    Iteration 19: loglike=-1.080921e+03 (0.007 seconds)
    Iteration 20: loglike=-1.077116e+03 (0.008 seconds)
    Iteration 21: loglike=-1.064775e+03 (0.007 seconds)
    Iteration 22: loglike=-1.052069e+03 (0.006 seconds)
    Iteration 23: loglike=-1.031870e+03 (0.006 seconds)
    Iteration 24: loglike=-1.020058e+03 (0.006 seconds)
    Iteration 25: loglike=-1.008205e+03 (0.011 seconds)
    Iteration 26: loglike=-9.924005e+02 (0.011 seconds)
    Iteration 27: loglike=-9.920968e+02 (0.016 seconds)
    Iteration 28: loglike=-9.917380e+02 (0.018 seconds)
    Iteration 29: loglike=-9.912025e+02 (0.008 seconds)
    Iteration 30: loglike=-9.903257e+02 (0.009 seconds)
    Iteration 31: loglike=-9.892435e+02 (0.008 seconds)
    Iteration 32: loglike=-9.877160e+02 (0.007 seconds)
    Iteration 33: loglike=-9.859192e+02 (0.006 seconds)
    Iteration 34: loglike=-9.836574e+02 (0.009 seconds)
    Iteration 35: loglike=-9.814675e+02 (0.007 seconds)
    Iteration 36: loglike=-9.798386e+02 (0.006 seconds)
    Iteration 37: loglike=-9.793723e+02 (0.006 seconds)
    Iteration 38: loglike=-9.792857e+02 (0.006 seconds)
    Iteration 39: loglike=-9.792329e+02 (0.008 seconds)
    Iteration 40: loglike=-9.791737e+02 (0.006 seconds)
    Iteration 41: loglike=-9.791036e+02 (0.008 seconds)
    Iteration 42: loglike=-9.790199e+02 (0.007 seconds)
    Iteration 43: loglike=-9.789201e+02 (0.006 seconds)
    Iteration 44: loglike=-9.788011e+02 (0.006 seconds)
    Iteration 45: loglike=-9.786598e+02 (0.006 seconds)
    Iteration 46: loglike=-9.784926e+02 (0.007 seconds)
    Iteration 47: loglike=-9.782935e+02 (0.007 seconds)
    Iteration 48: loglike=-9.780498e+02 (0.007 seconds)
    Iteration 49: loglike=-9.777315e+02 (0.006 seconds)
    Iteration 50: loglike=-9.772782e+02 (0.005 seconds)
    Iteration 51: loglike=-9.766198e+02 (0.006 seconds)
    Iteration 52: loglike=-9.757762e+02 (0.010 seconds)
    Iteration 53: loglike=-9.748480e+02 (0.011 seconds)
    Iteration 54: loglike=-9.738679e+02 (0.011 seconds)
    Iteration 55: loglike=-9.728080e+02 (0.007 seconds)
    Iteration 56: loglike=-9.716088e+02 (0.006 seconds)
    Iteration 57: loglike=-9.701531e+02 (0.005 seconds)
    Iteration 58: loglike=-9.684097e+02 (0.006 seconds)
    Iteration 59: loglike=-9.668369e+02 (0.007 seconds)
    Iteration 60: loglike=-9.658912e+02 (0.006 seconds)
    Iteration 61: loglike=-9.654709e+02 (0.004 seconds)
    Iteration 62: loglike=-9.652997e+02 (0.000 seconds)
    Iteration 63: loglike=-9.652227e+02 (0.000 seconds)
    Iteration 64: loglike=-9.651824e+02 (0.023 seconds)
    Iteration 65: loglike=-9.651584e+02 (0.006 seconds)
    Iteration 66: loglike=-9.651426e+02 (0.008 seconds)
    Iteration 67: loglike=-9.651314e+02 (0.007 seconds)
    Iteration 68: loglike=-9.651230e+02 (0.007 seconds)
    Converged at iteration 68.

.. code:: ipython3

    import bayespy.plot as bpplt

    fig, ax = plt.subplots(1, 1, figsize=(10, 7))
    bpplt.gaussian_mixture_2d(Y, alpha=alpha, scale=2, color="black",
                              fill=True, axes=ax)

.. image:: bayesian_with_python_19_0.png

We get the result of the optimization:

.. code:: ipython3

    from bayespy.inference.vmp.nodes.gaussian import GaussianWishartMoments
    parent = Y.parents[1]
    (mu, _, sigma, _) = parent.get_moments()
    mu, sigma

.. parsed-literal::

    (array([[-1.36058471e-01, -5.54785591e-01],
            [ 0.00000000e+00,  0.00000000e+00],
            [ 0.00000000e+00,  0.00000000e+00],
            [ 7.32146904e+00,  4.25483005e+00],
            [ 0.00000000e+00,  0.00000000e+00],
            [ 8.75857358e-02,  9.11669540e-03],
            [ 3.72250304e+04,  1.40121275e+04],
            [ 0.00000000e+00,  0.00000000e+00],
            [ 0.00000000e+00,  0.00000000e+00],
            [-7.71374280e+00, -7.35395097e+00]]),
     array([[[ 6.52316512e-01, -1.63828746e-01],
             [-1.63828746e-01,  8.96234523e+00]],
            [[ 2.00000000e+05,  0.00000000e+00],
             [ 0.00000000e+00,  2.00000000e+05]],
            [[ 2.00000000e+05,  0.00000000e+00],
             [ 0.00000000e+00,  2.00000000e+05]],
            [[ 1.28998747e+00,  6.13731151e-01],
             [ 6.13731151e-01,  6.90986759e-01]],
            [[ 2.00000000e+05,  0.00000000e+00],
             [ 0.00000000e+00,  2.00000000e+05]],
            [[ 8.65297348e+00,  9.29871380e-03],
             [ 9.29871380e-03,  3.55806853e-01]],
            [[ 4.64596327e+03,  1.74893727e+03],
             [ 1.74893727e+03,  6.60113981e+02]],
            [[ 2.00000000e+05,  0.00000000e+00],
             [ 0.00000000e+00,  2.00000000e+05]],
            [[ 2.00000000e+05,  0.00000000e+00],
             [ 0.00000000e+00,  2.00000000e+05]],
            [[ 2.88152054e+00,  9.46475241e-01],
             [ 9.46475241e-01,  2.54402981e+00]]]))

For a Gaussian-Wishart parent, these moments are
:math:`\langle \Lambda \mu \rangle` and :math:`\langle \Lambda \rangle`
(the variable named ``sigma`` therefore holds expected *precision*
matrices, not covariances), so the posterior means are recovered by
solving a linear system:

.. code:: ipython3

    import numpy as np
    mu2 = np.linalg.solve(sigma, mu)
    mu2

.. parsed-literal::

    array([[-0.22515766, -0.06601764],
           [ 0.        ,  0.        ],
           [ 0.        ,  0.        ],
           [ 4.75563304,  1.93368381],
           [ 0.        ,  0.        ],
           [ 0.01009479,  0.02535878],
           [ 8.21785907, -0.54595493],
           [ 0.        ,  0.        ],
           [ 0.        ,  0.        ],
           [-1.96797779, -2.158508  ]])

The way you build the model is quite nice, but the module still needs
some development. *scikit-learn* proposes a better interface.
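
For completeness, a small sketch, not part of the original notebook, of
how one might read the inferred labels out of the *bayespy* posterior
before switching libraries. It assumes that the first moment returned
by a ``Categorical`` node is the matrix of posterior cluster
probabilities, one row per observation:

.. code:: ipython3

    # assumption: Z.get_moments()[0] has shape (N, K) and row n holds the
    # posterior probability that point n belongs to each of the K clusters
    proba = Z.get_moments()[0]
    labels = proba.argmax(axis=1)
    sorted(set(labels))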

scikit-learn .. raw:: html


We try to solve the same problem with another module:
`scikit-learn `__.

.. code:: ipython3

    from sklearn import mixture

    gmm = mixture.GaussianMixture(n_components=10, covariance_type='full')
    gmm.fit(X)

.. parsed-literal::

    GaussianMixture(n_components=10)

.. code:: ipython3

    dpgmm = mixture.BayesianGaussianMixture(n_components=10, covariance_type='full')
    dpgmm.fit(X)

.. parsed-literal::

    BayesianGaussianMixture(n_components=10)

.. code:: ipython3

    import itertools
    import matplotlib as mpl

    color_iter = itertools.cycle(['r', 'g', 'b', 'c', 'm'])
    f, axarr = plt.subplots(1, 2, figsize=(14, 7))

    for i, (clf, title) in enumerate([(gmm, 'GMM'),
                                      (dpgmm, 'Dirichlet Process GMM')]):
        splot = axarr[i]
        Y_ = clf.predict(X)
        for j, (mean, covar, color) in enumerate(zip(
                clf.means_, clf.covariances_, color_iter)):
            v, w = np.linalg.eigh(covar)
            u = w[0] / np.linalg.norm(w[0])
            # as the DP will not use every component it has access to
            # unless it needs it, we shouldn't plot the redundant
            # components.
            if not np.any(Y_ == j):
                continue
            splot.scatter(X[Y_ == j, 0], X[Y_ == j, 1], 0.8, color=color)

            # Plot an ellipse to show the Gaussian component
            angle = np.arctan(u[1] / u[0])
            angle = 180 * angle / np.pi  # convert to degrees
            ell = mpl.patches.Ellipse(mean, v[0], v[1], angle=180 + angle,
                                      color=color)
            ell.set_clip_box(splot.bbox)
            ell.set_alpha(0.5)
            splot.add_artist(ell)
        splot.set_title(title)

.. image:: bayesian_with_python_28_0.png
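
The Dirichlet process prior shrinks the weights of unneeded components
toward zero, which is why the right-hand plot shows fewer ellipses. A
quick way to check, sketched here and not part of the original notebook
(the 0.01 threshold is arbitrary), is to look at the fitted mixture
weights:

.. code:: ipython3

    # both estimators expose the mixture weights after fitting
    print(gmm.weights_.round(3))
    print(dpgmm.weights_.round(3))
    # count the components that carry a non-negligible share of the data
    print("effective components:", int((dpgmm.weights_ > 0.01).sum()))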