# Voronoï and logistic regression

This notebook studies how relevant a logistic regression model is in certain configurations. It also looks at the Voronoï diagram associated with a three-class logistic regression, and gives some intuition about the kinds of models a logistic regression can solve.

In [1]:
from jyquickhelper import add_notebook_menu

In [2]:
%matplotlib inline


## Logistic regression

In [3]:
from sklearn.datasets import load_iris
data = load_iris()
X, y = data.data[:, :2], data.target

In [4]:
from sklearn.linear_model import LogisticRegression
clr = LogisticRegression()
clr.fit(X, y)

Out[4]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
verbose=0, warm_start=False)
In [5]:
clr.coef_

Out[5]:
array([[-2.49579289,  4.01011301],
[ 0.49709451, -1.63380222],
[ 1.15921404, -1.77736568]])
In [6]:
clr.intercept_

Out[6]:
array([ 0.81713932,  1.22543562, -2.22516119])
In [7]:
import numpy
x = numpy.array([[1, 2]])
clr.decision_function(x)

Out[7]:
array([[ 6.34157245, -1.54507432, -4.6206785 ]])
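For a single observation, `predict` returns the class with the highest decision value. A quick check, using the `clr` and `x` defined above:

```python
# The largest decision value (6.34, for class 0) determines the prediction.
print(clr.predict(x))                           # [0]
print(clr.decision_function(x).argmax(axis=1))  # [0]
```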
In [8]:
A = clr.coef_
B = clr.intercept_


Let us check that the decision function matches the following formula.

In [9]:
(A@x.T).T.ravel() + B

Out[9]:
array([ 6.34157245, -1.54507432, -4.6206785 ])
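The values match `Out[7]`; the comparison can be made explicit with the objects already defined:

```python
# The manual affine formula reproduces decision_function exactly.
assert numpy.allclose(clr.decision_function(x), (A @ x.T).T + B)
```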
In [10]:
import matplotlib.pyplot as plt

def draw_border(clr, X, y, fct=None, incx=1, incy=1, figsize=None, border=True, ax=None):
    # colors: see https://sashat.me/2017/01/11/list-of-20-simple-distinct-colors/
    # and https://matplotlib.org/examples/color/colormaps_reference.html
    h = .02  # step size in the mesh
    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max]x[y_min, y_max].
    x_min, x_max = X[:, 0].min() - incx, X[:, 0].max() + incx
    y_min, y_max = X[:, 1].min() - incy, X[:, 1].max() + incy
    xx, yy = numpy.meshgrid(numpy.arange(x_min, x_max, h),
                            numpy.arange(y_min, y_max, h))
    if fct is None:
        Z = clr.predict(numpy.c_[xx.ravel(), yy.ravel()])
    else:
        Z = fct(clr, numpy.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    cmap = plt.cm.tab20
    Z = Z.reshape(xx.shape)
    if ax is None:
        fig, ax = plt.subplots(1, 1, figsize=figsize or (4, 3))
    ax.pcolormesh(xx, yy, Z, cmap=cmap)

    # Plot also the training points
    ax.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', cmap=cmap)
    ax.set_xlabel('Sepal length')
    ax.set_ylabel('Sepal width')
    ax.set_xlim(xx.min(), xx.max())
    ax.set_ylim(yy.min(), yy.max())

    # Draw the decision lines
    x1, x2 = xx.min(), xx.max()
    if border:
        # one dashed line per pair of classes:
        # <L_i - L_j, x> + B_i - B_j = 0
        for i in range(0, clr.coef_.shape[0]):
            for j in range(i + 1, clr.coef_.shape[0]):
                delta = clr.coef_[i] - clr.coef_[j]
                db = clr.intercept_[i] - clr.intercept_[j]
                y1 = (-db - delta[0] * x1) / delta[1]
                y2 = (-db - delta[0] * x2) / delta[1]
                ax.plot([x1, x2], [y1, y2], '--', color="white")
    else:
        # one dashed line per class (one versus rest):
        # <L_i, x> + B_i = 0
        for i in range(0, clr.coef_.shape[0]):
            delta = clr.coef_[i]
            db = clr.intercept_[i]
            y1 = (-db - delta[0] * x1) / delta[1]
            y2 = (-db - delta[0] * x2) / delta[1]
            ax.plot([x1, x2], [y1, y2], '--', color="yellow")

    return ax

fig, ax = plt.subplots(1, 2, figsize=(10, 4))
draw_border(clr, X, y, ax=ax[0])
draw_border(clr, X, y, border=False, ax=ax[1])
ax[0].set_title("Boundary between 2 classes")
ax[1].set_title("Boundary between one class and the others");


## A few Voronoï diagrams

In [11]:
points = numpy.array([[1, 2], [3, 4], [4, 1]])

In [12]:
from scipy.spatial import Voronoi, voronoi_plot_2d
vor = Voronoi(points)

In [13]:
fig, ax = plt.subplots(figsize=(4,4))
ax.ishold = lambda: True  # bug between scipy and matplotlib 3.0
voronoi_plot_2d(vor, ax=ax)
ax.set_xlim([0, 5])
ax.set_ylim([0, 5])
ax.axis('off');

In [14]:
vor.point_region

Out[14]:
array([3, 1, 2], dtype=int64)
In [15]:
vor.vertices

Out[15]:
array([[2.75, 2.25]])
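With three input points, the diagram has a single vertex: it is their circumcenter, equidistant from all three points. A quick check with the `vor` and `points` defined above:

```python
# All three distances from the Voronoï vertex to the input points are equal.
v = vor.vertices[0]
print([numpy.linalg.norm(v - p) for p in points])
```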
In [16]:
from matplotlib.patches import Circle

points = numpy.array([[1, 1], [2, 4], [4, 1], [6, 3]])
vor = Voronoi(points)
fig, ax = plt.subplots(figsize=(4, 4))
for i in range(vor.vertices.shape[0]):
    # circle centered on a Voronoï vertex, its radius is the distance
    # to the nearest input points (point 2 touches every circle here)
    v = vor.vertices[i, :]
    d = v - points[2, :]
    r = d.dot(d) ** 0.5
    circle = Circle((v[0], v[1]), r, fill=False, ls='--', edgecolor='g', visible=True)
    ax.add_artist(circle)  # add the circle to the axes
for i in range(points.shape[0]):
    for j in range(i + 1, points.shape[0]):
        if i == 0 and j == 3:
            continue
        ax.plot(points[[i, j], 0], points[[i, j], 1], "g-")
ax.ishold = lambda: True  # bug between scipy and matplotlib 3.0
voronoi_plot_2d(vor, ax=ax)
ax.set_xlim([0, 7])
ax.set_ylim([0, 7])
ax.axis('off');

In [17]:
import math

n = 5
a = math.pi * 2 / 3
points = []
for i in range(n):
    for j in range(n):
        points.append([i + j * math.cos(a), j * math.sin(a)])
points = numpy.array(points)

In [18]:
vor = Voronoi(points)

In [19]:
fig, ax = plt.subplots(figsize=(4,4))
ax.ishold = lambda: True  # bug between scipy and matplotlib 3.0
voronoi_plot_2d(vor, ax=ax)
ax.set_xlim([-1.5, 4])
ax.set_ylim([-1.5, 4])
ax.axis('off');


## A close Voronoï diagram

We look for points $P_1, P_2, P_3$ such that the boundary between classes $i$ and $j$ is the perpendicular bisector of the segment $[P_i, P_j]$: the signed decision values at $P_i$ and $P_j$ must be opposite (so that, by linearity, the midpoint falls on the boundary) and $P_i - P_j$ must be parallel to the normal $L_i - L_j$. This reads:

$$\begin{array}{ll} &\left\{\begin{array}{l}\left<L_i - L_j, P_i \right> + B_i - B_j = - \left\{ \left<L_i - L_j, P_j \right> + B_i - B_j \right \} \\ P_i- P_j - \left<P_i - P_j, \frac{L_i-L_j}{\Vert L_i-L_j\Vert} \right> \frac{L_i-L_j}{\Vert L_i-L_j\Vert }=0 \end{array} \right. \\ \Longleftrightarrow & \left\{\begin{array}{l}\left<L_i - L_j, P_i + P_j\right> + 2 (B_i - B_j) = 0 \\ P_i- P_j - \left<P_i - P_j, \frac{L_i-L_j}{\Vert L_i-L_j \Vert} \right> \frac{L_i-L_j}{\Vert L_i-L_j\Vert}=0 \end{array} \right. \\ \Longrightarrow & \left\{\begin{array}{l} \left<L_i - L_j, P_i + P_j \right> + 2 (B_i - B_j) = 0 \\ \left<P_i- P_j, u \right> - \left<P_i - P_j, \frac{L_i-L_j}{\Vert L_i-L_j\Vert} \right> \left<\frac{L_i-L_j}{\Vert L_i-L_j\Vert},u \right>=0 \end{array} \right. \end{array}$$

Here $u$ is an arbitrary unit vector. We solve this as a linear system $LP = B$ where the vector $P$ stacks the coordinates of all the points we are looking for. According to the page cited above, in the case of a three-class diagram, this system has infinitely many solutions.

In [20]:
import numpy

matL = []
matB = []
L = clr.coef_
B = clr.intercept_
for i in range(0, L.shape[0]):
    for j in range(i + 1, L.shape[0]):
        li = L[i, :]
        lj = L[j, :]
        c = li - lj
        nc = (c.T @ c) ** 0.5

        # condition 1: <L_i - L_j, P_i + P_j> + 2 (B_i - B_j) = 0
        mat = numpy.zeros(L.shape)
        mat[i, :] = c
        mat[j, :] = c
        d = -2 * (B[i] - B[j])
        matB.append(d)
        matL.append(mat.ravel())

        # condition 2 hides several equations,
        # we only keep the first coordinate
        c /= nc
        c2 = c * c[0]
        mat = numpy.zeros(L.shape)
        mat[i, :] = -c2
        mat[j, :] = c2
        mat[i, 0] += 1
        mat[j, 0] -= 1
        matB.append(0)
        matL.append(mat.ravel())

matL = numpy.array(matL)
matB = numpy.array(matB)
matL.shape, matB.shape, numpy.linalg.det(matL)

Out[20]:
((6, 6), (6,), 2.0281820935727704e-16)
In [21]:
import pandas
pandas.DataFrame(matL)

Out[21]:
          0         1         2         3         4         5
0 -2.992887  5.643915 -2.992887  5.643915  0.000000  0.000000
1  0.780516  0.413897 -0.780516 -0.413897  0.000000  0.000000
2 -3.655007  5.787479  0.000000  0.000000 -3.655007  5.787479
3  0.714879  0.451472  0.000000  0.000000 -0.714879 -0.451472
4  0.000000  0.000000 -0.662120  0.143563 -0.662120  0.143563
5  0.000000  0.000000  0.044902  0.207088 -0.044902 -0.207088

The determinant is very small, which suggests the matrix is singular, and we know it is in this case. We replace the last equation by pinning one coordinate of one of the points.
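A determinant close to zero only suggests singularity; the rank of the matrix, computed on the `matL` built above, makes it explicit:

```python
# A rank strictly below 6 confirms the system is under-determined.
print(numpy.linalg.matrix_rank(matL))
```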

In [22]:
matL[-1,:] = 0
matL[-1,0] = 1
matB[-1] = 3
numpy.linalg.det(matL)

Out[22]:
42.07770646874508

In [23]:
import pandas
df = pandas.DataFrame(matL)
df['B'] = matB
df

Out[23]:
          0         1         2         3         4         5         B
0 -2.992887  5.643915 -2.992887  5.643915  0.000000  0.000000  0.816593
1  0.780516  0.413897 -0.780516 -0.413897  0.000000  0.000000  0.000000
2 -3.655007  5.787479  0.000000  0.000000 -3.655007  5.787479 -6.084601
3  0.714879  0.451472  0.000000  0.000000 -0.714879 -0.451472  0.000000
4  0.000000  0.000000 -0.662120  0.143563 -0.662120  0.143563 -6.901194
5  1.000000  0.000000  0.000000  0.000000  0.000000  0.000000  3.000000
In [24]:
from numpy.linalg import inv
points = (inv(matL) @ matB).reshape((3,2))
points

Out[24]:
array([[3.        , 4.12377262],
[5.03684606, 0.2827372 ],
[5.48745959, 0.18503334]])
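These points can be checked against the first condition of the system: the midpoint of each pair $(P_i, P_j)$ must lie on the boundary between classes $i$ and $j$. A small sketch with the `L`, `B` and `points` defined above:

```python
# <L_i - L_j, (P_i + P_j) / 2> + B_i - B_j should be close to zero.
for i in range(3):
    for j in range(i + 1, 3):
        mid = (points[i] + points[j]) / 2
        print(i, j, (L[i] - L[j]) @ mid + B[i] - B[j])
```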
In [25]:
x = points[0, :]
c1 = (L@x.T).T.ravel() + B
x = points[1, :]
c2 = (L@x.T).T.ravel() + B
x = points[2, :]
c3 = (L@x.T).T.ravel() + B
numpy.vstack([c1,c2,c3])

Out[25]:
array([[  9.86655487,  -4.02070972,  -6.07697098],
[-10.61997713,   3.26728747,   3.1110941 ],
[-12.13641872,   3.65091377,   3.80710713]])
In [26]:
ax = draw_border(clr, X, y, incx=2, incy=2)
ax.plot(points[:, 0], points[:, 1], 'ro');


## Logistic regression on a grid

In [27]:
Xs = []
Ys = []
n = 20
for i in range(0, 4):
    for j in range(0, 3):
        x1 = numpy.random.rand(n) + i * 1.1
        x2 = numpy.random.rand(n) + j * 1.1
        Xs.append(numpy.vstack([x1, x2]).T)
        Ys.extend([i * 3 + j] * n)
X = numpy.vstack(Xs)
Y = numpy.array(Ys)
X.shape, Y.shape

Out[27]:
((240, 2), (240,))
In [28]:
set(Y)

Out[28]:
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}

In [29]:
fig, ax = plt.subplots(1, 1, figsize=(6, 4))
for i in range(0, 12):
    ax.plot(X[Y == i, 0], X[Y == i, 1], 'o', label="cl%d" % i, color=plt.cm.tab20.colors[i])
ax.legend()
ax.set_title("Twelve-class classification\non a grid");

In [30]:
from sklearn.linear_model import LogisticRegression
clr = LogisticRegression()
clr.fit(X, Y)

Out[30]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
verbose=0, warm_start=False)
In [31]:
ax = draw_border(clr, X, Y, incx=1, incy=1, figsize=(12,8), border=False)

In [32]:
clr.score(X, Y)

Out[32]:
0.6958333333333333

We duplicate the features and square the copies. The problem is just as simple, but the logistic regression now has more uncorrelated variables to rely on.

In [33]:
def create_feat(X):
    # append the squared coordinates to the original features
    X2 = X.copy()
    X2[:, 0] = X2[:, 0] * X2[:, 0]
    X2[:, 1] = X2[:, 1] * X2[:, 1]
    XX2 = numpy.hstack([X, X2])
    return XX2

clr2 = LogisticRegression()
clr2.fit(create_feat(X), Y)

Out[33]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
verbose=0, warm_start=False)
In [34]:
def fct_predict(clr, X):
    return clr.predict(create_feat(X))

ax = draw_border(clr2, X, Y, fct=fct_predict, incx=1, incy=1, figsize=(12,8), border=False)

In [35]:
clr2.score(create_feat(X), Y)

Out[35]:
0.9583333333333334

Since this classification problem is equivalent to a Voronoï diagram, and was in fact built as one, the shortfall of the logistic regression seems to come from a numerical convergence issue rather than from the theoretical model. To verify this, we play with the training parameters. First, the optimization algorithm.

In [36]:
clr_t = LogisticRegression(solver='lbfgs')
clr_t.fit(X, Y)
clr_t.score(X, Y)

Out[36]:
0.9
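Another training parameter worth testing is the iteration budget; `max_iter` appears among the model's parameters above. A sketch (the score it reaches is not asserted here):

```python
# Give the solver more iterations, in case the default 100 was not enough.
clr_it = LogisticRegression(solver='lbfgs', max_iter=1000)
clr_it.fit(X, Y)
print(clr_it.score(X, Y))
```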
In [37]:
ax = draw_border(clr_t, X, Y, incx=1, incy=1, figsize=(6,4), border=False)


Next, we change the way the problem is solved. Rather than solving n binary classification problems (one versus rest), we solve a single problem whose classification loss is that of a Multinomial logistic regression.

In [38]:
clr_t = LogisticRegression(solver='lbfgs', multi_class='multinomial')
clr_t.fit(X, Y)
clr_t.score(X, Y)

Out[38]:
0.9875
In [39]:
fig, ax = plt.subplots(1, 2, figsize=(12, 4))
draw_border(clr_t, X, Y, incx=1, incy=1, figsize=(6,4), border=False, ax=ax[0])
draw_border(clr_t, X, Y, incx=1, incy=1, figsize=(6,4), border=True, ax=ax[1])