{"cells": [{"cell_type": "markdown", "metadata": {}, "source": ["# Tech - calcul matriciel avec numpy\n", "\n", "[numpy](https://numpy.org/) est la librairie incontournable pour faire des calculs en Python. Ces fonctionnalit\u00e9s sont disponibles dans tous les langages et utilisent les optimisations processeurs. Il est hautement improbable d'\u00e9crire un code aussi rapide sans l'utiliser.\n", "\n", "[numpy](https://numpy.org/) impl\u00e9mente ce qu'on appelle les op\u00e9rations matricielles basiques ou plus commun\u00e9ment appel\u00e9es [BLAS](https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms). Quelque soit le langage, l'impl\u00e9mentation est r\u00e9alis\u00e9e en langage bas niveau (C, fortran, assembleur) et a \u00e9t\u00e9 peaufin\u00e9e depuis 50 ans au gr\u00e9 des am\u00e9liorations mat\u00e9rielles."]}, {"cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [{"data": {"text/html": ["
run previous cell, wait for 2 seconds
\n", ""], "text/plain": [""]}, "execution_count": 2, "metadata": {}, "output_type": "execute_result"}], "source": ["from jyquickhelper import add_notebook_menu\n", "add_notebook_menu()"]}, {"cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": ["%matplotlib inline"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Enonc\u00e9\n", "\n", "La librairie [numpy](https://numpy.org/) propose principalement deux types : [array](https://numpy.org/doc/stable/reference/generated/numpy.array.html) et [matrix](https://numpy.org/doc/stable/reference/generated/numpy.matrix.html). Pour faire simple, prenez toujours le premier. Ca \u00e9vite les erreurs. Les [array](https://numpy.org/doc/stable/reference/generated/numpy.array.html) sont des tableaux \u00e0 plusieurs dimensions."]}, {"cell_type": "markdown", "metadata": {}, "source": ["### La ma\u00eetrise du slice\n", "\n", "Le slice est l'op\u00e9rateur ``:`` (d\u00e9crit sur la page [indexing](https://numpy.org/doc/stable/reference/arrays.indexing.html)). Il permet de r\u00e9cup\u00e9rer une ligne, une colonne, un intervalle de valeurs."]}, {"cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[ 0, 5, 6, -3],\n", " [ 6, 7, -4, 8],\n", " [-5, 8, -4, 9]])"]}, "execution_count": 4, "metadata": {}, "output_type": "execute_result"}], "source": ["import numpy\n", "\n", "mat = numpy.array([[0, 5, 6, -3],\n", " [6, 7, -4, 8],\n", " [-5, 8, -4, 9]])\n", "mat"]}, {"cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [{"data": {"text/plain": ["(array([[ 0, 5, 6, -3],\n", " [ 6, 7, -4, 8]]),\n", " array([[ 0, 5],\n", " [ 6, 7],\n", " [-5, 8]]),\n", " -3,\n", " array([[0, 5],\n", " [6, 7]]))"]}, "execution_count": 5, "metadata": {}, "output_type": "execute_result"}], "source": ["mat[:2], mat[:, :2], mat[0, 3], mat[0:2, 0:2]"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### La ma\u00eetrise du nan\n", "\n", "[nan](https://numpy.org/doc/stable/reference/constants.html) est une convention pour d\u00e9signer une valeur manquante. Elle r\u00e9agit de fa\u00e7on un peut particuli\u00e8re. Elle n'est \u00e9gale \u00e0 aucune autre y compris elle-m\u00eame."]}, {"cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [{"data": {"text/plain": ["False"]}, "execution_count": 6, "metadata": {}, "output_type": "execute_result"}], "source": ["numpy.nan == numpy.nan"]}, {"cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [{"data": {"text/plain": ["False"]}, "execution_count": 7, "metadata": {}, "output_type": "execute_result"}], "source": ["numpy.nan == 4"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Il faut donc utiliser une fonction sp\u00e9ciale [isnan](https://numpy.org/doc/stable/reference/generated/numpy.isnan.html)."]}, {"cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [{"data": {"text/plain": ["True"]}, "execution_count": 8, "metadata": {}, "output_type": "execute_result"}], "source": ["numpy.isnan(numpy.nan)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### La ma\u00eetrise des types\n", "\n", "Un tableau est d\u00e9fini par ses dimensions et le type unique des \u00e9l\u00e9ments qu'il contient."]}, {"cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [{"data": {"text/plain": ["((3,), dtype('int32'))"]}, "execution_count": 9, "metadata": {}, "output_type": "execute_result"}], "source": ["matint = numpy.array([0, 1, 2])\n", "matint.shape, matint.dtype"]}, {"cell_type": "markdown", "metadata": {}, "source": ["C'est le m\u00eame type pour toute la matrice. Il existe plusieurs type d'entiers et des r\u00e9els pour des questions de performances."]}, {"cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["508 ns \u00b1 44.7 ns per loop (mean \u00b1 std. dev. of 7 runs, 1000000 loops each)\n"]}], "source": ["%timeit matint * matint"]}, {"cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [{"data": {"text/plain": ["((3,), dtype('float64'))"]}, "execution_count": 11, "metadata": {}, "output_type": "execute_result"}], "source": ["matintf = matint.astype(numpy.float64)\n", "matintf.shape, matintf.dtype"]}, {"cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["484 ns \u00b1 30.4 ns per loop (mean \u00b1 std. dev. of 7 runs, 1000000 loops each)\n"]}], "source": ["%timeit matintf * matintf"]}, {"cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["894 ns \u00b1 78.8 ns per loop (mean \u00b1 std. dev. of 7 runs, 1000000 loops each)\n"]}], "source": ["%timeit matintf * matint"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Un changement de type et le calcul est plus long."]}, {"cell_type": "markdown", "metadata": {}, "source": ["### La ma\u00eetrise du broadcasting\n", "\n", "Le [broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html) signifie que certaines op\u00e9rations ont un sens m\u00eame si les dimensions des tableaux ne sont pas tout \u00e0 fait \u00e9gales."]}, {"cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[ 0, 5, 6, -3],\n", " [ 6, 7, -4, 8],\n", " [-5, 8, -4, 9]])"]}, "execution_count": 14, "metadata": {}, "output_type": "execute_result"}], "source": ["mat"]}, {"cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[1000, 1005, 1006, 997],\n", " [1006, 1007, 996, 1008],\n", " [ 995, 1008, 996, 1009]])"]}, "execution_count": 15, "metadata": {}, "output_type": "execute_result"}], "source": ["mat + 1000"]}, {"cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[ 0, 15, 106, 997],\n", " [ 6, 17, 96, 1008],\n", " [ -5, 18, 96, 1009]])"]}, "execution_count": 16, "metadata": {}, "output_type": "execute_result"}], "source": ["mat + numpy.array([0, 10, 100, 1000])"]}, {"cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[ 0, 5, 6, -3],\n", " [ 16, 17, 6, 18],\n", " [ 95, 108, 96, 109]])"]}, "execution_count": 17, "metadata": {}, "output_type": "execute_result"}], "source": ["mat + numpy.array([[0, 10, 100]]).T"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### La ma\u00eetrise des index"]}, {"cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[ 0, 5, 6, -3],\n", " [ 6, 7, -4, 8],\n", " [-5, 8, -4, 9]])"]}, "execution_count": 18, "metadata": {}, "output_type": "execute_result"}], "source": ["mat = numpy.array([[0, 5, 6, -3],\n", " [6, 7, -4, 8],\n", " [-5, 8, -4, 9]])\n", "mat"]}, {"cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[False, True, False, False],\n", " [False, False, False, False],\n", " [False, False, False, False]])"]}, "execution_count": 19, "metadata": {}, "output_type": "execute_result"}], "source": ["mat == 5"]}, {"cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[ True, False, False, False],\n", " [False, False, True, False],\n", " [False, False, False, True]])"]}, "execution_count": 20, "metadata": {}, "output_type": "execute_result"}], "source": ["mat == numpy.array([[0, -4, 9]]).T"]}, {"cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[1, 0, 0, 0],\n", " [0, 0, 1, 0],\n", " [0, 0, 0, 1]], dtype=int64)"]}, "execution_count": 21, "metadata": {}, "output_type": "execute_result"}], "source": ["(mat == numpy.array([[0, -4, 9]]).T).astype(numpy.int64)"]}, {"cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[ 0, 0, 0, 0],\n", " [ 0, 0, -4, 0],\n", " [ 0, 0, 0, 9]], dtype=int64)"]}, "execution_count": 22, "metadata": {}, "output_type": "execute_result"}], "source": ["mat * (mat == numpy.array([[0, -4, 9]]).T).astype(numpy.int64)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### La ma\u00eetrise des fonctions\n", "\n", "On peut regrouper les op\u00e9rations que [numpy](https://numpy.org/) propose en diff\u00e9rents th\u00e8mes. Mais avant il \n", "\n", "* L'**initialisation** : [array](https://numpy.org/doc/stable/reference/generated/numpy.array.html), [empty](https://numpy.org/doc/stable/reference/generated/numpy.empty.html), [zeros](https://numpy.org/doc/stable/reference/generated/numpy.zeros.html), [ones](https://numpy.org/doc/stable/reference/generated/numpy.ones.html), [full](https://numpy.org/doc/stable/reference/generated/numpy.full.html), [identity](https://numpy.org/doc/stable/reference/generated/numpy.identity.html), [rand](https://numpy.org/doc/stable/reference/random/generated/numpy.random.rand.html), [randn](https://numpy.org/doc/stable/reference/random/generated/numpy.random.randn.html), [randint](https://numpy.org/doc/stable/reference/random/generated/numpy.random.randint.html)\n", "* Les **op\u00e9rations basiques** : `+`, `-`, `*`, `/`, `@`, [dot](https://numpy.org/doc/stable/reference/generated/numpy.dot.html)\n", "* Les **transformations** : [transpose](https://numpy.org/doc/stable/reference/generated/numpy.transpose.html), [hstack](https://numpy.org/doc/stable/reference/generated/numpy.hstack.html), [vstack](https://numpy.org/doc/stable/reference/generated/numpy.vstack.html), [reshape](https://numpy.org/doc/stable/reference/generated/numpy.reshape.html), [squeeze](https://numpy.org/doc/stable/reference/generated/numpy.squeeze.html), [expend_dims](https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html)\n", "* Les **op\u00e9rations de r\u00e9duction** : [minimum](https://numpy.org/doc/stable/reference/generated/numpy.minimum.html), [maximum](https://numpy.org/doc/stable/reference/generated/numpy.maximum.html), [argmin](https://numpy.org/doc/stable/reference/generated/numpy.argmin.html), [argmax](https://numpy.org/doc/stable/reference/generated/numpy.argmax.html), [sum](https://numpy.org/doc/stable/reference/generated/numpy.sum.html), [mean](https://numpy.org/doc/stable/reference/generated/numpy.mean.html), [prod](https://numpy.org/doc/stable/reference/generated/numpy.prod.html), [var](https://numpy.org/doc/stable/reference/generated/numpy.var.html), [std](https://numpy.org/doc/stable/reference/generated/numpy.std.html)\n", "* Tout le reste comme la g\u00e9n\u00e9ration de matrices al\u00e9atoires, le calcul des valeurs, vecteurs propres, des fonctions commme [take](https://numpy.org/doc/stable/reference/generated/numpy.take.html), ..."]}, {"cell_type": "markdown", "metadata": {}, "source": ["### Q1 : calculer la valeur du $\\chi_2$ d'un tableau de contingence\n", "\n", "La formule est [l\u00e0](https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test#Testing_for_statistical_independence). Et il faut le faire sans boucle. Vous pouvez comparer avec la fonction [chisquare](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chisquare.html) de la librairie [scipy](https://www.scipy.org/) qui est une extension de [numpy](https://numpy.org/).\n", "\n", "$$\\chi_2 = N \\sum_{i,j} p_{i.} p_{.j} \\left( \\frac{\\frac{O_{ij}}{N} - p_{i.} p_{.j}}{p_{i.} p_{.j}}\\right)^2$$"]}, {"cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": []}, {"cell_type": "markdown", "metadata": {}, "source": ["### Q2 : calculer une distribution un peu particuli\u00e8re\n", "\n", "La fonction [histogram](https://numpy.org/doc/stable/reference/generated/numpy.histogram.html) permet de calculer la distribution empirique de variables. Pour cette question, on tire un vecteur al\u00e9atoire de taille 10 avec la fonction [rand](https://numpy.org/doc/stable/reference/random/generated/numpy.random.rand.html), on les trie par ordre croissant, on recommence plein de fois, on calcule la distribution du plus grand nombre, du second plus grand nombre, ..., du plus petit nombre."]}, {"cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": []}, {"cell_type": "markdown", "metadata": {}, "source": ["### Q3 : on veut cr\u00e9er une matrice identit\u00e9 un million par un million\n", "\n", "Vous pouvez essayer sans r\u00e9fl\u00e9chir ou lire cette page d'abord : [csr_matrix](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html)."]}, {"cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": []}, {"cell_type": "markdown", "metadata": {}, "source": ["### Q4 : vous devez cr\u00e9er l'application StopCovid\n", "\n", "Il existe une machine qui re\u00e7oit la position de 3 millions de t\u00e9l\u00e9phones portable. On veut identifier les cas contacts (rapidement)."]}, {"cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": []}, {"cell_type": "markdown", "metadata": {}, "source": ["## R\u00e9ponses"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### Q1 : calculer la valeur du $\\chi_2$ d'un tableau de contingence\n", "\n", "La formule est [l\u00e0](https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test#Testing_for_statistical_independence). Et il faut le faire sans boucle. Vous pouvez comparer avec la fonction [chisquare](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chisquare.html) de la librairie [scipy](https://www.scipy.org/) qui est une extension de [numpy](https://numpy.org/).\n", "\n", "$$\\chi_2 = N \\sum_{i,j} p_{i.} p_{.j} \\left( \\frac{\\frac{O_{ij}}{N} - p_{i.} p_{.j}}{p_{i.} p_{.j}}\\right)^2$$"]}, {"cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[15., 20., 13.],\n", " [ 4., 9., 5.]])"]}, "execution_count": 27, "metadata": {}, "output_type": "execute_result"}], "source": ["import numpy\n", "O = numpy.array([[15., 20., 13.], [4., 9., 5.]])\n", "O"]}, {"cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.5798254016266716"]}, "execution_count": 28, "metadata": {}, "output_type": "execute_result"}], "source": ["def chi_square(O):\n", " N = numpy.sum(O)\n", " pis = numpy.sum(O, axis=1, keepdims=True) / N\n", " pjs = numpy.sum(O, axis=0, keepdims=True) / N\n", " pispjs = pis @ pjs\n", " chi = pispjs * ((O / N - pispjs) / pispjs) ** 2\n", " return numpy.sum(chi) * N\n", "\n", "chi_square(O)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### Q2 : calculer une distribution un peu particuli\u00e8re\n", "\n", "La fonction [histogram](https://numpy.org/doc/stable/reference/generated/numpy.histogram.html) permet de calculer la distribution empirique de variables. Pour cette question, on tire un vecteur al\u00e9atoire de taille 10 avec la fonction [rand](https://numpy.org/doc/stable/reference/random/generated/numpy.random.rand.html), on les trie par ordre croissant, on recommence plein de fois, on calcule la distribution du plus grand nombre, du second plus grand nombre, ..., du plus petit nombre."]}, {"cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([0.98556467, 0.47377301, 0.77148185, 0.26135908, 0.27373018,\n", " 0.0240458 , 0.55360714, 0.3575369 , 0.71740732, 0.3260206 ])"]}, "execution_count": 29, "metadata": {}, "output_type": "execute_result"}], "source": ["rnd = numpy.random.rand(10)\n", "rnd"]}, {"cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([0.0240458 , 0.26135908, 0.27373018, 0.3260206 , 0.3575369 ,\n", " 0.47377301, 0.55360714, 0.71740732, 0.77148185, 0.98556467])"]}, "execution_count": 30, "metadata": {}, "output_type": "execute_result"}], "source": ["numpy.sort(rnd)"]}, {"cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.876020129318981"]}, "execution_count": 31, "metadata": {}, "output_type": "execute_result"}], "source": ["def tirage(n):\n", " rnd = numpy.random.rand(n)\n", " trie = numpy.sort(rnd)\n", " return trie[-1]\n", "\n", "tirage(10)"]}, {"cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([0.99594032, 0.67914189, 0.98105965, 0.93181536, 0.86827764])"]}, "execution_count": 32, "metadata": {}, "output_type": "execute_result"}], "source": ["def plusieurs_tirages(N, n):\n", " rnd = numpy.random.rand(N, n)\n", " return numpy.max(rnd, axis=1)\n", "\n", "plusieurs_tirages(5, 10)"]}, {"cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [{"data": {"text/plain": ["(array([ 5, 8, 20, 35, 111, 221, 407, 785, 1273, 2135],\n", " dtype=int64),\n", " array([0.437878 , 0.49408914, 0.55030028, 0.60651142, 0.66272256,\n", " 0.7189337 , 0.77514485, 0.83135599, 0.88756713, 0.94377827,\n", " 0.99998941]))"]}, "execution_count": 33, "metadata": {}, "output_type": "execute_result"}], "source": ["t = plusieurs_tirages(5000, 10)\n", "hist = numpy.histogram(t)\n", "hist"]}, {"cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [{"data": {"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD4CAYAAAD8Zh1EAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAAsTAAALEwEAmpwYAAAeKElEQVR4nO3deXxU9b3/8dcnk42EJSwhQBYCCgoiCIRNW7V1uagtet0RFSzVtrfW3lvbX+2j/dnWPu5t673VW1t769qCirjc6oO29KeAa8sum0AgLCYkLCFAQoCQkGS+vz8y6IgJmZCZOZmZ9/Px4JGZOcc57y9J3h6+58w55pxDRERiX5LXAUREJDxU6CIicUKFLiISJ1ToIiJxQoUuIhInkr3acL9+/VxhYaFXmxcRiUkffPDBAedcdmvLPCv0wsJCVq9e7dXmRURikpmVtbVMUy4iInFChS4iEidU6CIicUKFLiISJ1ToIiJxQoUuIhInVOgiInFChS4iEiXOOf79r5vZvKc2Iu+vQhcRiZK/bz/AU+9/xJZ9KnQRkZg2Z2kZfTNTuWb0wIi8vwpdRCQKyg/VsWRLJdMnFpCW7IvINlToIiJR8PyKMpLMuG1SQcS2oUIXEYmw+sZmXlpVzpUjcxiU1S1i21Ghi4hE2IJ1e6ipa2TmhYUR3Y4KXUQkgpxz/HFpKefk9GDSkD4R3ZYKXUQkgtbsqmbz3lruvHAwZhbRbanQRUQiaM7SMnqkJ3PdBbkR35YKXUQkQvbX1rPww73cND6fzLTI3yAupEI3s6lmttXMtpvZA6dZ7wYzc2ZWFL6IIiKxad7KXTT5HXdOGRyV7bVb6GbmAx4HrgJGAtPNbGQr6/UAvg2sCHdIEZFYc6LJzwsrdnHpOdkU9suMyjZD2UOfCGx3zu10zp0A5gPXtrLez4BfAvVhzCciEpPe2LSPqiMNzJxSGLVthlLouUB50POKwGsfM7NxQL5z7q+neyMzu8fMVpvZ6qqqqg6HFRGJFXOWllLQJ4NLhmdHbZudPihqZknAI8D97a3rnHvSOVfknCvKzo7eIEVEomnj7sOsLqvmzimDSUqK7KmKwUIp9N1AftDzvMBrJ/UARgHvmFkpMBlYoAOjIpKonltWRrcUHzeNz29/5TAKpdBXAcPMbIiZpQK3AgtOLnTOHXbO9XPOFTrnCoHlwDTn3OqIJBYR6cJq6k7w+rrdXDc2l14ZKVHddruF7pxrAu4F3gCKgZedc5vM7CEzmxbpgCIiseTl1eU0NPmjdqpisJDOdHfOLQQWnvLag22se2nnY4mIxJ5mv2PusjImDunDiIE9o759fVJURCRM3t6yn4rq48yK8FUV26JCFxEJkznLShnQM50rRuZ4sn0VuohIGOyoOsr72w4wY1IBKT5vqlWFLiISBs8tKyPFZ9w6MXK3mGuPCl1EpJOONjTx6gcVXHP+QLJ7pHmWQ4UuItJJr62p4GhDU8RvMdceFbqISCc455izrIzReb24ID/L0ywqdBGRTli24yDb9x/lzimFEb/FXHtU6CIinfDHpaX0yUzlS6MHeh1FhS4icqYqqutYXFzJrRPySU/xeR1HhS4icqZeWLELgBmTo3/dltao0EVEzkB9YzPzV+7iipE55GZ18zoOoEIXETkjf16/h+q6xqjeYq49KnQRkQ5qOVWxlGH9uzPlrL5ex/mYCl1EpIPWltewcXctd17o/amKwVToIiIdNHdpKT3Skrl+bK7XUT5FhS4i0gFVRxr464d7uWF8HplpId0jKGpU6CIiHfDiyl00NjtPbjHXHhW6iEiIGpv9vLCijIuHZzM0u7vXcT5DhS4iEqI3N1VSWdvAzC64dw4qdBGRkM1ZVkp+n25cek5/r6O0SoUuIhKC4r21rPzoEHdMHowvqeucqhhMhS4iEoK5y0pJT0ni5qJ8r6O0SYUuItKOw3WNvLZ2N9ddkEtWRqrXcdqkQhcRaccrH5RT3+jnji56MPQkFbqIyGn4/Y65y8qYUNib8wb18jrOaanQRURO452S/ew6VMedXeiqim1RoYuInMacpWX075HG1FEDvI7SLhW6iEgbPjpwjHdLqpgxaTApvq5fl10/oYiIR55bVkaKz5g+qeueqhhMhS4i0opjDU28srqcq0YNpH+PdK/jhESFLiLSitfW7uZIQxMzL+zapyoGU6GLiJzCOcfcZaWMyu3JuILeXscJmQpdROQUy3ceoqTyKHdO6Vq3mGuPCl1E5BRzl5WSlZHCtDGDvI7SISp0EZEge2qO8+bmSm6ZkE96is/rOB2iQhcRCfLCijKcc9w+KXYOhp4UUqGb2VQz22pm283sgVaWf93MPjSzdWb2dzMbGf6oIiKRVd/YzIsry7lsRA75fTK8jtNh7Ra6mfmAx4GrgJHA9FYKe55z7nzn3AXAw8Aj4Q4qIhJpCz/cy6FjJ5gZA9dtaU0oe+gTge3OuZ3OuRPAfODa4BWcc7VBTzMBF76IIiLRMWdpKUOzM7no7L5eRzkjoRR6LlAe9Lwi8NqnmNk3zWwHLXvo97X2RmZ2j5mtNrPVVVVVZ5JXRCQi1pXXsL7iMDNj7FTFYGE7KOqce9w5dxbwfeBHbazzpHOuyDlXlJ2dHa5Ni4h02tylpXRPS+aG8XleRzljoRT6biD4yjR5gdfaMh+4rhOZRESi6sDRBv6yYS83jMule1qy13HOWCiFvgoYZmZDzCwVuBVYELyCmQ0LenoNsC18EUVEIuulVeWcaPZzR4weDD2p3f8VOeeazOxe4A3ABzzrnNtkZg8Bq51zC4B7zexyoBGoBmZGMrSISLg0Nft5fnkZnzu7H2f37+51nE4J6d8WzrmFwMJTXnsw6PG3w5xLRCQqFm2uZO/hen467Tyvo3SaPikqIgltzrJScrO6cdmIHK+jdJoKXUQS1tZ9R1i+8xB3TBmMLyk2T1UMpkIXkYQ1Z1kpaclJ3FIUG7eYa48KXUQS0uHjjby2ZjfTxgyid2aq13HCQoUuIgnp1Q8qON7YzMwLC72OEjYqdBFJOH6/47llpYwf3JtRub28jhM2KnQRSTjvbqui9GAdd06JvWuen44KXUQSztylpfTrnsZVowZ6HSWsVOgiklBKDxzjnZIqbptUQGpyfFVgfI1GRKQdzy8vw2fGjEkFXkcJOxW6iCSMuhNNvLy6nKmjBpDTM93rOGGnQheRhPH62j3U1jfF1amKwVToIpIQnHPMXVbKiIE9KRrc2+s4EaFCF5GEsPKjQ2zZd4RZFw6O2VvMtUeFLiIJYc6yUnp1S2HamM/cEjluqNBFJO4t3lzJwg/3MWNSAd1SfV7HiRgVuojEtd01x7n/lfWMHNiT+y4b1v5/EMNU6CIStxqb/Xxr3hqamv08PmMc6Snxu3cOId6CTkQkFv3Xm1tZs6uGx6aPZUi/TK/jRJz20EUkLr29ZT9PvLuT6RMLmDZmkNdxokKFLiJxZ+/h43zn5XWcO6AHP/7ySK/jRI0KXUTiSlOzn/teXEtDU2LMmwfTHLqIxJVHFpWwqrSa/77lAs7K7u51nKjSHrqIxI13S6r43Ts7uKUon+vGxu8HiNqiQheRuFBZW893XlrHOTk9+Mm087yO4wkVuojEvJPz5nUnmnl8xti4/jTo6WgOXURi3mNLtrHio0P86qYxnN2/h9dxPKM9dBGJaX/fdoDfvL2dG8fnccP4PK/jeEqFLiIxa39tPf/60lrOzu7OQ9cm5rx5ME25iEhMavY7vj1/HUcbmph392QyUlVn+hsQkZj0m7e2sWznQR6+YTTDcxJ33jyYplxEJOYs3XGAXy/ZxvVjc7mpKLHnzYOp0EUkplQdaeDb89cxtF8mP7tuVNzeTu5MaMpFRGKG3+/4zsvrqD3eyHOzJ5KZpgoLpr8NEYkZv3tnO+9vO8DPrz+fcwf09DpOl6MpFxGJCct3HuSRRSVMGzOIWyfkex2nS1Khi0iXd/BoA9+ev5bBfTP5j+vP17x5G0IqdDObamZbzWy7mT3QyvLvmNlmM9tgZkvMbHD4o4pIIvL7Hf/28nqq6xr57W1j6a558za1W+hm5gMeB64CRgLTzezUW4CsBYqcc6OBV4GHwx1URBLT79/bwXslVTz4pZGcN6iX13G6tFD20CcC251zO51zJ4D5wLXBKzjn3nbO1QWeLgd0YqiIdNqq0kP86s0Srhk9kBmTCryO0+WFUui5QHnQ84rAa22ZDfytM6FERKqPneC+F9eS17sbv9C8eUjCOhllZrcDRcAlbSy/B7gHoKBA/7cVkdb5/Y77X1nPwaMn+NO/XEiP9BSvI8WEUPbQdwPB5wjlBV77FDO7HPghMM0519DaGznnnnTOFTnnirKzs88kr4gkgKfe38lbW/bzoy+NYFSu5s1DFUqhrwKGmdkQM0sFbgUWBK9gZmOBJ2gp8/3hjykiieKDsmoefmMrV58/gDsm64S5jmi30J1zTcC9wBtAMfCyc26TmT1kZtMCq/0n0B14xczWmdmCNt5ORKRNNXUt8+aDstL5xQ2jNW/eQSHNoTvnFgILT3ntwaDHl4c5l4gkGOcc331lA/uP1PO/37iQnpo37zB9UlREuoRn/v4Ri4sr+cFVIxidl+V1nJikQhcRz60rr+GX/28LV47M4a6LCr2OE7NU6CLiqcN1jXzzhTXk9EznP28co3nzTtBFEUTEM845vvfqeipr63nl61PolaF5887QHrqIeOaPS0t5c3MlD1x1LmMLensdJ+ap0EXEExsqaviPhcVcPqI/sz83xOs4cUGFLiJRV1vfyL3z1pLdPY3/uknz5uGiOXQRiSrnHA/87wb21Bznpa9NISsj1etIcUN76CISVc8vL2Phh/v43j+dw/jBmjcPJxW6iETNxt2H+dlfivnCOdnc/fmhXseJOyp0EYmKI/WNfHPeGvpkpvKrmy8gKUnz5uGmOXQRiTjnHD/404dUVB9n/j2T6ZOpefNI0B66iETcvJW7+MuGvdx/5XAmFPbxOk7cUqGLSERt3lPLT/+8mUuGZ/P1i8/yOk5cU6GLSMRUVNdx99zV9M5I4ZGbx2jePMJU6CISEZW19cx4egVH6ht5ZuYE+nZP8zpS3NNBUREJuwNHG7jtqeUcPHqC52ZP1H1Bo0R76CISVjV1J7j96RXsrjnOs7Mm6KJbUaQ9dBEJm9r6Ru54ZiU7Dxzj2ZkTmDhEZ7REk/bQRSQsjjU0cdcfVrFlXy2/v30cnxvWz+tICUd76CLSafWNzcyes4p15TU8fttYvnhujteREpL20EWkUxqamrnnuQ9Y8dEhHrl5DFNHDfQ6UsJSoYvIGWts9nPvvLW8V1LFL68fzbUX5HodKaGp0EXkjDQ1+/nXl9axaHMlD117HjdPyPc6UsJToYtIh/n9jv/z6gb+umEvP7x6BHdOKfQ6kqBCF5EOcs7xw9c38qe1u7n/iuHcfbGua95VqNBFJGTOOR76y2ZeXLmLb37hLL512TCvI0kQFbqIhMQ5x8NvbOUP/yjlKxcN4btXnuN1JDmFCl1EQvLYku38zzs7mDGpgP/7pRGY6cqJXY0KXUTa9cS7O3h0cQk3js/jZ9eOUpl3USp0ETmtOUtL+fnftvDlMYP45Q2jdU3zLkyFLiJtmr9yFz9esIkrR+bwyM1j8KnMuzQVuoi06rW1FfzgtQ+5ZHg2v7ltLCk+1UVXp++QiHzGwg/3cv/L65kytC9P3DGetGSf15EkBCp0EfmUJcWV3PfiWsYV9ObpmUWkp6jMY4UKXUQ+9v62Kr7x/BrOG9STP9w1gYxUXWE7lqjQRQSA5TsPcvfc1ZzVvztzvjKRHukpXkeSDlKhiwgflFUz+4+ryOudwfOzJ5KVkep1JDkDIRW6mU01s61mtt3MHmhl+cVmtsbMmszsxvDHFJFI2bj7MLP+sJLsHmnM++ok+nZP8zqSnKF2C93MfMDjwFXASGC6mY08ZbVdwCxgXrgDikjkbN13hNufWUHP9BReuHsy/Xumex1JOiGUIx4Tge3OuZ0AZjYfuBbYfHIF51xpYJk/AhlFJAJ2VB1lxtPLSU/28eLdk8nN6uZ1JOmkUKZccoHyoOcVgdc6zMzuMbPVZra6qqrqTN5CRMJg18E6Zjy1AoAX7p5EQd8MjxNJOET1oKhz7knnXJFzrig7OzuamxaRgN01x5n+1HLqm5p5/quTOCu7u9eRJExCKfTdQPDNAvMCr4lIjKmsrWfGU8uprW/k+dmTOHdAT68jSRiFUuirgGFmNsTMUoFbgQWRjSUi4XbwaAMznl5B1ZEG5nxlIqNye3kdScKs3UJ3zjUB9wJvAMXAy865TWb2kJlNAzCzCWZWAdwEPGFmmyIZWkQ6pqbuBLc/s5KK6jqenTWBcQW9vY4kERDS53qdcwuBhae89mDQ41W0TMWISBdTW9/IzGdXsmP/UZ6ZVcSkoX29jiQRok+KisSxYw1NfOUPq9i0p5bfzRjH54fpZIR4pkIXiVP1jc3cPXc1a3ZV89j0sVw+MsfrSBJhupSaSJxp9jteX7ubx97axq5DdTx68wVcff5Ar2NJFKjQReKE3+/4y4d7+e/FJeysOsZ5g3oy566JXDxc0yyJQoUuEuP8fscbm/bx6OISSiqPck5OD35/+zj+6bwBmOkeoIlEhS4So5xzLC7ez6OLSti8t5azsjP5zfSxXHP+QJJ0M+eEpEIXiTHOOd4pqeLRRSVsqDhMYd8MHr1lDNPG5OJTkSc0FbpIjHDOsXTHQX715lbW7Kohr3c3Hr5hNNePyyXZpxPWRIUuEhNW7DzIrxaVsPKjQwzslc6///MobhqfT2qyilw+oUIX6cI+KKvm0UUl/H37Afr3SOOn087jlgn5pKf4vI4mXZAKXaQL2lBRwyOLSnhnaxX9uqfyo2tGcPvkwSpyOS0VukgXsmnPYR5dtI3FxZVkZaTw/annMvPCwWSk6ldV2qefEpEuoKTyCI8uKuFvG/fRMz2Z+68YzqyLCumRnuJ1NIkhKnQRD+2oOsqvF2/jzxv2kJmazH1fPJvZnx9Kr24qcuk4FbqIB8oOHuPXS7bx+trdpCX7+PolZ3HP54fSOzPV62gSw1ToIlFUUV3Hb5Zs59U1FSQnGbM/N4SvXXIW/bqneR1N4oAKXSQK9h4+zuNvb+elVeUYxh2TB/Mvl55F/57pXkeTOKJCF4mg/bX1/O6dHcxbuQvnHDcX5XPvF89mYK9uXkeTOKRCF4mAg0cb+P27O3hueRmNzY4bx+Vx7xfPJr9PhtfRJI6p0EXC5HBdI++U7GdJ8X4WF1dS39jMdRfkct9lwyjsl+l1PEkAKnSRTig9cIzFxZUsLq5kVWk1zX5H38xUvjx6EHdfPJSz+3f3OqIkEBW6SAc0+x1rdlWzeHNLie+oOgbA8JzufO3ioVw2IocL8rN0GVvxhApdpB1H6ht5f9sBFm+u5O2t+6muayQ5yZg8tC+3Tx7M5SNyNDcuXYIKXaQV5YfqWFJcyZIt+1m+8yCNzY6sjBS+cE5/LhvRn4uHZ9NTH8uXLkaFLkLLfTnXV9SwuLiSJcX72bLvCABDszO566IhXD4ih3EFWbqRhHRpKnRJWHUnmnh/2wGWFFfy1pYqDhxtwJdkFA3uzQ+vHsFlI/ozNFsHNSV2qNAloew9fJwlxftZUlzJP3Yc5ESTnx7pyVwyPJsrRuZwyfBssjJ0PRWJTSp0iWvOOTburm2ZStlSycbdtQAU9Mng9kmDuXxEfyYM6UOKplIkDqjQJe7UNzazdMcBFhfv563i/eyrrccMxhf05vtTz+XyEf05u393zHRqocQXFbrEvNr6RjZWHGZdRQ1ryqr5x/aDHG9sJjPVx8XDs7lsRA5fOCebvrqiocQ5FbrElPrGZor31rK+vIYNgRLfGfhwD0Bh3wxuKsrjshE5TB7ah7Rk3YNTEocKXbqsZr9j+/6jrC+vYX1Fy58te4/Q5HcAZPdIY0xeFtePzWV0Xhaj83rpgKYkNBW6dAnOOcoPHWd9RQ0bKmpYX36YjXsOU3eiGYAe6cmMzuvFPRcPZXReFmPyezGgZ7rmwUWCqNDFE1VHGlqKu+JwYPqkhuq6RgBSk5M4b1BPbi7KZ0x+L0bnZTGkbyZJuj6KyGmp0CXijtQ38uHuw2z4uLwPs7vmOABJBsNzenDlyAGMzu/FmLwszhnQQ6cRipwBFbqEVUNTM8V7j7ChooZ1gfLeUXUU1zLtTUGfDMYWZHHXRYWMzstiVG5PMlL1YygSDvpNko8552ho8nP8RDN1jc0cPxH409hM3YmmoMfN1Ae+Hg+sd7ShiZLKIxTvraWxuaW9+3VPY0xeL6aNGcTovJapkz66q71IxIRU6GY2Ffg14AOeds794pTlacBcYDxwELjFOVca3qiJrbHZT0OTn/rGZhqa/DQ0NlPf6Keh6ZNS/UzZBhXuJ683tb1+Y/PHe9KhSvEZ3VJ8dEv1MbRfd2Z/bihj8noxJj+Lgb100FIkmtotdDPzAY8DVwAVwCozW+Cc2xy02myg2jl3tpndCvwSuCUSgSPBOYffgd85/M7hAo+b/S2vt7W8qdl9UrBNnxTsp74Glre6rKn5MyX9qedBj5v9HWzagPSUJLql+MhITSY9JYmM1GS6pfjIykhlYIqPjFQf6ak+MgKl3C3VF1jfR7fAuhmpPtJPvhb033RL8WmuW6QLCWUPfSKw3Tm3E8DM5gPXAsGFfi3wk8DjV4Hfmpk519H9vfa9vKqcJ97b8XGpnixaF1S4ftdyOdRQl0dLis9IS/aRlpxEekrL17SUk8+TyMpI/dSyT9ZJIj3Z1/L11GXJLWXbLfWTwj35OD3ZpzNDRBJIKIWeC5QHPa8AJrW1jnOuycwOA32BA8Ermdk9wD0ABQUFZxS4d2Yq5w7oSVKSkWSQZIYFvn7yvOWxL8lOuzzJ7FPvk2QEln36vU++z6nLfUn2mYJNT2kp3pai/eRrqi9J19IWkYiK6kFR59yTwJMARUVFZ7RvfMXIHK4YmRPWXCIi8SCUXcbdQH7Q87zAa62uY2bJQC9aDo6KiEiUhFLoq4BhZjbEzFKBW4EFp6yzAJgZeHwj8FYk5s9FRKRt7U65BObE7wXeoOW0xWedc5vM7CFgtXNuAfAM8JyZbQcO0VL6IiISRSHNoTvnFgILT3ntwaDH9cBN4Y0mIiIdodMuRETihApdRCROqNBFROKECl1EJE6YV2cXmlkVUBamt+vHKZ9KjXMab3xLtPFC4o25M+Md7JzLbm2BZ4UeTma22jlX5HWOaNF441uijRcSb8yRGq+mXERE4oQKXUQkTsRLoT/pdYAo03jjW6KNFxJvzBEZb1zMoYuISPzsoYuIJDwVuohInIipQjezqWa21cy2m9kDrSyfZWZVZrYu8OerXuQMl/bGG1jnZjPbbGabzGxetDOGUwjf30eDvrclZlbjQcywCWG8BWb2tpmtNbMNZna1FznDJYTxDjazJYGxvmNmeV7kDBcze9bM9pvZxjaWm5k9Fvj72GBm4zq9UedcTPyh5dK9O4ChQCqwHhh5yjqzgN96nTWK4x0GrAV6B5739zp3JMd7yvrfouVSzp5nj+D390ngG4HHI4FSr3NHeLyvADMDj78IPOd17k6O+WJgHLCxjeVXA38DDJgMrOjsNmNpD/3jm1U7504AJ29WHa9CGe/dwOPOuWoA59z+KGcMp45+f6cDL0YlWWSEMl4H9Aw87gXsiWK+cAtlvCOBtwKP325leUxxzr1Hy/0h2nItMNe1WA5kmdnAzmwzlgq9tZtV57ay3g2Bf768amb5rSyPFaGMdzgw3Mz+YWbLzWxq1NKFX6jfX8xsMDCET375Y1Eo4/0JcLuZVdByP4JvRSdaRIQy3vXA9YHH/wz0MLO+UcjmlZB/5kMVS4Ueij8Dhc650cAiYI7HeSItmZZpl0tp2WN9ysyyvAwUJbcCrzrnmr0OEmHTgT865/Jo+ef5c2YWb7+zwb4LXGJma4FLaLlXcbx/j8Mqln442r1ZtXPuoHOuIfD0aWB8lLJFQig3564AFjjnGp1zHwEltBR8LAplvCfdSmxPt0Bo450NvAzgnFsGpNNyUadYFMrv7x7n3PXOubHADwOv1UQtYfR15Gc+JLFU6O3erPqU+adpQHEU84VbKDfnfp2WvXPMrB8tUzA7o5gxnEIZL2Z2LtAbWBblfOEWynh3AZcBmNkIWgq9KqopwyeU399+Qf8C+QHwbJQzRtsC4M7A2S6TgcPOub2decOQ7inaFbjQblZ9n5lNA5poORgxy7PAnRTieN8ArjSzzbT80/R7zrmD3qU+cyGOF1qKYL4LnCYQq0Ic7/20TKP9Gy0HSGfF6rhDHO+lwM/NzAHvAd/0LHAYmNmLtIypX+A4yI+BFADn3O9pOS5yNbAdqAPu6vQ2Y/TnQ0REThFLUy4iInIaKnQRkTihQhcRiRMqdBGROKFCFxGJEyp0EZE4oUIXEYkT/x8Q4v78wXRVyAAAAABJRU5ErkJggg==\n", "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["import matplotlib.pyplot as plt\n", "plt.plot(hist[1][1:], hist[0] / hist[0].sum());"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### Q3 : on veut cr\u00e9er une matrice identit\u00e9 un million par un million\n", "\n", "Vous pouvez essayer sans r\u00e9fl\u00e9chir ou lire cette page d'abord : [csr_matrix](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html)."]}, {"cell_type": "markdown", "metadata": {}, "source": ["$(10^6)^2=10^{12}$>10 Go, bref \u00e7a ne tient pas en m\u00e9moire sauf si on a une grosse machine. Les matrices creuses (ou sparses en anglais), sont ad\u00e9quates pour repr\u00e9senter des matrices dont la grande majorit\u00e9 des coefficients sont nuls car ceux-ci ne sont pas stock\u00e9s. Concr\u00e8tement, la matrice enregistre uniquement les coordonn\u00e9es des coefficients et les valeurs non nuls."]}, {"cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["c:\\python395_x64\\lib\\site-packages\\scipy\\sparse\\_index.py:125: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.\n", " self._set_arrayXarray(i, j, x)\n"]}], "source": ["import numpy\n", "from scipy.sparse import csr_matrix\n", "ide = csr_matrix((1000000, 1000000), dtype=numpy.float64)\n", "ide.setdiag(1.)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### Q4 : vous devez cr\u00e9er l'application StopCovid\n", "\n", "Il existe une machine qui re\u00e7oit la position de 3 millions de t\u00e9l\u00e9phones portable. On veut identifier les cas contacts (rapidement)."]}, {"cell_type": "markdown", "metadata": {}, "source": ["Si on devait calculer toutes les paires de distance, cela prendrait un temps fou. Il faut ruser. Le plus simple est de construire une grille sur le territoire fran\u00e7ais puis d'associer \u00e0 chaque t\u00e9l\u00e9phone portable la grille dans laquelle il se trouve. Dans une cellule de la grille, le nombre de paires est beaucoup plus r\u00e9duit. Ce n'est pas la seule astuce qu'il faudra utiliser. Mais c'est un bon d\u00e9but."]}, {"cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": []}], "metadata": {"kernelspec": {"display_name": "Python 3", "language": "python", "name": "python3"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5"}}, "nbformat": 4, "nbformat_minor": 2}