XD blog


2014-11-27 Some really annoying things with Hadoop

When you look for a bug, the failure could be anywhere, so you make assumptions about what is working and what could be wrong. I lost a couple of hours because I made the wrong one. I was preparing my teaching material and I had stored my data using Python serialization instead of JSON. That way, I could just use eval( string_serialized ). It already worked locally and on a first Hadoop cluster, where this instruction was embedded in a Python script that a PIG script called through streaming. I then tried the same instruction and it failed many times before I verified that this line was involved. And why? There was no error message to tell me anything about what happened. The script was just crashing. I suspect the line was just too long, so I'll do that another way. My first assumption was that the schema I used in my Jython script was wrong. I finally chose to solve that issue by considering only strings. Then I commented out line after line until removing this one finally made my job work. That's the part of computer science I don't like: long, full of guesses, impossible to accomplish without an internet connection to dig into the vast amount of recent and outdated examples. And maybe in a couple of months, this issue will be solved by a simple update.
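For reference, here is a minimal sketch of the kind of round trip described above. It is not the script from that night: the record is made up for illustration, and ast.literal_eval is swapped in for eval because it only accepts Python literals. It shows the mechanism, not a fix for the Hadoop crash.

```python
import ast
import json

# Hypothetical record, only for illustration.
record = {"id": 3, "tags": ["a", "b"], "score": 0.5}

# Python serialization: repr() produces a string that can be read back.
text = repr(record)

# ast.literal_eval parses literals (dict, list, str, numbers...) and
# refuses anything else, unlike eval which executes arbitrary code.
restored = ast.literal_eval(text)

# The JSON round trip gives the same result for this kind of data.
restored_json = json.loads(json.dumps(record))

# An expression which is not a literal is rejected with ValueError.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
```

Both round trips give back the original dictionary, so switching to JSON later would not change the rest of a script written this way.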

This night never happened. That's what I'll keep in mind. This night never happened.

2014-11-23 Install cvxopt on Ubuntu

The module cvxopt is not part of the Anaconda distribution. You can try to install it from a terminal:

pip install cvxopt

If it fails, it is probably because Lapack and Blas are not installed. If that is the case, I suggest following the instructions in How to install Blas and Lapack. After it is done, the following instruction should run:

pip install cvxopt

On Windows, I suggest going to Unofficial Windows Binaries for Python Extension Packages.
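To check whether the installation went through, a small sketch like the following can help. The helper name is_installed is mine, not part of any package; it only relies on importlib from the standard library and does not import the module itself.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the module, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report the status of cvxopt on the current machine.
print("cvxopt", "installed" if is_installed("cvxopt") else "missing")
```

Finding the module does not guarantee that its compiled extensions load correctly, so running import cvxopt afterwards remains the real test.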

2014-11-16 Form in notebooks (IPython)

Sometimes I need to type some credentials to access a remote machine from my notebook. I could write them in a cell but then I would need to remove them from the notebook to avoid sharing them by accident. I was using a simple function showing a tkinter window, a pop-up. But this solution only works if the notebook server is local. When it is remote, the pop-up window appears on the remote machine and I cannot see it.

IPython lets JavaScript functions execute Python instructions that modify the workspace. All I had to do was to print a form in the page with some JavaScript. The result can be found here: Having a form in a notebook.
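This is not the JavaScript form from the linked notebook. As a simpler alternative, here is a sketch based on getpass from the standard library. It assumes the notebook kernel forwards input prompts to the browser, which recent Jupyter kernels do as far as I know. The function name ask_credentials and its parameters are hypothetical.

```python
import getpass


def ask_credentials(ask_user=input, ask_password=getpass.getpass):
    """Ask for a login and a password without storing them in a cell.

    The two prompt functions are parameters so that they can be
    replaced, for example in a test or by another kind of form.
    """
    user = ask_user("user: ")
    password = ask_password("password: ")
    return {"user": user, "password": password}


# In a notebook cell (the prompts show up in the browser):
# credentials = ask_credentials()
```

The credentials only live in the workspace, so nothing has to be removed from the notebook before sharing it.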

2014-11-14 Documentation for the module Azure in Python

I did not find any documentation for the module azure-sdk-for-python so I generated it with Sphinx: azure-sdk-for-python (documentation).

2014-11-13 1981: teaching programming at school?

While browsing the Internet, I came across a text by Andrei Ershov: La programmation est un deuxième alphabétisme (Programming, the Second Literacy). It is a 1981 talk on the use of computers in education. It is in Russian, but I copied a French translation produced by a machine translation engine and reworked it a little with the help of the English translation, which is of better quality. I hope I have not distorted the author's words, which I am unable to verify. The text is more than thirty years old but it already expresses the intuition that computers, and programming in particular, will change society considerably, almost as much as writing did. It is important that programming be part of education in order to better prepare children.


2014-11-12 Notebook on iPad

After installing a notebook server on a remote machine (see Remote Notebook with Azure), I wanted to try it from an iPad. It does not work because of a WebSocket error. It does not work any better with Chrome (the solution proposed on this page does not work). There is, however, the Computable app for iPad. It is free to try and paid if you want to create your own notebooks. It uses Python 2.7.

2014-11-11 SQLite in a Notebook with Magic Commands

Whenever I need to use SQLite, most of the time, I don't use Python because I don't do it often enough to remember the syntax. I usually use SQLiteSpy. However, converting a result to a dataframe is impossible unless I copy and paste the results somewhere. So I implemented some magic commands in pyensae, which you can see in the notebook SQL Magic Commands with SQLite in a Notebook.
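Since I keep forgetting the syntax, here is a minimal reminder using only the sqlite3 module from the standard library. It is not the pyensae magic command. The table grades and the helper query_to_records are made up for the example.

```python
import sqlite3


def query_to_records(connection, query):
    """Run a query and return the rows as a list of dictionaries."""
    cursor = connection.execute(query)
    # cursor.description holds one tuple per column, the name comes first.
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# In-memory database with a hypothetical table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE grades (student TEXT, grade REAL)")
connection.executemany(
    "INSERT INTO grades VALUES (?, ?)",
    [("anna", 15.5), ("paul", 12.0)],
)

records = query_to_records(
    connection, "SELECT student, grade FROM grades ORDER BY grade DESC"
)
```

If pandas is installed, pandas.read_sql_query(query, connection) returns a dataframe directly, and pandas.DataFrame(records) does the same from the list above.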

2014-11-10 An article on extreme values in social networks

Predictability of Extreme Events in Social Media

2014-11-09 Remote Notebook with Azure

For my teaching, I installed a notebook server on a virtual machine on Azure. All the students will be able to connect with the same login (the multi-user configuration is part of the roadmap). The students will not have to install the notebooks by themselves. They will be able to see what other students do. Here are the steps I followed.

Step 1: create the virtual machine with Azure

I won't detail that; it is pretty straightforward. Just follow the tutorial Create a Virtual Machine Running Windows. I chose a Windows Server. The number of cores should depend on the number of users. I assume the students are not all going to access the notebook at the same time, except during the lectures. I chose eight cores. I might modify this post in case it is not enough.

Step 2: install Python (latest version - 3.4 today)


Xavier Dupré