XD blog

blog page

~technical


2015-05-19 Continuous integration

I'm using Travis to check that my open source modules work on Linux. I discovered the video Olivier Grisel - Build and test wheel packages on Linux, OSX & Windows - PyCon 2015, which explains how this is done for scikit-learn. AppVeyor offers the same service as Travis but for Windows (it works with Azure). The setup is more complex because the C++ compiler Python needs is still a pain to configure. You can check how it is done for scikit-learn: windows_testing_downloader.ps1, appveyor.yml. A few other interesting scripts: mingw.py.

2015-05-18 Sorting worksheet tabs in Excel

I was recently asked how to sort the worksheet tabs in Excel. My first reflex was to type ordonner les onglets sous Excel (sorting tabs in Excel) into a search engine. I copy here the code found on the Microsoft website: Comment trier les onglets d'un classeur

To use it, copy the code into the VBA editor, which opens with the key combination ALT+F11. It must be pasted into the window associated with the entry ThisWorkbook.


more...

2015-05-17 Notebook to slides

Notebooks can be converted into slides with nbconvert. It produces a slideshow with reveal.js if metadata is added to some cells. The metadata indicates where a new slide or subslide should start.

The catch is that if you have many notebooks to convert, you are unlikely to open each of them to tag every slide and subslide. So I implemented this very basic function add_tag_slide which automatically tags the cells. You can see an example of the outcome here. I also recommend copying reveal.js locally. It works better.
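The idea of tagging cells automatically can be sketched as follows. This is not the code of add_tag_slide; it is a minimal illustration that assumes a notebook in the nbformat 4 layout loaded as a dictionary, and a hypothetical rule (a markdown cell starting with "# " opens a slide, one starting with "## " opens a subslide). The function name tag_slides and the sample notebook are made up for the example.

```python
def tag_slides(notebook):
    """Add slideshow metadata to every cell of a notebook dict (nbformat 4).

    Assumed rule, for illustration only: a markdown cell starting with
    '# ' opens a slide, one starting with '## ' opens a subslide, and
    every other cell continues the current slide ('-').
    """
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # The source is stored either as one string or as a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        is_markdown = cell.get("cell_type") == "markdown"
        if is_markdown and text.startswith("# "):
            slide_type = "slide"
        elif is_markdown and text.startswith("## "):
            slide_type = "subslide"
        else:
            slide_type = "-"
        cell.setdefault("metadata", {})["slideshow"] = {"slide_type": slide_type}
    return notebook


# Tiny in-memory notebook, made up for the example.
example = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title\n"]},
        {"cell_type": "code", "metadata": {}, "source": "print(1)",
         "outputs": [], "execution_count": None},
        {"cell_type": "markdown", "metadata": {}, "source": "## Details"},
    ],
}

tagged = tag_slides(example)
slide_types = [c["metadata"]["slideshow"]["slide_type"] for c in tagged["cells"]]
print(slide_types)
```

A real notebook file can be read with json.load, tagged, written back with json.dump, and then converted with nbconvert and its slides output format.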

2015-05-10 Notebook on Github

GitHub now renders notebooks and does not show the raw text by default: Features ou modèle. It is still a little bit slow, but I now wonder why I spent time converting them.

2015-05-06 Probability puzzles

I discovered the term probability puzzle. It works as the keyword that turns up many probability riddles.

2015-05-05 c3.js

c3.js is a framework based on d3.js. Simple graphs can be obtained with shorter scripts compared to d3.js. If you are curious, you can look for b3.js, a3.js (for 3D graphs). Then you can go on with d4.js...

2015-04-27 IoT, Internet of things and Python

Internet of Things is a way to say that every object can be connected to the others and to you. We will soon be able to control many everyday objects (fridge, table, bed, ...) from anywhere with a smartphone. They will even be able to take simple decisions for you, such as ordering the food missing from your fridge or starting to heat your flat just because your smartphone is heading home. And it will be doable in Python: The WiPy: The Internet of Things Taken to the Next Level.

2015-04-24 Use javascript tools in your notebook

This is one reason why I like notebooks. A notebook is more than a static page: many javascript tools can be added to it. The first example is about visualizing the differences between two files. It looks like this:

It is available with pyensae; more details in A magic command to visualize differences between two files in a notebook.
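As a rough stand-in for readers without the javascript tool, Python's standard difflib module can produce a comparable side-by-side HTML table. This is not the pyensae magic command, only a sketch of the same idea; the helper name diff_table and the two sample texts are made up for the example.

```python
import difflib


def diff_table(text_a, text_b, name_a="a", name_b="b"):
    """Return an HTML table comparing two texts line by line."""
    return difflib.HtmlDiff().make_table(
        text_a.splitlines(),
        text_b.splitlines(),
        fromdesc=name_a,
        todesc=name_b,
    )


# Two small texts which differ on their second line.
html_table = diff_table("x = 1\ny = 2\n", "x = 1\ny = 3\n", "before", "after")
print(html_table[:60])
```

In a notebook, the resulting string can be displayed with IPython.display.HTML. The table comes without any style sheet; HtmlDiff.make_file returns a full page which includes one.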

The second example is about using Scratch from a notebook with Snap!. The code may be slow but the result is quite nice: Scratch dans un notebook.

2015-04-23 Open data and bias

The article 3 Cities Using Open Data in Creative Ways to Solve Problems shows three different ways to play with data and to build interesting information at a city level. Based on that, it becomes easier to improve the life of people living in that city. If this data is available to the people taking decisions, they can take action to fix some of the issues reported on the maps and they can measure the impact after it is fixed. However, if everybody knows this data, people will probably start to change their behaviour and the data will start reflecting that change. The first issue could artificially disappear without being fixed.

That is what the second article, Randomized experimentation, explains. By learning from the data, machine learned models end up proposing better options to people, and both forget that other options are still possible.

2015-04-18 A few modules for Sphinx

Here are a few modules I will probably use when I need them to generate documentation with Sphinx. This is just to avoid searching for them again.

And some other modules to easily build graphs with javascript and python:

2015-04-11 PyPy.js, Python command line inside a WebPage

The conference PyCon 2015 is happening right now. Among the speakers is Ryan Kelly, who implemented PyPy.js, an implementation of the Python interpreter in javascript. That way, it is possible to add a Python command line window to a webpage, access the page elements by using Python syntax... The following video gives some insights about what it can do: Ryan Kelly: PyPy.js: What? How? Why?. And if you prefer something written: PyPy.js: Now faster than CPython.

2015-04-08 Easy website with Javascript

Many websites are based on a similar template today: one long page with many sections to scroll down. You will find examples in this article: 42 top examples of JavaScript. It seems a long way to get something similar on your own. But with a few web searches, it seems reasonable to find javascript tools which can speed up the creation of the website. See The top 5 JavaScript templating engines for a blog, 10 Moteurs de templates pour Javascript et Nodejs, 25 free, scrolling plugins for awesome experiences. The code for the basic example seems quite short.

2015-04-07 Text and machine learning

How can we automatically translate text into another language? You can find some intuitive explanations of how that works in the video by Peter Norvig included in the blog post Being good at programming competitions correlates negatively with being good on the job. It introduces machine learning, bag of words, Statistical Machine Translation. The talk is quite easy to follow and gives insight into how much data these systems require.

2015-04-06 Blog generator

I publish my teaching material as a Python module. I added some tricks to make that happen. Recently, I was wondering how to add some kind of blog posts inside the documentation. As I teach several courses, I did not want to merge all blog posts into a single blog where students would have to filter out which posts are meant for them. So I thought about using a kind of blog generator written in Python on top of Sphinx. I went through the blog post What's the best available static blog/website generator in Python? which gives a short list of them. It is possible to check their popularity by looking at Top Open-Source Static Site Generators.

I wanted to follow the same design and the same pattern for my blog. So I was looking for a tool that generates RST files and not HTML directly. Tinkerer seemed a good choice. Would I have to add the message powered by Tinkerer, as every site using it displays that sentence? I also looked into Pelican and Nikola. I also found a very simple one with a French name: éClaircie.

I finally decided to write some code to process my own blog posts and to insert them in the documentation of an existing Python module. It is not finalized yet but it looks like this: An example of a blog post included in the documentation. This process forces me to dig into the Sphinx extension API (extdev), which I do not fully understand yet. It is quite difficult to find good examples on the web. What I have implemented is available here: pyquickhelper.helpgen.

By implementing my own blog, I cannot have all the features the static generators have (good templating, many languages). I spent most of my time implementing the blog post aggregations (categories, months) and the splitting (no more than 10 blog posts per page). But now, if I want to customize Sphinx a little bit, it is easier.
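The aggregation and splitting steps can be sketched in a few lines. This is not the code of pyquickhelper.helpgen; it is a minimal illustration which assumes each post is a dictionary with hypothetical keys date (ISO format), title and category. The function names and the category values are made up for the example, and the page size is reduced to 2 so that the small sample splits.

```python
from collections import defaultdict


def group_by(posts, key_func):
    """Group posts into a dictionary {key: [posts]}."""
    groups = defaultdict(list)
    for post in posts:
        groups[key_func(post)].append(post)
    return dict(groups)


def paginate(posts, per_page=10):
    """Sort posts from newest to oldest and split them into pages."""
    ordered = sorted(posts, key=lambda p: p["date"], reverse=True)
    return [ordered[i:i + per_page] for i in range(0, len(ordered), per_page)]


# Sample posts; the categories are invented for the illustration.
posts = [
    {"date": "2015-04-06", "title": "Blog generator", "category": "technical"},
    {"date": "2015-05-19", "title": "Continuous integration", "category": "technical"},
    {"date": "2015-05-17", "title": "Notebook to slides", "category": "notebook"},
]

by_month = group_by(posts, lambda p: p["date"][:7])      # key such as '2015-05'
by_category = group_by(posts, lambda p: p["category"])
pages = paginate(posts, per_page=2)

print(sorted(by_month), [len(page) for page in pages])
```

Each group and each page would then be written as one RST file which Sphinx turns into an HTML page.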

2015-03-31 GitHub, but why?

What is GitHub? In technical terms, it is called source control software or version control software. It is used as soon as several people work on the same files. It keeps track of every modification. The Rue89 article says a bit more about it: Qu'est-ce que tous les techos du monde font sur GitHub ?. Today, it is hard to imagine working without it. All my teaching materials are there: github xavier.

Even though the tool was built for developing software, it can track the modifications of any text, including the civil code, as well as images. It works less well, and often not at all, with complex formats, especially proprietary ones.

GitHub is free for all public projects. You have to pay if you do not want to expose your sources to the public. You can also go to the competitor BitBucket, whose pricing is different. If you do not want your sources to be hosted by a third-party company at all, you can install a GitLab server at home. And if you only want to track your modifications locally on your computer, you can install just Git, with TortoiseGit.

If you are brave, you can go as far as looking at continuous integration tools such as Travis CI or GitLab CI.



Xavier Dupré