XD blog


2015-07-17 Short history of scientific computing with Python

This is more the story of the speaker, Wes McKinney, mixed with the story of Python. Wes McKinney founded DataPad, recently acquired by Cloudera. You can find it on this blog: Scientific computing in Python. The story begins in 2007.



2015-07-13 Using Antlr4 to write a grammar

Here are a few tricks I discovered when I implemented my first grammar with Antlr 4. First, it really helps to be able to test parts of your grammar. One option is the plugin IntelliJ Idea Plugin for ANTLR v4 for IntelliJ IDEA (the plugin works with the community edition, see also Installing, Updating and Uninstalling Repository Plugins). You will find many grammar examples at antlr/grammars-v4. The tool tells you whether the grammar compiles and shows where it fails to parse an example, since it displays a graph of the recognized pieces.

source: https://github.com/antlr/intellij-plugin-v4

2015-06-18 Website to find interesting Python modules

2015-06-07 Custom Directive on Sphinx

I recently discovered a nice way to integrate plots in Sphinx documentation with the custom directive bokeh-plot. I thought it would be quite easy to create my own to add a simple blogging system. However, documentation on that topic is scarce. All my searches ended up at Tutorial: Writing a simple extension. So here are my findings about creating a custom directive BlogPostDirective to process something like:

.. blogpost::
    :title: Migration to IPython 3.1
    :keywords: ipython, migration, jupyter, jenkins, pandoc
    :date: 2015-04-16
    :categories: ipython, documentation
    
    Any text this blog post could contain, and any RST tag::
    
        ...
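
To give an idea of what such a directive involves, here is a minimal sketch built on docutils, the library Sphinx uses to parse RST. It is not the code behind this blog: the way the post is rendered (a container holding a bold heading followed by the parsed content) is an assumption made for illustration. Only the directive name and its four options come from the sample above.

```python
from docutils import nodes
from docutils.core import publish_doctree
from docutils.parsers.rst import Directive, directives


class BlogPostDirective(Directive):
    """Sketch of a ``blogpost`` directive: four options plus free RST content."""

    has_content = True
    option_spec = {
        "title": directives.unchanged_required,
        "keywords": directives.unchanged,
        "date": directives.unchanged_required,
        "categories": directives.unchanged,
    }

    def run(self):
        # Illustrative rendering choice: wrap the whole post in a container.
        post = nodes.container(classes=["blogpost"])
        heading = "{} {}".format(
            self.options.get("date", ""), self.options.get("title", "")
        ).strip()
        post += nodes.paragraph("", "", nodes.strong(text=heading))
        # Parse the directive body as regular RST, so any RST tag is allowed.
        self.state.nested_parse(self.content, self.content_offset, post)
        return [post]


# Plain docutils registration; inside a Sphinx extension this would be
# app.add_directive("blogpost", BlogPostDirective) in setup(app).
directives.register_directive("blogpost", BlogPostDirective)

SOURCE = """
.. blogpost::
    :title: Migration to IPython 3.1
    :keywords: ipython, migration, jupyter, jenkins, pandoc
    :date: 2015-04-16
    :categories: ipython, documentation

    Any text this blog post could contain.
"""

doctree = publish_doctree(SOURCE)
print(doctree.pformat())
```

The options are declared in option_spec, and nested_parse is what lets the body contain arbitrary RST. A real blogging system would also have to collect the posts to build an index page, which this sketch leaves out.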


2015-06-03 The Little Prince's hat

The school year is ending and I am still surprised to have made it to the end. I changed all my courses, used notebooks, which I did not know a year ago, and prepared two other courses that almost doubled my teaching hours at ENSAE. I spent hours, often at night, looking for an easy way to launch jobs on clusters from notebooks. I look with some astonishment at the number of downloads (19,000 over the last month) of one of the modules I started a year and a half ago to automate updating my courses, converting notebooks into HTML pages, slides and PDF, and retrieving the emails sent by students.

I had a lot of fun. I was also pleasantly surprised to see students flood the clusters with jobs, throw themselves into machine learning projects with pleasure, show curiosity, learn programming and enjoy it. And say so!

I remember a day when I found myself at La Maison des Contes et des Histoires. A storyteller and children aged three to ten, all won over within twenty minutes. I did not imagine I would experience something similar this year: children aged eight to fourteen, all gathered around a story about data. And I created another site, lesenfantscodaient.fr, to say that stories about algorithms look a lot like that too:

A big thank-you to everyone who made all this possible.

2015-05-24 Install pip and setuptools

pip was recently updated. On Windows, it cannot be updated with pip install -U pip because it has to replace itself. So read the documentation. You can:

Or you can just download the package from PyPI, unzip it and install it with python setup.py install.
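
One commonly documented way around the self-replacement problem is to run pip as a module of the interpreter (python -m pip install -U pip), so that the running program is python and not the pip executable. Below is a minimal sketch of the same call from a script. The function names pip_upgrade_command and upgrade are hypothetical helpers introduced for illustration.

```python
import subprocess
import sys


def pip_upgrade_command(package="pip"):
    """Build the command that upgrades a package through ``python -m pip``."""
    # sys.executable is the interpreter running this script, so the upgrade
    # targets the same Python installation.
    return [sys.executable, "-m", "pip", "install", "--upgrade", package]


def upgrade(package="pip"):
    """Run the upgrade; raises CalledProcessError if pip reports a failure."""
    subprocess.check_call(pip_upgrade_command(package))
```

Calling upgrade() updates pip itself, and upgrade("setuptools") does the same for setuptools.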

2015-05-23 Jenkins, TortoiseGit and locked files on Windows

Some errors are very annoying when they come back again and again. Here is one of them: TortoiseGit locks repository folders so that the user cannot delete them, which I thought was caused by Jenkins. It is annoying enough to have been mentioned in a couple of issues: 401, 497, 1880. I could avoid using TortoiseGit, but it is the only git GUI whose usage I do not have to remember.

So let's tweak TortoiseGit's settings, even if some recent changes seem to fix it:

TGitCache now checks file sizes before checking file contents. 
This should mitigate possible "file is locked" problems.
TGitCache now does not check the contents of files with filesize > 10 MiB any more
and falls back to checking the timestamp of the files (as if TGitCacheCheckContent
is set to "false") according to the git index. This limit can be changed by adjusting
TGitCacheCheckContentMaxSize (measured in KiB) in TortoiseGit advanced settings.
The reason for this change is that libgit2 reads a file to memory for hashing and,
thus, locking the file and the repository for this time span.

A couple of tricks:

2015-05-21 Github Awards

I discovered the website Github Awards, which ranks people implementing or contributing to open source projects. You can even drill down by language, city... And you find some popular projects you did not even know about: ipycache or TimeSide (audio processing) or Facebook Python SDK.

2015-05-19 Continuous integration

I'm using Travis to check that my open source modules work on Linux. I discovered the video Olivier Grisel - Build and test wheel packages on Linux, OSX & Windows - PyCon 2015, which explains how this is done for scikit-learn. AppVeyor offers the same service as Travis but for Windows (it works with Azure). The setup is more complex because configuring the C++ compiler for Python is still a pain. You can check how it is done for scikit-learn: windows_testing_downloader.ps1, appveyor.yml. A few other interesting scripts: mingw.py.

2015-05-17 Notebook to slides

Notebooks can be converted into slides with nbconvert. It produces a slideshow with reveal.js if metadata are added to some cells. The metadata indicate when a new slide or subslide should be started.

The catch is that if you have many notebooks to convert, you are unlikely to open each of them to tag every slide and subslide. So I implemented a very basic function, add_tag_slide, which automatically tags the cells. You can see an example of the outcome here. I also recommend copying reveal.js locally. It works better.
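
The idea can be sketched with nothing more than the notebook's JSON structure. This is not the add_tag_slide function mentioned above. The tagging rule is an assumption chosen for illustration: a markdown cell starting with "# " opens a slide, one starting with "## " opens a subslide, and every other cell continues the current slide. The metadata key slideshow.slide_type is the one nbconvert reads.

```python
import copy


def add_slide_tags(notebook):
    """Return a copy of a notebook (nbformat 4 dict) with slide metadata added."""
    tagged = copy.deepcopy(notebook)
    for cell in tagged.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat stores text as str or list of str
            source = "".join(source)
        first_line = source.lstrip().split("\n", 1)[0]
        is_markdown = cell.get("cell_type") == "markdown"
        if is_markdown and first_line.startswith("# "):
            slide_type = "slide"
        elif is_markdown and first_line.startswith("## "):
            slide_type = "subslide"
        else:
            slide_type = "-"  # "-" means: stay on the current slide
        slideshow = cell.setdefault("metadata", {}).setdefault("slideshow", {})
        # setdefault keeps any slide type that was already set by hand.
        slideshow.setdefault("slide_type", slide_type)
    return tagged


notebook = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Title"},
        {"cell_type": "code", "metadata": {}, "source": "print(1)",
         "outputs": [], "execution_count": None},
        {"cell_type": "markdown", "metadata": {}, "source": ["## Details\n", "text"]},
    ],
}

tagged = add_slide_tags(notebook)
print([c["metadata"]["slideshow"]["slide_type"] for c in tagged["cells"]])
# prints ['slide', '-', 'subslide']
```

To process a real file, load it with json.load, apply the function, write the result back with json.dump, then run jupyter nbconvert --to slides on it.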

2015-05-10 Notebook on Github

Github now renders notebooks and no longer shows the raw text by default: Features ou modèle. It is still a little bit slow, but why did I spend time converting them? I wonder.

2015-04-27 IoT, Internet of things and Python

Internet of Things is a way to say that every object can be connected to the others and to you. We will soon be able to command many everyday objects (fridge, table, bed, ...) from anywhere with a smartphone. They will even be able to make simple decisions for you, such as ordering food that is missing from your fridge or starting to warm your flat just because your smartphone is heading home. And it will be doable in Python: The WiPy: The Internet of Things Taken to the Next Level.

2015-04-24 Use javascript tools in your notebook

This is one reason why I like notebooks: they are more than static pages. Many javascript tools can be added to the page. The first example is about visualizing the differences between two files. It looks like this:

It is available with pyensae; more details in A magic command to visualize differences between two files in a notebook.
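
For a rough idea of the result without any javascript, the standard library can produce a comparable side-by-side view. This is a different technique from the pyensae magic command: the sketch below relies on difflib.HtmlDiff, and diff_table is a hypothetical helper name.

```python
import difflib


def diff_table(text_a, text_b, name_a="before", name_b="after"):
    """Return an HTML table showing the line differences between two texts."""
    lines_a = text_a.splitlines()
    lines_b = text_b.splitlines()
    # make_table returns only the <table> element; make_file would return
    # a complete HTML page including the CSS used for highlighting.
    return difflib.HtmlDiff().make_table(lines_a, lines_b, name_a, name_b)


html = diff_table("a = 1\nb = 2\n", "a = 1\nb = 3\n")
```

In a notebook, the string can be displayed with IPython.display.HTML(html).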

The second example is about using Scratch from a notebook with Snap!. The code may be slow but the result is quite nice: Scratch dans un notebook.

2015-04-18 A few modules for Sphinx

Here are a few modules I will probably use when I need them to generate documentation with Sphinx. This is just to avoid searching for them again.

And some other modules to easily build graphs with javascript and python:

2015-04-11 PyPy.js, Python command line inside a WebPage

The conference PyCon 2015 is happening right now. Among the speakers is Ryan Kelly, who implemented PyPy.js, an implementation of the Python interpreter in javascript. That way, it is possible to add a Python command line window to a webpage, access the page elements using Python syntax... The video Ryan Kelly: PyPy.js: What? How? Why? gives some insights about what it can do. And if you prefer something written: PyPy.js: Now faster than CPython.



Xavier Dupré