XD blog

blog page

excel, pivot, python, tableformula


2013-03-04 Easy Pivot

I realize that wherever I go, I always do the same operation with figures. Before using any machine learning, the most basic task consists in aggregate figures using a pivot. Excel is already providing a visual tool. No matter what I do, I always find some limitations or I feel stuck because I do not know this function or this other one. Translating a matrix in Excel is still a pain. Plus, depending on the machine, I'm using, I need to switch from English function to French function which is really annoying. 10 years ago, I used to work with Lotus Improve. Basically, it was a cube in n dimension and you could write formulas quite easily. The sheet displayed on the screen was a pivot by itself and we could move a dimension from one axis to the others or choose a value and bring another dimension for the view. The concept was interesting. I never met anything similar except a programming language.

Any scripting language now allows to implement its own cube quite easily. The initial format is a big table containing the extended cube. Each row follows the same format: n coordinates, value or list of values. When I need to create a view of this cube, I need to aggregate using a kind of SQL logic. It is fine to write a GROUP BY query when the number of variable is small but when it is high, I would appreciate to have a tool which would do that for me. The convention I chose is to have meaningful name. Like many convention, the name of a column should indicate the numeric type and the aggregation type you want (sum, avg, len, key, none, max, min).

key_name	sum_a	len_b	avg_c
A	1.0	2.0	3.0
A	1.1	2.1	3.1
B	3.0	4.0	5.0
To aggregate on name, I just write this:
gr = table.groupby_implicit ( lambda v : v ["key_name"] )
I do not have to specify the rest. Usually, I use sum most of the time. I'll probably add others aggregator later such as argmax. I added a module computing some basics statistics, the code is still here: Maybe one day, somebody will find something easy.


<-- -->

Xavier Dupré