XD blog

blog page

alignment, edit distance, graph, levenstein, matching, python

2013-01-30 Graph matching and alignment

Pipelines became a quite popular way to represent a workflow. Many tools such as RapidMiner, Orange or Weka use graphs to represent the sequence of processes data follow. Most of the time, between two experiments, we just copy paste a previous one, we add or delete a few nodes. After a while, it becomes uneasy to find out what modifications were made. I was wondering how it would be possible to automate such a task. How to find what modifications were introduced in a graph ?

Graph 1 Graph 2 Two graphs we would like to merge into a single one. Vertices 00 and 11 represents the only root and the only leave.

The following paper gives an answer to those questions for graphs without cycles. The outcome is a way to match two graphs altogether and to build an alignment between edges and vertices from both graphs. This method is based on Levenstein's edit distance.

Graph 1 Graph 2 Merged graph

You can find an implementation here: graph_distance.py, importme.py. This algorithm is still very slow and could be improved. Maybe for another blog post.

<-- -->

Xavier Dupré