Pipelines became a quite popular way to represent a workflow. Many tools such as RapidMiner, Orange or Weka use graphs to represent the sequence of processes data follow. Most of the time, between two experiments, we just copy paste a previous one, we add or delete a few nodes. After a while, it becomes uneasy to find out what modifications were made. I was wondering how it would be possible to automate such a task. How to find what modifications were introduced in a graph ?
|Graph 1||Graph 2||Two graphs we would like to merge into a single one. Vertices 00 and 11 represents the only root and the only leave.|
The following paper gives an answer to those questions for graphs without cycles. The outcome is a way to match two graphs altogether and to build an alignment between edges and vertices from both graphs. This method is based on Levenstein's edit distance.
|Graph 1||Graph 2||Merged graph|