XD blog


machine learning, python, scikit-learn

2019-04-01 Determine close leaves in a decision tree

That's a problem I had in mind yesterday. When scikit-learn builds a decision tree, we might want to know which classes share a border with another one, which I translated into: which pairs of leaves of a decision tree share a border? The last split node above two sibling leaves determines which feature separates those two leaves, and their two classes. But what can we say about two leaves that are far away from each other in the tree structure? Do they share a border? We could use the training data to build a kind of Voronoï diagram for the points and group the cells which belong to the same leaf. What if we do not have the training data?

My answer is implemented somewhere on my website. I was looking into this question to imagine a way to build a continuous piecewise linear regression with at least two features... which is impossible, but finding close leaves still seemed a good algorithmic problem.
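To make the question concrete, here is a minimal sketch of one way to find neighbouring leaves without any training data. It is not the implementation mentioned above, only an illustration under stated assumptions: every leaf of a tree with axis-aligned splits covers a box, and two leaves share a border when their boxes meet at a threshold on exactly one feature and overlap on all the others. The function names (`leaf_boxes`, `share_border`, `neighbouring_leaves`) are hypothetical, and the small hand-written tree only mimics the array layout of the `tree_` attribute of a fitted scikit-learn tree (`children_left`, `children_right`, `feature`, `threshold`, with `-1` as the child of a leaf).

```python
import math
from itertools import combinations


def leaf_boxes(children_left, children_right, feature, threshold, n_features):
    """Return {leaf_id: (low, high)}, the box covered by each leaf."""
    boxes = {}
    stack = [(0, [-math.inf] * n_features, [math.inf] * n_features)]
    while stack:
        node, low, high = stack.pop()
        left, right = children_left[node], children_right[node]
        if left == -1:  # assumption: -1 marks a leaf, as in scikit-learn
            boxes[node] = (low, high)
            continue
        f, t = feature[node], threshold[node]
        left_high = list(high)
        left_high[f] = min(high[f], t)  # left child: x[f] <= t
        right_low = list(low)
        right_low[f] = max(low[f], t)   # right child: x[f] > t
        stack.append((left, low, left_high))
        stack.append((right, right_low, high))
    return boxes


def share_border(box_a, box_b):
    """True if the boxes touch on exactly one feature and overlap elsewhere."""
    (low_a, high_a), (low_b, high_b) = box_a, box_b
    touching = 0
    for la, ha, lb, hb in zip(low_a, high_a, low_b, high_b):
        if min(ha, hb) > max(la, lb):
            continue                    # intervals overlap on this feature
        if ha == lb or hb == la:
            touching += 1               # intervals only meet at a threshold
        else:
            return False                # a gap separates the two boxes
    return touching == 1


def neighbouring_leaves(boxes):
    """Sorted list of leaf pairs sharing a border."""
    return sorted((a, b) for a, b in combinations(sorted(boxes), 2)
                  if share_border(boxes[a], boxes[b]))


# Hand-written tree with two features (illustrative values only):
# node 0 splits on x0 <= 0.5, node 1 on x1 <= 0.5, node 4 on x1 <= 0.3.
children_left = [1, 2, -1, -1, 5, -1, -1]
children_right = [4, 3, -1, -1, 6, -1, -1]
feature = [0, 1, -2, -2, 1, -2, -2]
threshold = [0.5, 0.5, -2.0, -2.0, 0.3, -2.0, -2.0]

boxes = leaf_boxes(children_left, children_right, feature, threshold, 2)
print(neighbouring_leaves(boxes))
# [(2, 3), (2, 5), (2, 6), (3, 6), (5, 6)]
```

Leaves 2 and 6 sit in different subtrees and still share a border, while leaves 3 and 5 do not. With a fitted estimator `clf`, the same arrays should be readable from `clf.tree_`. The pairwise comparison is quadratic in the number of leaves, which is fine for a sketch but not for large trees.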


Xavier Dupré