06/03/2017

[Java] BST element distance - path lists

The exercise that stole too much time from me last time was about binary search trees. The description is quite simple: find the distance between two elements in a given tree; that means how many hops do you need to do to navigate to element B from element A; if the elements are the same, distance is 0, if one or both elements are missing, distance is -1.

Actually, the exercise included a bit more than that: given an input array of integers build an unbalanced binary search tree, that is, simply add each element one after the other following the binary insertion logic: if the element you are inserting is smaller than or equal to the current element, put it on the left side, otherwise go on the right side.

Duplicate elements are allowed. Also as input you get two integers that are the elements for which you want to compute the distance between.

Down to coding, building the tree like that is quite straightforward (fun fact: given a single TEST source file, putting the tree structure declaration and all the logic in the same TEST class - which is NOT the actual tree class - and giving the signature for the distance method as static adds a bit of unnecessary complexity in my opinion, mainly for tree generation operations).

As for the distance calculation, since we are given the two elements between which we need to compute the distance alongside the input we use to build the tree, it's possible to save some steps by keeping track of the presence of both elements in the tree while we build it. At the end of the build phase, if we know that at least one element is not in the tree, we can already answer with -1 and skip the calculation logic.
However, in the code presented here I took a more general approach, so the tree is built before we query it for distances, meaning we can either add a contains method and find out if both elements are there or skip that as well and expand the calculation logic a bit more. Just for exercise, I used the contains method.

Lastly, I will propose two versions of the solution. The one shown here assumes we have no parent pointer in our nodes so we can only traverse the tree top-down; last time I got some nasty NullPointerException when sticking to the parent pointer approach that cost me precious time.

Getting to the meaty part, here is the first solution idea: navigate the tree twice and keep track of the path to both elements in separate lists, then walk both lists at the same time to determine the divergence point: if we reach the end of either list before that happens, the distance is simply the difference between the two paths because one element must be the child of the other; if we find a divergence point, the distance is then the sum of the remaining path lengths after it.

This approach uses more space because of the two additional lists compared to only having one BST with parent pointer, but at least the lists cannot be longer than the height of the tree, which in the worst case is O(n) since it's an unbalanced tree.

As for time, worst case is:
  • read through the whole input to create the tree is O(n)
  • find out if both elements are in the tree is O(n)
  • navigate the tree twice to find the paths to the elements is O(n)
  • walk the paths to find the divergence is O(n)
since we are just running these operations sequentially, we get O(n) overall. And as said, we can reduce the generalisation of this approach and compact some functions to avoid repeating unnecessary operations.

You can check the code and a sample JUnit test source on my Gist. The full project was developed with IntelliJ and Gradle using JDK 8.

You can also find the second approach using the lowest common ancestor logic described here and the parent pointer approach here.

No comments:

Post a Comment

With great power comes great responsibility