12

I have created an algorithm whose purpose should be of, given two nodes A and B in a BST, it switches the roles (or positions in the tree) of the two by simply moving pointers. In my representation of a BST, I am using a double linked connection (i.e. A.parent == B and (B.left == A) or (B.right == A)). I am not sure if it's completely correct or not. I have divided the algorithm in two situations.

  1. A and B are directly connected (either A is the parent of B or B the parent of A)

  2. All the other cases

For each of the previous cases I have created a nested function. I would like to have your opinion on the first the correctness of the algorithms and if I can somehow then improve it. Here's the code:

def switch(self, x: BSTNode, y: BSTNode, search_first=False):
    if not x:
        raise ValueError("x cannot be None.")
    if not y:
        raise ValueError("y cannot be None.")
    if x == y:
        raise ValueError("x cannot be equal to y")

    if search_first:
        if not self.search(x.key) or not self.search(y.key):
            raise LookupError("x or y not found.")
    
    def switch_1(p, s):
        """Switches the roles of p and s,
        where p (parent) is the direct parent of s (son)."""
        assert s.parent == p
        
        if s.is_left_child():
            p.left = s.left
            if s.left:
                s.left.parent = p
        
            s.left = p
            
            s.right, p.right = p.right, s.right
            if s.right:
                s.right.parent = s
            if p.right:
                p.right.parent = p
        else:
            p.right = s.right
            if s.right:
                s.right.parent = p
                
            s.right = p

            s.left, p.left = p.left, s.left
            if s.left:
                s.left.parent = s
            if p.left:
                p.left.parent = p
        
        if p.parent:
            if p.is_left_child():
                p.parent.left = s 
            else:
                p.parent.right = s
        else:  # p is the root
            self.root = s
            
        s.parent = p.parent
        p.parent = s

    def switch_2(u, v):
        """u and v are nodes in the tree
        that are not related by a parent-son
        or a grandparent-son relantionships."""
        if not u.parent:
            self.root = v
            if v.is_left_child():
                v.parent.left = u
            else:
                v.parent.right = u
        elif not v.parent:
            self.root = u
            if u.is_left_child():
                u.parent.left = v
            else:
                u.parent.right = v
        else:  # neither u nor v is the root                
            if u.is_left_child():
                if v.is_left_child():                   
                    v.parent.left, u.parent.left = u, v
                else:
                    v.parent.right, u.parent.left = u, v
            else:
                if v.is_left_child():                   
                    v.parent.left, u.parent.right = u, v
                else:
                    v.parent.right, u.parent.right = u, v                    
                
        v.parent, u.parent = u.parent, v.parent
        u.left, v.left = v.left, u.left
        u.right, v.right = v.right, u.right

        if u.left:
            u.left.parent = u
        if u.right:
            u.right.parent = u
        if v.left:
            v.left.parent = v
        if v.right:
            v.right.parent = v
    
    if x.parent == y:
        switch_1(y, x)            
    elif y.parent == x:
        switch_1(x, y)
    else:
        switch_2(x, y)

I really need that switch works in all cases no matter which nodes x or y we choose. I have already done some tests, and it seems to work, but I am still not sure.

EDIT

Eventually, if it's helpful somehow, here you have the complete implementation of my BST (with the tests I am doing):

https://github.com/dossan/ands/blob/master/ands/ds/BST.py

EDIT 2 (just a curiosity)

@Rishav commented:

I do not understand the intention behind this function.. if it is to swap two nodes in the BST, is it not sufficient to swap their data instead of manipulating pointers?

I answered:

Ok, maybe I should have added a little bit more about the reason behind all this "monster" function. I can insert BSTNode objects or any comparable objects in my BST. When the user decides to insert any comparable object, the responsibility of creating the BSTNode is mine, therefore the user has no access to a initial BSTNode reference, unless they search for the key. But a BSTNode would only be returned after the insertion of the key, or there's already another BSTNode object in the tree with the same key (or value), but this latter case is irrelevant.

The user can also insert a BSTNode object in the tree which has an initial (and should remain constant) key (or value). Nevertheless, if I just exchanged the values or keys of the nodes, the user would have a reference to a node with a different key then the key of the node he inserted. Of course, I want to avoid this.

Community
  • 1
  • 1
nbro
  • 15,395
  • 32
  • 113
  • 196
  • What are you trying to do with the line of code ```s.right, p.right = p.right, s.right```? I am not sure if it is actually doing what you think it is, take a look at http://stackoverflow.com/questions/11502268/how-does-pythons-comma-operator-works-during-assignment – kyle Feb 01 '16 at 22:57
  • If I get it correctly you want to exchange two nodes in a binary search tree while preserving all the rest of the tree untouched. Could you explain the reason to make such operation? BST has quite strong ordering property; once you exchange any two, arbitrarily chosen nodes in it, it is *no longer a BST*. As a result you might be unable to do another such swap, because `self.search(x.key)` or `self.search(y.key)` will *fail to find existing nodes*! – CiaPan Feb 01 '16 at 22:57
  • @kyle I am exchanging the values of the right child of `s` and `p`, i.e. the right child of `s` becomes the right child of `p`, and vice-versa. – nbro Feb 01 '16 at 22:59
  • Also are you trying to do a tree rotation? Because you can't ensure that the BST remains balanced by just swapping nodes? See https://en.wikipedia.org/wiki/Tree_rotation – kyle Feb 01 '16 at 22:59
  • @CiaPan I know very well the properties of a BST. When I created the my delete function, I needed a "swap" (or "switch") function for swapping a node and it's successor. Now, I just decided to make that function work for all cases of swapping nodes. The previous one just worked for that specific case, but I would like to have a function that works in all cases, because I could eventually change the implementation of my delete function, etc. – nbro Feb 01 '16 at 23:02
  • @kyle No, I had already implemented a "left" and "right-rotate" functions. I just wanted a function that swaps the roles of any given two nodes, no matter if that operation will break or not the BST property. – nbro Feb 01 '16 at 23:03
  • I do not understand the intention behind this function.. if it is to swap two nodes in the BST, is it not sufficient to swap their data instead of manipulating pointers? – xrisk Feb 07 '16 at 04:46
  • @Rishav Ok, maybe I should have added a little bit more about the reason behind all this "monster" function. I can insert `BSTNode` objects or any comparable objects in my BST. When the user decides to insert any comparable object, the responsibility of creating the `BSTNode` is mine, therefore the user has no access to a initial `BSTNode` reference, unless they search for the key. – nbro Feb 07 '16 at 04:54
  • @Rishav The user can also insert a `BSTNode` object in the tree which has an initial (and should remain constant) key (or value). Nevertheless, if I just exchanged the values or keys of the nodes, the user would have a reference to a node with a different key then the key of the node he inserted. Of course, I want to avoid this. – nbro Feb 07 '16 at 04:55
  • Consider editing your question. Part of it asks how we might improve your solution. I put up an answer suggesting how you might improve it, and you said you asked something different. – David Jay Brady Feb 13 '16 at 19:08
  • @DavidJayBrady **Eventually** to improve it, but what I really would like is a proof it works or not in all cases, like I said in the comment. – nbro Feb 13 '16 at 19:09

2 Answers2

1

you need proper unit testing. I recommend python-nose - very easy to use.

As for the test vectors I'd recommend using every potential combination of two nodes a and b:

In the case of BST trees you have 3 types of nodes:

  1. leaf node,
  2. 1-child node,
  3. 2-children node.

in combination with the following additional cases:

  1. a is root, or
  2. a is the parent of b,
  3. a is not the parent of b.

and their combinations as well (also in the symmetric situation).

then after swapping you'll need to check all the nodes involved i.e.: a,b, children of a and b, parents of a and b if everything went as planned.

I'd do that using a small tree that contains all the types of nodes. Then go through all possible combinations of the nodes and swap the nodes and check against the expected outcome, and then swap again to bring the tree back to its original state.

[ EDIT ]

If your question was how to avoid all the tedious work. You may consider looking for some well established BST implementation and compare results with your function. Vectors can be created automatically by using a prepared tree and generating all possible pairs of nodes of this tree.

[/EDIT]

As for the unwanted input to the function. You'll need to use your imagination although in my opinion you have most of the cases covered. Except the one that Austin Hastings mentions where at least on of the input nodes does not belong to the tree.

I found an old version of the same function written for one of my private projects, maybe you can find it useful:

def swap( a, b ):
    if a == b: return
    if a is None or b is None: return
    #if a not in self or b not in self: return

    if b.parent == a:
        a, b = b, a

    #swap connections naively
    a.parent, b.parent = b.parent, a.parent
    a.left, b.left = b.left, a.left
    a.right, b.right = b.right, a.right

    if b.parent == b: #b was the p of a
        b.parent = a

    if a.parent is not None:
        if a.parent.left == b: a.parent.left = a
        else: a.parent.right = a
    else:
        self.root = a

    if b.parent is not None:
        if b.parent.left == a: b.parent.left = b
        else: b.parent.right = b
    else:
        self.root = b

    if a.right is not None: a.right.parent = a
    if a.left is not None: a.left.parent = a
    if b.right is not None: b.right.parent = b
    if b.left is not None: b.left.parent = b

and performance optimised version:

def swap_opt( a, b ):
    if a == b: return
    if a is None or b is None: return
    #if a not in self or b not in self: return

    if b.p == a:
        a, b = b, a

    #swap connections naively
    a.p, b.p = b.p, a.p
    a.l, b.l = b.l, a.l
    a.r, b.r = b.r, a.r

    if b.p == b: #b was the p of a
        b.p = a
        if a.l == a:
            a.l = b
            if a.r is not None: a.r.p = a
        else:
            a.r = b
            if a.l is not None: a.l.p = a

        if b.r is not None: b.r.p = b
        if b.l is not None: b.l.p = b
        if a.p is not None:
            if a.p.l == b: a.p.l = a
            else: a.p.r = a
        else:
            #set new root to a
            pass

    else:
        if a.r is not None: a.r.p = a
        if a.l is not None: a.l.p = a
        if b.r is not None: b.r.p = b
        if b.l is not None: b.l.p = b

        if a.p is not None:
            if a.p.l == b: a.p.l = a
            else: a.p.r = a
        else:
            #set new root to a
            pass
        if b.p is not None:
            if b.p.l == a: b.p.l = b
            else: b.p.r = b
        else:
            #set new root to b
            pass

I haven't done proper unit tests for this code - it worked as I expected it to. I was more interested in performance differences between the implementations. swap_opt handles neighbouring nodes a bit faster giving it around 5% of speed increase over the compact implementation of swap. [EDIT2] But that depends on the tree used for testing and hardware [/EDIT2]

Tomek
  • 607
  • 9
  • 16
0

Your BST.py defines class BST. Members of that class have an element, self.root that can point to a node. Your code, as shown, does not account for this.

I believe you need to handle these cases:

  1. Swap the root node with one of its children.
  2. Swap the root node with a non-child.
  3. Swap a non-root node with one of its children.
  4. Swap a non-root node with a non-child non-root node.

Edit: After re-examining switch_1, I think you do handle all the cases.

Also, there is the possibility that a caller could request you swap a node that is not a member of the tree for a node that is a member. Or swap two nodes that are both not members of the current tree. It would cost some code to detect these cases, but you could probably get by with a dict or set to trace tree membership. I don't know if you want to consider "swap-ins" as a valid operation or not.

In several places you compare nodes using ==. That is an operation that can be overridden. You should use is and is not for identity comparisons and comparisons against None.

Finally, please consider Pythonifying your BST class. It is a mutable iterable container, so it should support the standard operations as much as possible.

aghast
  • 14,785
  • 3
  • 24
  • 56
  • My code takes `self.root` into account (maybe not completely correct, but it takes). Have a closer look. – nbro Feb 12 '16 at 04:00
  • Regarding the use of `==`, I have indeed overridden `the __hash__` function in the [`BaseNode`](https://github.com/dossan/ands/blob/master/ands/ds/BaseNode.py) class, but not `__eq__`... If you want you can have a look at the implementations. In general, I have made the comparisons against `None` always using `is` or `is not`, as you are suggesting. Regarding your last suggestion, probably I will do it when I'm pretty sure that all the operations do not contain bugs. – nbro Feb 12 '16 at 04:07
  • Anyway, I didn't want to override `__eq__` (and all the similar methods) because in my case I don't want to consider two nodes with the same key necessarily the same, but only if they are actually pointing to the same object. – nbro Feb 12 '16 at 04:10
  • Ah! I missed it in switch_1. My bad. – aghast Feb 12 '16 at 04:13
  • If you want you can download the whole repository and try the tests by running `./BST.py`. You need first to install the _package_. If you read the `README.md`... If you have a look at the tests at the end of the script for `switch`, I'm also testing switches of nodes where the `self.root` is involved... – nbro Feb 12 '16 at 04:24
  • I want to be honest with you, I appreciate your help and effort, but this answer does not deserve my 50 reputation points I'm giving away, which will be assigned to you if nobody else answers (as you might well know). The point was to have a proof that my code works or an explanation and enumeration of the cases where it doesn't. – nbro Feb 12 '16 at 14:36
  • Regarding your suggestion (that you added thereafter) of keeping track of the current nodes in the tree with a set or dict I'll take it into consideration. But, in general, I think that this should be responsibility of the client after providing a good documentation of what the API offers or not. For example, in another case, I think it's in the `insert` method of the same class, I'm doing some tests at the beginning, for example, if the parent of a node is `None`, but it isn't the `self.root`, then it can be in the tree.. – nbro Feb 12 '16 at 14:44
  • I can do other similar tests, but for example it would be a overkill to search the nodes first. So the idea is to well document the method, enumerate the server's and client's responsibilities, and explicitly stating undefined behaviours. – nbro Feb 12 '16 at 14:47