Java Data Structures  Contents
A D V E R T I S E M E N T
Every database system must provide for deletion of
items. A Binary Search Tree is a very good option for implementing a database (fast
searches, sorts, inserts, etc.). One problem that stands in the way though is that it's
not very straight forward to delete an item from a database represented by a binary tree.
More precisely, the problem is in maintaining the tree structure while the deletion
process.
Deleting items from a Binary Search Tree can be
rather tricky. On one hand, you'd like to remove the object from the tree, on the other,
you don't want the process to destroy the tree structure. There are many algorithms for
this, and they all vary in degree of simplicity and efficiency.
The simplest one (which we will not cover here), is
to simply have a boolean variable inside a node. That boolean
variable tells us if that node is valid or not. Whenever we want to delete that node, we
simply set that valid variable to false ; making all the traversal functions
simply skip that node. The actual removal can take place when the tree is rebuilt. This
may seem like a waste of space, but most of the time, it's not (in today's world, several
extra bytes inside the tree structure don't make much of a difference).
The above can actually be used in conjunction with a
more complicated approach. For example, lets take a regular company database. Throughout
the day, there are thousands of transactions, deletions, insertions, etc., all this is
handled by the algorithm described above (a boolean valid variable). At the
end of the day (night), when the system becomes free, the program does a routine cleanup
of the tree from these "deleted" items. Since setting or unsetting a boolean
variable can be fast, the database is really fast during the day, i.e.: when it matters.
Cleaning up a tree from these types of "deleted"
items can be accomplished in many different ways. If there are too many of them, then it
might be worthwhile to simply rebuild the tree from scratch (making sure it doesn't loose
it's advantageous structure, i.e.: doesn't become a linked list). The program could also
fire up a more complicated approach to delete each node individually. (While at
it, the system could also be optimizing the tree structure ;)
Now, what is that "more complicated"
approach that I keep talking of? It is the approach of switching the node being deleted
with the one currently in the tree, only at lower level. Deleting a node that has no
children is pretty simple, we just remove that node (and set the parent's pointer to it to
null ). Deleting a node that has one child is also easy. We delete the node,
and make it's parent point to that only child of the deleted node.
The problem comes when you try to delete a node
which has two valid children. Which one do you pick to be it's successor (take it's
parent's place)? Actually, in most cases, neither!
You have to realize that we're dealing with a binary
search tree. Search trees have very specific properties. For example, if we need to remove
a node, we look for nearest right child that doesn't have a left son (or nearest left
child, that doesn't have a right son). We then replace that newly found node with the one
we are trying to delete (and making sure that all the links go where they should). The
process of deleting that right child with no left son actually involves a simple removal
described above: i.e.: the node is simply replaced by it's only child (in this case the
right child; since there is no left). Confusing? Lets jump into the code to clear it up...
import java.lang.*;
import java.io.*;
public class pBSTRemoveNode{
public pBSTRemoveNode left,right;
public Comparable data;
public static pBSTRemoveNode tree_AddNumber(
pBSTRemoveNode r,Comparable n){
if(r == null){
r = new pBSTRemoveNode();
r.left = r.right = null;
r.data = n;
}else if(r.data.compareTo(n) < 0)
r.right = tree_AddNumber(r.right,n);
else if(r.data.compareTo(n) > 0)
r.left = tree_AddNumber(r.left,n);
return r;
}
public static pBSTRemoveNode tree_removeNumber(
pBSTRemoveNode r,Comparable n){
if(r != null){
if(r.data.compareTo(n) < 0){
r.right = tree_removeNumber(r.right,n);
}else if(r.data.compareTo(n) > 0){
r.left = tree_removeNumber(r.left,n);
}else{
if(r.left == null && r.right == null){
r = null;
}else if(r.left != null && r.right == null){
r = r.left;
}else if(r.right != null && r.left == null){
r = r.right;
}else{
if(r.right.left == null){
r.right.left = r.left;
r = r.right;
}else{
pBSTRemoveNode q,p = r.right;
while(p.left.left != null)
p = p.left;
q = p.left;
p.left = q.right;
q.left = r.left;
q.right = r.right;
r = q;
}
}
}
}
return r;
}
public static void tree_InOrderPrint(
pBSTRemoveNode r){
if(r != null){
tree_InOrderPrint(r.left);
System.out.print(" "+r.data);
tree_InOrderPrint(r.right);
}
}
public static void main(String[] args){
pBSTRemoveNode tree = null;
int[] numbers = {56,86,71,97,82,99,65,36,16,10,28,52,46};
System.out.print("inserting: ");
for(int i = 0;i<numbers.length;i++){
Integer n = new Integer(numbers[i]);
System.out.print(" "+n);
tree = tree_AddNumber(tree,n);
}
System.out.print("\ntree: ");
tree_InOrderPrint(tree);
for(int j = 0;j < numbers.length;j++){
Integer n = new Integer(numbers[j]);
System.out.print("\nremove: "+n+" tree: ");
tree = tree_removeNumber(tree,n);
tree_InOrderPrint(tree);
}
System.out.println("\ndone ;)");
}
}
If you look through the above code, you'll quickly
realize that the most relevant function (to this section), is the tree_removeNumber() .
It accepts a reference to the root of the tree, and a number to remove.
Actually, the way it's implemented, it doesn't necessarily has to be a number. It could
just as well be a java.lang.String object; anything that's implementing a Comparable
interface will work. The title of the function is a bit misleading; telling you
that you can only work with numbers.
The main() is rather straight forward;
it first adds numbers to the tree, and then removes them, displaying what it does at each
step. The tree_AddNumber() function will not be explained here (since it's
already been explained somewhere within this document).
The tree_removeNumber() method is where
most of the interesting action takes place. We first check to see if the root of the tree
is not null , since if it is, we have nothing to remove, and we simply return .
The next two if() and else if() statements do what is know as
binary search. If the item that we're looking for is greater than the value of the current
node, we recursively search the right child of the current node. If it's less than the
value of the current node, then we recursively search the left child of the current node.
The remainder of the function is kind of a big else
statement. Once we're there, it means we have found the item we were looking for. We start
by first checking for simple cases, where the node we're removing has no children, or has
only one child. If it has two valid children, we fall into another else
statement.
Once we know it is not one of the simple cases, we
have to do a bit of thinking. First, we check a bit easier case, where the right child of
the node doesn't have a left child. If that's the case, we need go no further, we simply
replace the removing node with it's right child, and make sure the links are not lost. If
the right child has a left child, we have to loop to find the closest right child with no
left child, and that's what that while() loop is doing. The moment we find
that node we would like to put in place of the one being deleted, we remove the found
node. This removal is rather simple, since the node only has a right child. We then simply
replace the removed node with the new one, making sure we don't loose any links, and we
are done.
Inside main() , I have picked numbers to
work with, and to construct the tree (sample data). The numbers generate a pretty good
binary tree. The removal starts with the root node, so, the example does quite extensive
testing in all the cases described above. The output from the above program follows:
inserting: 56 86 71 97 82 99 65 36 16 10 28 52 46
tree: 10 16 28 36 46 52 56 65 71 82 86 97 99
remove: 56 tree: 10 16 28 36 46 52 65 71 82 86 97 99
remove: 86 tree: 10 16 28 36 46 52 65 71 82 97 99
remove: 71 tree: 10 16 28 36 46 52 65 82 97 99
remove: 97 tree: 10 16 28 36 46 52 65 82 99
remove: 82 tree: 10 16 28 36 46 52 65 99
remove: 99 tree: 10 16 28 36 46 52 65
remove: 65 tree: 10 16 28 36 46 52
remove: 36 tree: 10 16 28 46 52
remove: 16 tree: 10 28 46 52
remove: 10 tree: 28 46 52
remove: 28 tree: 46 52
remove: 52 tree: 46
remove: 46 tree:
done ;)
Trace the output if you like, it's quite
interesting. This type of removing actually improves the tree structure, it makes it
shorter (decreases tree's depth), thus, making searches faster. One thing that I'd like to
mention before we go on, is that this example is for JDK 1.2 or above. It will not compile
(nor run) on anything less due to the fact that it uses the java.util.Comparable
interface , which is not supported in earlier versions of the JDK. Anyway, I guess
that's it for this topic.
Back to Table of Contents
A D V E R T I S E M E N T
