Binary Search Trees and Scapegoat Trees Solution




In this project, you’ll start with a codebase for binary search tree (BST) similar to that presented in lectures. You’ll need to implement several additional methods for binary search trees. Then, you’ll create a subclass of BST called “scapegoat tree” — a simple form of a self-balancing binary search tree.

Learning Goals

Show understanding of binary trees, and implement several non-trivial methods . Learn and implement scapegoat tree – a form of self-balancing trees.

Continue to develop unit-testing skills.

General Information [Common]

Reminder: Copying partial or whole solutions, obtained from other students or elsewhere, is academic dishonesty. Do not share your code with your classmates, and do not use your classmates’ code. If you are confused about what constitutes academic dishonesty you should re-read the course policies. We assume you have read the course policies in detail and by submitting this project you have provided your virtual signature in agreement with these policies.

For some projects, it may be useful for you to write additional java files. Any java file you write MUST be placed in the provided src directory of your Eclipse project.

When submitting your project, be sure to remove all compilation errors. Any compilation errors will cause the autograder to fail to run, and consequently you will receive a zero for your submission. No Exceptions!

In the test directory, we provide several JUnit tests that we refer to as the public tests. We recommend you run the tests often and use them as starting points to test your project. In addition, some of the java files come with their own main functions. You can run them as Java applications to interact with the project.

Be aware that the autograder will use a more comprehensive set of tests (referred to as private tests) to grade your project. These tests are hidden from you. We recommend that you think about possible test cases and add new @Test cases to your public test files as part of your programming discipline. In particular, you should consider:

Do your methods handle edge cases such as integer arguments that may be positive, negative, or zero. Many methods only accept arguments that are in a particular range.

Does your code handle cases such as when an array or list is empty, or has reached full capacity?

More complex tests will be project-specific. To build good test cases, think about ways to exercise methods. Work out the correct result for a call of a method with a given set of parameters by hand, then add it as a test case.

Import Project into Eclipse [Common]

Begin by downloading the starter project and importing it into your workspace. It is very important that you do not rename this project as its name is used during the autograding process. If the project is renamed, your assignment will not be graded, and you will receive a zero.

The imported project may have some compilation errors, but these should not prevent you from getting started. Specif-ically, we may provide JUnit tests for classes that do not yet exist in your code. You can still run the other JUnit tests. After completing your code, the compilation errors should be gone.

The project should normally contain the following root items:

src This is the source folder where all code you are submitting must go. You can change anything you want in this folder (unless otherwise specified), you can add new files, etc.

support This folder contains support code that we encourage you to use (and must be used to pass certain tests). You must not change or add anything in this folder. To help ensure that, we suggest that you set the support folder to be read-only. You can do this by right-clicking on it in the package explorer, choosing Properties from the menu, choosing Resource from the list on the left of the pop-up Properties window, unchecking the Permissions check-box for Owner-Write, and clicking the OK button. A dialog box will show with the title “Confirm recursive changes”, and you should click on the “Yes” button.

test The test folder where all of the public unit tests are available. You can feel free to add your own tests here.

JUnit 4 A library that is used to run the test programs.

JRE System Library This is what allows Java to run; it is the location of the Java System Libraries.

If you are missing any of the above or if errors are present in the project (other than as specifically described below), seek help immediately so you can get started on the project right away.

Problem 1

Note that for both parts of this project, you are allowed to use classes that implement the Collection class when needed. For example, you may use java.util.ArrayList, java.util.Queue etc. in your implementations, which eliminates the need to implement these yourself.

What to Do

Take a look at BinarySearchTree and BSTInterface.As taught in lectures, BSTs must obey the ordering rules, and the rules must be preserved when inserting or removing nodes. Before you start, please carefully read for specifications of each method, including expected return values and exceptions that need to be thrown. Your first task is to implement the methods that are not yet implemented (that is, whose method bodies are marked with TODOs). In doing so, you may find that you need or want to change other methods. This is generally OK, with the following restrictions: Your changes must NOT break the semantics of the methods. And you may NOT change the signatures of the public methods, or add or remove public methods to the interface.

Here are some hints for the methods we’ve left for you to complete. Again, go to to read the full details about each method:

contains: You should implement this method in terms of the get method.

get: returns the element being searched for; or null if it doesn’t exist in the BST.

getMinimum and getMaximum: returns the minimum and maximum elements respectively.

height: The height of the tree is the height of the node that is furthest from the root. A recursive solution is probably the easiest.

pre- and postorderIterator: These methods are probably easiest to implement in terms of a Queue

see inorderIterator for an example. Recursively traverse the tree in the correct order, and insert each node into the queue as it is visited. Then, just return the result of calling iterator() on the queue.

equals: returns true if two trees are identical (i.e. contains the equal elements and the tree structure is the same). This method is probably easiest to express with a recursive helper method.

sameValues: Have you already implemented a method that imposes a deterministic order on the values in the tree? (Spoiler: yes.) If so, that will make writing this method straightforward.

Problem 1 continued on next page. . . Page 4 of 7

isBalanced: For this project, a tree is considered balanced when 2h n < 2h+1, where h is the tree’s height, and n its size.

balance: Review the lecture slides for this method.

And here are some notes on some of the utility methods we’ve provided to you, that you should NOT change:

getRoot: This method returns the root node of the tree. Normally, you wouldn’t expose such a detail in your implementation, but we require it in order to run autograder tests. You may want to use it in your own testing, or with the following method.

toDotFormat: This method will output a representation of the tree rooted at the given node in the DOT language, as described by its left and right child references. There are many programs that can read this language and display the results. For example, html allows to do so in your browser. This is very helpful for visualizing the tree and debugging.

Finally, if you need to allocate an array of generic (T) objects in your implementation of BinarySearchTree: in the past we’ve been doing so with T[] array = (T[]) new Object[size]; For this project, you will find that just doing the above you’ll still get a ClassCastException. Why? Because T is no longer an un-constrained generic type. Because in the interface its full signature is T extends Comparable<T>, to sat-isfy this requirement, you’ll need an array capable of holding objects compatible with that type, so you must use

T[] array = (T[]) new Comparable[size]; instead.


You’ll note that for this problem (and the next one), we have provided only a small set of tests. These are an absolutely minimal set of tests, and definitely do not cover all possible cases!

You will need to come up with some tests of your own. Try to think of all of the cases that could occur in each method you write, and write tests to check each of them. Look back at previous assignments’ tests for a refresher if you’re not sure how to do this. You will not be graded on your tests, but writing them and passing them is the only way that you can be reasonably sure that your code works.

Problem 2

Self-Balancing Binary Trees

Just obeying the BST rule is not sufficient to achieve O(log n) performance when searching in a binary search tree. The tree must be balanced, or close to be balanced, so that the height of the tree is on the order of O(log n). This should be done automatically with mathematically proven gaurantees. Therefore manually calling balance() yourself from time to time does not suffice.

The solution to this problem is to build a self-balancing tree, that is, a tree that makes certain guarantees across calls to add or remove. This enforces that the tree remains approximately balanced, and does so at an amortized low cost. (You can think of this as an analog to the technique of resizing the array that backs an array-based stack or queue: by doing so infrequently and in a specific way, we still achieved asymptotically good performance.)

In upper level CS courses you’ll learn about one or more of the self-balancing BSTs that are used in practice, such as red-black trees or AVL trees. Here, we’ll focus on a simpler, but still reasonably effective, form of self-balancing tree: the scapegoat tree.

Scapegoat Trees

The scapegoat tree is based on the common wisdom that, when something goes wrong, the first thing people tend to do is find someone to blame (the scapegoat). Once blame is firmly established, we can leave the scapegoat to fix the problem. A ScapegoatTree is named after this. Specifically, A ScapegoatTree is a BST that, in addition to keeping track of the number of nodes and the height of the tree, it also keeps track of an upperBound (q in our slides) that maintains an upper-bound on the number of nodes.

Suppose n is the number of nodes, h is the height of the tree, and q is the upperBound. Then, the scapegoat tree is required to satisfy the following rules at all times:

q=2 n q ; h log3=2(q)

Combining the two inequalities, we can see that1

h log3=2(q) log3=2(2n) < log3=2(n) + 2

so the height is bounded by O(log n), which is the guarantee we want.

What to Do

You will implement ScapegoatTree as a subclass to BinarySearchTree. You will need to implement the add and remove methods in this subclass (and the other methods are the same as BST so do not need to be re-implemented). Before you start, please review the lecture slides which contain important details about these two methods. Below is a quick summary.

To implement the add method, first increment upperBound and then use the usual BST insertion algorithm (already provided for you) to add the new element to the BST. At this point, you may get lucky and the height of the tree might not exceed log3=2 upperBound. If so, nothing else needs to be done.

Unfortunately, it will sometimes happen that, by adding this node, you’ve increased the tree’s height to greater than log3=2 upperBound. In this case, the height condition is violated, and you need to reduce the height to fix it. To do so, let’s call the newly added node as u. We will start from u and follow its parents back towards the root to find a scapegoat, called w. The scapegoat w is a very unbalanced node. In fact, it is the first node (on the path from u towards root) that satisfies sizeOfSubtree(w.child) / sizeOfSubtree(w) > 23 . Here w.child is the child of w on the path from u towards root.

We’ll omit the proof the the scapegoat exists – you can take it for granted.2 Once you’ve found the scapegoat w, rebuild the subtree rooted at w into a balanced BST. Be careful! You must balance the subtree, then graft the subtree’s root into the place w formerly occupied in the original tree. In particular, make sure you set the appropriate child reference in w’s original parent to point to the now-balanced subtree’s root, and adjust the subtree’s root node’s parent pointer.

It can be mathematically proven that the above process will reduce the tree’s height and recover the height condition. The proof is omitted here. If you are curious about these proofs, take a look at wiki/Scapegoat_tree

Implementing remove(element) in a ScapegoatTree is actually simpler than that of add. To do so, first remove the element using the usual BST removal alrogithm (already provided for you). The upperBound remains the same (that is, do NOT decrement upperBound). The removal will certainly not increase the height, but it can reduce the size of the tree such that it may violate the upperBound condition. So check if upperBound > 2 size. If so, the upperBound rule is violated. To fix it, simply rebuild the entire tree into a balanced BST (by calling the balance method), Finally, reset upperBound to be equal to the BST’s size.

1Recall that logx y = logz y for any real x; y; z.

logz x

  • That is, if your code doesn’t find it, it’s due to an error in the code, not the algorithm. The scapegoat’s existence is guaranteed.

Some Hints

There are a few different ways to implement the scapegoat tree in order to successfully find the scapegoat node during insertion. One way is to modify BSTNode to maintain a reference to the node’s parent. This way, from any node, you can follow the parent pointer to traverse back the ancestor chain until you find the scapegoat. You must make sure to set the parent pointer correctly (Hint: this can be done in BSTNode’s setLeft and setRight methods, by inserting some additional code to maintain the parent pointer).

There are at least two ways to do this problem without using parent references at all. For example, you could write a method to find and store the list of nodes from the root to the newly added node u. This is basically the list of ancestors of u. Then use that list to search for the scapegoat. Or you could even temporarily flip child pointers to point at the parent, as described in the Wikipedia article.

Any of these approaches are acceptable – we won’t be testing that your parent pointers are correct, only that your tree obeys both the BST rule and the scapegoat rule (and does not just call balance after every add or remove, but rather behaves as described above).

You may want to change some of the helper methods in BinarySearchTree from private to protected, so that ScapegoatTree can access them. That’s fine. You may also want to change some of their signatures, or add new protected or private methods to either class, or override more methods in ScapegoatTree. Any of that is fine too, but remember that you should NOT change the signatures of the public methods, as doing so may break the BinarySearchTree!

Export and Submit [Common]

When you have completed this project, you should export an archive file containing the entire Java project. To do this, click on the bst-scapegoat-student project in the package explorer. Then choose “File ! Export” from the menu. In the window that appears, under “General” choose “Archive File”. Then choose “Next” and enter a destination for the output file. Be sure that the project is named bst-scapegoat-student (otherwise you may be exporting the wrong project!). Save the exported file with the zip extension.

Once you have the zip file ready, you can submit it to the online autograder (Gradescope). Please see the course website and/or documentation explaining how to use the online autograder. If you have any questions please be prompt in asking the course staff before it is too late to submit your solution. If you encounter any errors that look like the autograder failed and not your project, please let the instructors know.

Page 7 of 7