Introduction to Data Structures and Algorithms
Assignment 4: Trees and Traversal

  • Due 14 June , 7pm

What To Hand In

  • MeasuringQueues.java
  • Report.pdf (with your analysis and the results of your measurements)
  • DecisionTrees.java

Do not rename these files.

Remember to submit all of these files. When you have submitted them, check that you can read the files listed on the submission page, and complete the submission process.

Part 1: Analysing and Measuring Queues (Weight: 1/3)

For this question, you have to analyse some algorithms and also implement them and measure their performance. You must submit a pdf file called Report.pdf that contains your analysis and the results of your measurements. Note that there is no core/completion/challenge for this question.

In the HospitalER assignment, part of the core asked you to use the priority of the patients in the queue of Patients in the waiting room. Three possible ways of doing this were:
  • Use a PriorityQueue of Patients, (where poll() always returns the highest priority Patient in the queue).
  • Use an ArrayList of Patients, with the head of the queue at the front of the list, and always sort the list (by priority - remember low numbers mean high priority) after adding a Patient.
  • Use an ArrayList of Patients, putting the head of the queue at the end of the list, and always sort the list (by priority) after adding a Patient. Note that if the head of the queue is at the end of the list, the list must be sorted in reverse order to make the highest priority (lowest number) Patient be at the head of the queue.

We would expect that using the PriorityQueue is a more efficient option, but how much more efficient? Does it really matter?

Analysis (30%)

(a): Priority Queue:

If you used a PriorityQueue with n items, what would be the cost of adding an item to the queue and then removing an item from the queue? (Express the cost in big-O notation in terms of n).

      poll an Item from PriorityQueue
      offer a new Item to the PriorityQueue

(b): ArrayList, head at front

If you used an ArrayList (with the head of the queue at the front of the list) and called Collections.sort after every time you added a Item to the queue, what would be the cost of adding an item to the queue and then removing an item?

      add new Item to list     //enqueue
      sort list
      remove Item from list(0) //dequeue

(c): ArrayList, head at end
If you used an ArrayList (with the head of the queue at the end of the list) and called Collections.sort after every time you added a Item to the queue, what would be the cost of adding an item to the queue and then removing an item?

      add new Item to at position 0 of list   //enqueue
      sort list in reverse order
      remove Item from end of list //dequeue

Note:
  • Sorting a list that is almost sorted, and only has one item out of order is quite fast - only O(n) steps.

Measuring (70%)

Measure the actual performance of these three different approaches by writing code for each way of doing a priority queue, and then measuring the time it takes to run the algorithm on a queue with n items, for 11 different values of n: n=1,000, n=2,000, n=4,000, ... n=1,024,000.

Note: adding one item and removing one item will be so fast that you probably can't actually measure it directly (even for large n). You will have to repeat the two operations lots of times (eg 100,000 times), measure the total time it took, and then divide by 100,000 to get the cost of doing it once.

The MeasuringQueues.java file has example code for measuring the cost of a PriorityQueue of 10,000 items. You can use this as a starting point for your code.

Note:
  • Your code should use System.currentTimeMillis() to measure the time;
  • See examples from the lecture code in part 10 (eg, in Mode.java ).

Report.

Write a brief report (Report.pdf)
  1. Give your analysis of the costs from part 1 using O(...) notation.
  2. Report on the results of your measurements, giving a table with the times taken for the three approaches for each value of n.
  3. Present your observations about the measurements and your conclusions.

Part 2: Decision Trees (Weight: 2/3)

In order to make sure your code could be recognised correctly by our automatic marking program, please notice the following strict rules when implementing your program:

  • You are NOT allowed to change any code provided in Template, including the headers, the names or parameters of any existing methods. You can add new methods, but do not change the names/parameters of any existing methods.
  • You are NOT allowed to delete any constructors in any classes. You are not allowed to change the headers of any existing constructors.
  • Do NOT rename any file provided in the Template.
  • Make sure the output of your program matches exactly the example given in the grey boxes below.

A Decision Tree is a structure for getting to a decision by following through a tree of yes/no questions.
  • The root node of the tree is the starting point.
  • Each internal node contains a question and two children:
    • the Yes child is the subtree to follow if the answer to the question is yes,
    • the No child is the subtree to follow if the answer to the question is no
  • A leaf node of the tree contains the decision that the tree will give if the answers to the questions led to this point in the tree.

For this question, you are to complete a program for using, growing, displaying, saving, and loading decision trees.

The program uses the DTNode class to represent the nodes in the tree. DTNode.java is already complete, and defines constructors, getters, setters, and a draw method.

The DecisionTree class is the main program. The setupGUI method is written for you and sets up buttons for the methods that you need to complete. The makeSampleTree method is also written for you and creates an initial decision tree in the theTree field.

You should complete the following methods

Core

  • runTree() should "run" the current decision tree by asking the user questions, following down the tree from the root to a leaf, according to the answers. When it gets to a leaf, it should print the answer.
    For your program, with the sample tree, runTree() should generate the interaction:

Make sure the output of your program matches the given text (not include *).

        Is it true: has whiskers (Y/N): *true*
        Is it true: bigger than person (Y/N): *false*
        The answer is: Cat

  • printTree() should print out the whole tree using indentation to show the tree structure. printTree() should be recursive - printing the text in a node, then (if it has children) printing the yes subtree and then the no subtree.
    Hint: write a "helper" method that has two arguments - the node to print and an indentation string (a String of spaces) to print before the text in the node.
    For your program, the sample tree should be printed as:

Make sure the output of your program matches the given text.
        has whiskers?
           bigger than person?
              Tiger
              Cat
           has legs?
              has trunk?
                 Elephant
                 Rhino
              Snake
For full marks, printTree() method should add "y:" and "n:" in front of each subtree:
        has whiskers?
           y:bigger than person?
              y:Tiger
              n:Cat
           n:has legs?
              y:has trunk?
                 y:Elephant
                 n:Rhino
              n:Snake

  • growTree() should play an interactive "game" with the user to grow the tree by adding new answers and new questions. For the sample tree, this would be adding new animals that the tree can classify and new questions to distinguish the new animals.

growTree() should start the same way as runTree by starting at the root of the tree, and working down, asking questions, until it gets to a leaf node.
When it is at a leaf, it prints the decision and asks the user if the decision is correct. If the decision was wrong, it
    • asks the user what the decision should have been,
    • asks for a property or question to distinguish the right decision from the wrong one
    • changes the text in the node to be the question
    • adds two new children (leaf nodes) to the node with the two decisions. For your program, with the sample tree, the interaction should be

Make sure the output of your program matches the given text (not include *).

	Is it true: has whiskers (Y/N): *true*
	Is it true: bigger than person (Y/N): *false*
	I think I know. Is it a Cat? *false*
	OK, what should the answer be? *Dog*
	Oh. I can't distinguish a Cat from a Dog
	Tell me something that's true for a Dog but not for a Cat?
	Property: *barks*
	Thank you! I've updated my decision tree.
Printing the tree would now produce:
        has whiskers?
           y:bigger than person?
              y:Tiger
              n:barks?
                 y:Dog
                 n:Cat
           n:has legs?
              y:has trunk?
                 y:Elephant
                 n:Rhino
              n:Snake

Completion

  • saveTree(String fname) should save the decision tree to a file in a way that it can be loaded back in.

  • loadTree(String fname) should load a decision tree from a file. It will need to remove the current tree, then construct a new tree based on the questions and answers in the file.

Both methods will need to be recursive.

The format of the sample tree in a text file for saving should be:
Question: has whiskers
Question: bigger than person
Answer: Tiger
Answer: Cat
Question: has legs
Question: has trunk
Answer: Elephant
Answer: Rhino
Answer: Snake

There are two text files provided with two different decision trees which you can use to test your load method.

Challenge:

  • drawTree() should display the tree in the graphics window so that each node is drawn as a rectangle with text in it, and there are lines connecting parent nodes and their child nodes. See a simple example below.

  • Yes/No questions are quite limited. Extend the program (including DTNode) to allow multiway questions, for example a node could contain the property: "the color of the stripes", which might lead to five different children, corresponding to the answers "black", "brown", "green", "yellow", and "orange". growTree() should be able to add a new answer to an existing question as well as add a new question.

dt.png