Search for question
Question

Overview Need To use Daikon Software The goal of this assignment is to learn about dynamic program invariant detection. Specifically, we use a tool called Daikon to analyze the invariants in a 2-3-4 tree implementation. The assignment consists of: Installing and using Daikon to dynamically detect program invariants of a 2-3-4 tree implementation. Answering problem questions about Daikon and the detected invariants. Resources The Daikon Invariant Detector User Manual: http://plse.cs.washington.edu/daikon/download/doc/daikon.html Links to an external site. (Sections 3.1 and 5.2-5.5 can be helpful for understanding the output.) Install Daikon by following the instructions in the Section 2.2 at https://plse.cs.washington.edu/daikon/download/doc/daikon.html#Installing-Daikon. Links to an external site. Only perform steps 1 and 2 for the installation. Download the following: The source code for 2-3-4 tree, including tests: Two ThreeFourIntSet.java Download TwoThreeFourIntSet.java IntNode.java Download IntNode.java Two ThreeFour Tester.java Download TwoThreeFourTester.java Input test data for the 2-3-4 set test driver: input1.txt Download input1.txt input2.txt Download input2.txt Background on 2-3-4 trees The test program implements a multiset data structure for int values, backed by a 2-3-4 tree. If you are familiar with B-Trees, a 2-3-4 is just a special case of a B-Tree with order 4. More information is available at http://en.wikipedia.org/wiki/2-3-4_tree. Links to an external site. A 2-3-4 tree is a self-balancing, n-way search tree. Like a binary tree, it consists of nodes with children. Unlike a binary tree, each node may have more than just a left child and a right child; each node of a 2-3-4 tree may have 2, 3, or 4 children. Instead of a single element at each non-root node, a node may have anywhere from 1 to 3 elements contained within it. Elements are stored in a node in sorted order. A 2-node is just like a binary tree node with a single element x, with a left child L and right child R. Each child node is the root of a subtree. All elements in L and its descendants are less than or equal to x, and all elements in R and its descendants are greater than or equal to x. A 3-node contains 2 elements x1 and x2, with children left L, middle M, and right R. Elements in L's subtree are all less than or equal to x1, elements in M's subtree are greater than or equal to x1 and less than or equal to x2, and elements in R's subtree are all greater than or equal to x2. A 4-node contains 3 elements, with a left, left-middle, right-middle, and right child, with similar bounds on the ranges of elements in the child subtrees. A 2-3-4 tree avoids imbalanced subtrees by moving elements and children between nodes in the tree whenever an element is added or removed. The algorithms for adding and removing elements can be quite involved because of the large number of special cases. For this assignment, you should not need to know how these rearranging algorithms work (although knowing how they work may be helpful); you should be able to answer the questions by thinking of the tree simply as a balanced n- way tree. Setup daikon.jar contains all of Daikon's executables. It is known to work under Oracle (Sun) JRE 1.7 and above. If you are using the outdated Java 1.6 (a.k.a. Java 6) or earlier, please upgrade). Daikon supports multiple programming languages, but we will only be using the Java front end to Daikon, called Chicory. Run Daikon using these three steps: Compile the java files with the following command line (with -g to include debugging symbols): javac -g -source 8 -target 8 IntNode.java TwoThreeFourIntSet.java TwoThreeFourTester.java Have Chicory run the program (TwoThreeFourTester), and Chicory will record the values of variables it observes into .dtrace.gz file. Chicory's arguments are the Java class name and the command line arguments to be passed to that Java program. (Notes: The -- is important in the line below to tell Chicory to pass the --file flag to TwoThreeFourTester instead of trying to parse it as a Daikon flag. input#.txt will be a file specified by one of the problem statements below.)Execute the command below in the directory where the java and input files are present. java -cp "$DAIKONDIR/daikon.jar:$PWD" daikon.Chicory TwoThreeFour Tester file=input#.txt Have Daikon read all the observed values and analyze the data, looking for statistically significant invariants. Daikon stores all these likely invariants into a binary format in a compressed .inv.gz file. This command also stores them in a human-readable txt file that you will analyze and submit. For TwoThreeFourTester, this step may take over a minute. (Note: output#.txt will be a file specified by one of the problem statements below.) java -cp "$DAIKONDIR/daikon.jar:$PWD" daikon.Daikon TwoThreeFourTester.dtrace.gz > output#.txt Problem questions The test driver takes a script file as input containing instructions for adding (ADD), removing (REMOVE), and querying the tree. When querying the tree, a script may check if a single element is present (CONTAINS), find out how many elements are present (SIZE), or check how deep the tree is based on distances from the root to childless leaves (HEIGHT). The contents of the tree may also be printed (PRINT), with parentheses depicting the levels of the tree. Run the test driver with the input1.txt as input. Save the output of PrintInvariants as output1.txt, and examine the likely global invariants found for IntNode:::OBJECT, which hold at every entrance and exit for every public method of the IntNode class. For each of the first 5 invariants, answer the following questions: In plain English, what does the likely invariant mean? Is it really an invariant of the program, regardless of input? If not, explain why Daikon reports the false invariant. (Note: You do not need to consider inputs that try to cause crashes, e.g., ignore "out of memory" errors and integer overflow.) Run the test driver with the file input2.txt as input, and save the output of PrintInvariants as output2.txt. The invariants found for the different inputs will be slightly different. Each of the likely invariants below appears in output1.txt but was either changed or removed in output2.txt. For each (there are a total of four), explain why Daikon found a different invariant or did not find the invariant in output2.txt, and whether the invariant is true. Two ThreeFour IntSet.containsRecursive(IntNode, - node != null int):::ENTER Two ThreeFour IntSet.height():::EXIT - return one of {1,2,3} - this.root.elements[] elements < this.size Two Three FourIntSet.remove(int):::EXIT - return == true For input2.txt, Daikon found a likely invariant for Two ThreeFourIntSet.height() that relates the height of the tree to the size in number of elements in the tree. The likely invariant is actually incorrect but even if it were, since a 2-3-4 tree is self balancing, a correct tighter bound exists relating size and height. Daikon does not support looking for such an invariant. Describe a generic invariant that can be added to Daikon that would allow it to report this height invariant. (Hint: Think about lower and upper bounds that can be found for balanced binary trees such as a red-black tree.) Deliverables Write the answers to the problem questions in either writeup.txt or writeup.pdf. Include your name in this file. Compress the following three files into a single .zip archive (call it daikon_lastname.zip) and upload it. writeup.txt or writeup.pdf output1.txt output2.txt Note that one of the things we'll be looking for is that you pay attention to details. We will grade aspects of the homework such as being clear in your writeup, and naming the submitted files correctly. You may use any resource you wish in this assignment but you must list your collaborators and cite all your sources. Failure to do so will result in a grade of 0.