overview need to use daikon software the goal of this assignment is to
Search for question
Question
Overview
Need To use Daikon Software
The goal of this assignment is to learn about dynamic program invariant detection. Specifically, we
use a tool called Daikon to analyze the invariants in a 2-3-4 tree implementation.
The assignment consists of:
Installing and using Daikon to dynamically detect program invariants of a 2-3-4 tree
implementation.
Answering problem questions about Daikon and the detected invariants.
Resources
The Daikon Invariant Detector User Manual:
http://plse.cs.washington.edu/daikon/download/doc/daikon.html Links to an external site. (Sections
3.1 and 5.2-5.5 can be helpful for understanding the output.)
Install Daikon by following the instructions in the Section 2.2 at
https://plse.cs.washington.edu/daikon/download/doc/daikon.html#Installing-Daikon. Links to an
external site. Only perform steps 1 and 2 for the installation.
Download the following:
The source code for 2-3-4 tree, including tests:
Two ThreeFourIntSet.java Download TwoThreeFourIntSet.java
IntNode.java Download IntNode.java
Two ThreeFour Tester.java Download TwoThreeFourTester.java
Input test data for the 2-3-4 set test driver:
input1.txt Download input1.txt
input2.txt Download input2.txt
Background on 2-3-4 trees
The test program implements a multiset data structure for int values, backed by a 2-3-4 tree. If you
are familiar with B-Trees, a 2-3-4 is just a special case of a B-Tree with order 4. More information
is available at http://en.wikipedia.org/wiki/2-3-4_tree. Links to an external site.
A 2-3-4 tree is a self-balancing, n-way search tree. Like a binary tree, it consists of nodes with
children. Unlike a binary tree, each node may have more than just a left child and a right child; each
node of a 2-3-4 tree may have 2, 3, or 4 children. Instead of a single element at each non-root node,
a node may have anywhere from 1 to 3 elements contained within it. Elements are stored in a node
in sorted order.
A 2-node is just like a binary tree node with a single element x, with a left child L and right child R.
Each child node is the root of a subtree. All elements in L and its descendants are less than or equal
to x, and all elements in R and its descendants are greater than or equal to x. A 3-node contains 2
elements x1 and x2, with children left L, middle M, and right R. Elements in L's subtree are all less
than or equal to x1, elements in M's subtree are greater than or equal to x1 and less than or equal to
x2, and elements in R's subtree are all greater than or equal to x2. A 4-node contains 3 elements,
with a left, left-middle, right-middle, and right child, with similar bounds on the ranges of elements
in the child subtrees.
A 2-3-4 tree avoids imbalanced subtrees by moving elements and children between nodes in the tree
whenever an element is added or removed. The algorithms for adding and removing elements can be quite involved because of the large number of special cases. For this assignment, you should not
need to know how these rearranging algorithms work (although knowing how they work may be
helpful); you should be able to answer the questions by thinking of the tree simply as a balanced n-
way tree.
Setup
daikon.jar contains all of Daikon's executables. It is known to work under Oracle (Sun) JRE 1.7 and
above. If you are using the outdated Java 1.6 (a.k.a. Java 6) or earlier, please upgrade). Daikon
supports multiple programming languages, but we will only be using the Java front end to Daikon,
called Chicory.
Run Daikon using these three steps:
Compile the java files with the following command line (with -g to include debugging symbols):
javac -g -source 8 -target 8 IntNode.java TwoThreeFourIntSet.java TwoThreeFourTester.java
Have Chicory run the program (TwoThreeFourTester), and Chicory will record the values of
variables it observes into .dtrace.gz file. Chicory's arguments are the Java class name and the
command line arguments to be passed to that Java program. (Notes: The -- is important in the line
below to tell Chicory to pass the --file flag to TwoThreeFourTester instead of trying to parse it as a
Daikon flag. input#.txt will be a file specified by one of the problem statements below.)Execute the
command below in the directory where the java and input files are present.
java -cp "$DAIKONDIR/daikon.jar:$PWD" daikon.Chicory TwoThreeFour Tester
file=input#.txt
Have Daikon read all the observed values and analyze the data, looking for statistically significant
invariants. Daikon stores all these likely invariants into a binary format in a compressed .inv.gz file.
This command also stores them in a human-readable txt file that you will analyze and submit. For
TwoThreeFourTester, this step may take over a minute. (Note: output#.txt will be a file specified by
one of the problem statements below.)
java -cp "$DAIKONDIR/daikon.jar:$PWD" daikon.Daikon TwoThreeFourTester.dtrace.gz >
output#.txt
Problem questions
The test driver takes a script file as input containing instructions for adding (ADD), removing
(REMOVE), and querying the tree. When querying the tree, a script may check if a single element
is present (CONTAINS), find out how many elements are present (SIZE), or check how deep the
tree is based on distances from the root to childless leaves (HEIGHT). The contents of the tree may
also be printed (PRINT), with parentheses depicting the levels of the tree.
Run the test driver with the input1.txt as input. Save the output of PrintInvariants as output1.txt, and
examine the likely global invariants found for IntNode:::OBJECT, which hold at every entrance and
exit for every public method of the IntNode class. For each of the first 5 invariants, answer the
following questions:
In plain English, what does the likely invariant mean?
Is it really an invariant of the program, regardless of input? If not, explain why Daikon reports the
false invariant. (Note: You do not need to consider inputs that try to cause crashes, e.g., ignore "out
of memory" errors and integer overflow.)
Run the test driver with the file input2.txt as input, and save the output of PrintInvariants as
output2.txt. The invariants found for the different inputs will be slightly different. Each of the likely invariants below appears in output1.txt but was either changed or removed in output2.txt. For each
(there are a total of four), explain why Daikon found a different invariant or did not find the
invariant in output2.txt, and whether the invariant is true.
Two ThreeFour IntSet.containsRecursive(IntNode,
- node != null
int):::ENTER
Two ThreeFour IntSet.height():::EXIT
- return one of {1,2,3}
- this.root.elements[] elements < this.size
Two Three FourIntSet.remove(int):::EXIT
- return == true
For input2.txt, Daikon found a likely invariant for Two ThreeFourIntSet.height() that relates the
height of the tree to the size in number of elements in the tree. The likely invariant is actually
incorrect but even if it were, since a 2-3-4 tree is self balancing, a correct tighter bound exists
relating size and height. Daikon does not support looking for such an invariant. Describe a generic
invariant that can be added to Daikon that would allow it to report this height invariant. (Hint:
Think about lower and upper bounds that can be found for balanced binary trees such as a red-black
tree.)
Deliverables
Write the answers to the problem questions in either writeup.txt or writeup.pdf. Include your name
in this file.
Compress the following three files into a single .zip archive (call it daikon_lastname.zip) and
upload it.
writeup.txt or writeup.pdf
output1.txt
output2.txt
Note that one of the things we'll be looking for is that you pay attention to details. We will grade
aspects of the homework such as being clear in your writeup, and naming the submitted files
correctly. You may use any resource you wish in this assignment but you must list your
collaborators and cite all your sources. Failure to do so will result in a grade of 0.