Search for question
Question

CS2201 Spring 2024 Assignment No. 9 Honor Statement: By submitting this homework under your personal Brightspace account, you attest that you have neither given nor received unauthorized aid concerning this homework. That includes not searching the web for solutions nor using LLM AI tools. You further acknowledge the instructors are the copyright owners of this HW. Any posting/uploading of HW questions for distribution (e.g., GitHub, Chegg) will be considered an honor code violation (even after finishing this class) and submitted to the honor council. Reminder: this assignment is less than 4% of your grade... not worth the risk of getting caught cheating. Purpose: This project is meant to give you experience in creating and using a tree-like ADT to store words. In particular, you will implement a Trie ADT, as described in class. A trie is a kind of tree that stores sequences of data in such a way that makes searching efficient. The word trie comes from reTRIEval, but is pronounced "TRY" to distinguish it from other kinds of trees. Assignment: Implement a Trie ADT that will hold a collection of words. Requirements: 1. The files to be submitted include: Trie.java, and if appropriate TrieNode.java. Also submit a program TrieTest.java that contains JUnit test code that thoroughly exercises all functionality of your Trie. Also submit a README document (a .txt text file or a .doc Word document) that answers the questions listed later in this document. The functional specification for the Trie ADT is given below. 2. Your implementation must use a linked-tree structure modeled on the diagram at the end of this document. There are two options that you can choose between, and those are described later. 3. All class methods should be properly documented with Javadoc header comments. Each code file should start with the standard header comments, including an honor statement. Functional Specifications: Trie: This class will provide support for holding a collection of words (in the form of a trie). must support the following public operations: Trie() void insert (String word) void loadFromFile (String filename) boolean isWord (String word) boolean isPrefix (String str) String toString ( ) int wordCount ( ) Trie clone () boolean equals (Object other) The default constructor Insert the parameter word into the Trie Insert all words in the specified file into the Trie. Throws a FileNotFoundException exception if the specified file cannot be opened (note: that exception will be thrown when you attempt to create a Scanner object on a File object for a nonexistent file). Returns true if word is in the Trie, else returns false Returns true if str is a prefix of a word in the Trie, else returns false Return a string of all words in the Trie, in dictionary/alphabetical order. Each word in the string is followed immediately by a newline character '\n'. Returns a count all the words in the Trie Return a clone/copy of the Trie. Determine equality of two Tries. Two Tries are equals if they contain the exact same collection of words. You are required to determine equality by comparing the two Tries and not by converting them to strings and performing string comparison. As discussed in class this assignment can be greatly simplified by creating a TrieNode class, just as we created a BinNode class for our Binary Search Trees. Here is the specification of the TrieNode class. TrieNode: This class will provide support for one node in the trie tree, which equates to a single letter. It should support the following public operations: TrieNode (char letter, boolean flag) void insert (String str) boolean isWord (String str) boolean isPrefix (String str) String toString (String pre) int wordCount ( ) TrieNode clone ( ) boolean equals (Object other) The alternate constructor. Takes a character for the node, and a boolean flag indicating whether it represents the end of a word. Note, there is no default constructor. Insert str starting with the given TrieNode. Create new TrieNodes as needed and be sure to set the boolean flag on the last node to true. Returns true if str is in the sub-Trie starting at the given TrieNode, else returns false Returns true if str is a prefix of a word in the sub-Trie starting at the given TrieNode, else returns false Returns a string containing all words starting at the given TrieNode. Each word is followed immediately by a newline character. The parameter string pre stores the letters of all previous TrieNode objects up to the node currently being processed. That is the letters of the TrieNodes on the path from the root to the current node. Returns a count all the nodes in the sub-Trie that end a word. Return a clone/copy of the TrieNode (and the entire sub-Trie) Determine equality of two TrieNodes. Two TrieNodes are equals if they have the same letter and same end-of-word flag, and all their children TrieNodes are equal. You are required to determine equality by comparing the two TrieNodes and not by converting them to strings and performing string comparison. TrieTest.java: This program should contain JUnit test code that fully exercises all functionality of your Trie ADT. Note: 9% of your grade will be based on the quality of your testing. README: Answer the following questions in your README document (a .txt text file or a .docx Word document). 1. List your name and email address. 2. After reviewing this spec and the provided files, please estimate/report how long you think it will take you to complete it. 3. How many hours did you actually spend total on this assignment? 4. Did you access any other reference material other than those provided by the instructor and the class text? Please list them. 5. What did you enjoy about this assignment? What did you dislike? What could we have done better? 6. Was this project description lacking in any manner? Please be specific. Here are some programming notes: 1. The trie will be used to implement a spelling dictionary, or lexicon. Each node in the trie stores one letter, a flag indicating if that letter ends a word, as well as collection of references to subsequent nodes that represent subsequent letters in a word. This collection can be created in one of two ways (you are free to choose between these two options and use the one that makes the most sense to you): a) We could use an array of 26 references to other nodes, one for each possible next letter. The diagram on the last page of the specification uses this strategy. You can easily compute the array index of a letter by using subtraction on their ASCII values. For example, to find the index of the letter stored at position i in the string variable 'word' you can use the following code: int index = word.charAt(i)-'a'; Note that the statement use a character constant (inside single quotes) and not the ASCII value of the character 'a'- using the ASCII value would be considered poor style. b) Or we could use a Map from the Java collections framework that maps a character to a reference to another node. The map would contain one entry for each next letter that actually appears in some word that was inserted. Remember that the string returned by the toString method must be in sorted order. 2. The Trie and TrieNode classes are not generic classes. They store only words made up of lower-case alphabetic characters. As such, the class declarations should not declare any generic types inside angle brackets (as we did for the BST/BinNode classes). 3. As with the BST/BinNode classes covered in lecture, the methods of the Trie class (with the exception of loadFromFile, clone, and equals) are simple calls to the appropriate TrieNode methods on the root node. Many of the TrieNode methods can be implemented as simple recursive functions. 4. You are not being provided with any starter files for this assignment. By this time in the semester, you should be able to create an IntelliJ project (or whatever IDE you use) and associated source files from scratch. Tell IntelliJ that you want to create a new project, and then specify a name & location. When you go to add new files to the project, right- click on the src folder in the Project pane (on the left), and select New, and then select that you want to add a Java class. Provide a class name and the file will be created for you. 5. Each file you turn in should have the standard block comments at the top of the file, including an honor statement. Be sure to use good programming style. All methods should be documented with appropriate Javadoc comments. 6. You are not allowed to change the public interface of the Trie class. Since a user of the Trie class will not know about the TrieNode class nor can they access your private TrieNode data, you can change the TrieNode class public interface if desired (though it is not recommended). 7. An important difference between a Trie and a BST is that an empty Trie will still contain a root node, whereas an empty BST simply contains a root reference that is nu11. Hint: think of this root node as containing references to the roots for all possible 26 first letters. This root node of the trie is special, in that it stores a space as its character value – all other nodes will contain a lowercase letter. 8. You can assume all words will contain only lower-case letters. To make your program more robust, you could change letters to lower case as you encounter them. The Character.toLowerCase() function will do that for you. 9. The diagram below shows a trie into which the following words have been added: "for fork top tops topsy toss". Note that although "to" is a valid word, it has not yet been added to the lexicon, and so the boolean flag of the 'o' node is currently set to false. If we were to add "to" to the lexicon, this field would be set to true. Note that this diagram is using the array version of TrieNode class (see note 1-a above). 10. In the diagram below, as in your program, nodes should be created only as they are needed. Initially, the lexicon contains only the root. As new words are added, new nodes are created. In particular, the array of 26 references for the children should be initialized to null for each node if you are using the array version. If using the map version, the map would be empty for a newly initialized node. 11. As you design your solutions for insert, isWord, and isPrefix, it is helpful to simulate them operating on the diagram that is included below. All operations should start at the root node. This usually helps clarify what needs to happen as you move down the tree and delete letters from your strings. Choose your recursive base case wisely, as a good choice results in more simple code than a poor choice. 12. When clone() is called on a TrieNode, its job is to make a deep copy of the structure, which means making new subtrees. Recursion makes this trivial. ' that is stored in the 13. When creating a string of all the words in the Trie, do not include the blank/space character' root node. This requires a simple conditional in the TrieNode toString() method. 14. For loading all the words from a file, create a Scanner object on a File object in the same manner that files were opened in Projects 5-8. Let the Scanner object throw an exception if the file does not exist. You may assume the input file only contains words made up of lowercase alphabetic characters. 15. A word is considered a prefix of itself. 16. Carefully consider the base case of your recursive functions that walk the TrieNodes – using the empty string can make your logic cleaner/easier. 17. The empty string is not a word, but the empty string is a prefix of all words and of itself. One should never insert the empty string into the Trie. 18. A prefix is any substring of characters that appear at the start of a word. The valid prefixes of "word" are "wo", "wor", and "word". 6699 "W", Here is a diagram of a Trie for which the following words have been inserted: for fork top tops topsy toss ... false 30 This node at the top is the root node of the Trie. It basically contains 26 pointers to the roots of the Tries for each letter. This is the array version. Об 'f' false Оо Оо O. 20 false FO r true Co 'k' 30 30 true 25 30 't' false Оо Оо 'p' true ... 18 Со Оо 15 20 true false 18 24 25 20- 20 30 30 : 'y' true Оо 'S' false 20 оо 18 ... ON 'S' true 30