Question

You are required to write a standalone Python program that translates misspelled/rearranged English words into their correct English spelling!

Requirements:
Your program is required to do the following:
1. Read data from a file into a Python list (including the use of exception handling).
2. Make use of a binary search when navigating through a list in alphabetical order, to make your program work more efficiently.

Specifications:
This assignment will focus on the use of loops and decision control, logical operations, functions (built-in and user-defined), strings and string functions, slices, lists and list functions, list comprehensions, dictionaries, file I/O, exception handling, as well as the efficient use of various data structures and algorithms within the Python programming language.

Read the following paragraph as quickly as you can and see if you encounter any difficulties:

Aoccdrnig to rscheearch at an Elingsh uinervtisy, it deosnt mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer is at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae hmuan biegns do not raed ervey lteter by itslef but the wrod as a wlohe.

This was published as an example of a principle of human reading comprehension. If you keep the first letter and the last letter of a word in their correct positions, then scramble the letters in between, the word is still quite readable in the context of an accompanying paragraph. Your program will be required to "translate" (i.e. correct) strings of text containing misspelled words into their correct forms.

Processing:
You MUST code the following Python functions as specified.

def load_dictionary(file_name):
This function accepts the name of a file on disk that contains a list of 370,032 lowercase English dictionary words (one word per line), loads the words from the file into a list, and returns that list. This function must make use of exception handling. The dictionary file can be downloaded here: dictionary.txt

NOTE/WARNING: While care was taken to ensure that words of a derogatory/vulgar/slang nature have been removed from the dictionary file (linked above), it may still contain 1 or more English words that some individuals may take offense to. If such a word is encountered, please note that this was completely unintentional and you are encouraged to forward a request to have the word removed if so desired. The original list was obtained from: https://github.com/dwyl/english-words/tree/master

def clean_up_words(word_list, text):
This function accepts a list of English words (from the load_dictionary function) and a text string containing 1 or more incorrectly spelled words. It corrects each word by searching for the correct version in the dictionary list, then creates a new string containing the corrected text.

Rules for misspelled/scrambled words are as follows:
1. The first and last characters are always correct and will be in their correct position.
2. Words of 3 characters or fewer will always be correctly spelled.
3. Words will NOT contain any embedded punctuation characters (hyphens, apostrophes, etc.), but a punctuation mark (such as "!") may terminate a word, in which case the terminating punctuation character is to be ignored when performing the comparison analysis, but is to be inserted into the corrected word once the analysis is complete. For example, a word of "Eerkua!" would be converted to "Eureka!". At most 1 punctuation character may terminate a word.
4. Words in the text string may be in either upper or lowercase.
5. The corrected word must contain every character in the scrambled word, and the case of the scrambled word must be maintained in the corrected word.
6. If there are multiple possibilities for a correct word, then only the first word that matches the criteria above will be accepted.
7. Not every word in the text string may be incorrectly spelled/scrambled.

In order to make your solution to this function more efficient, this function must make use of a binary search algorithm when searching for words by the first character. A binary search will be demonstrated in class. Once every word in the text string has been corrected, this function must return BOTH the original string (scrambled) AND the corrected string.

MAIN PROGRAM:

# Your solution may ONLY use the python modules listed below
# program: almain.py
# author:  danny abesdris
# date:    february 5, 2024
# version: 1.00
# purpose: python main() program for PRG550 Winter 2024 Assignment #1
import math
import string
import collections
import re
import copy

# YOUR CODE BELOW...
def load_dictionary(file_name):
    # your code here
# end def

def clean_up_words(word_list, text):
    # your code here
# end def

def main():
    passage1 = '''\
Aoccdrnig to rsceareh at an Elingsh uinervtisy, it deosnt \
mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt \
tihng is taht the frist and lsat ltteer is at the rghit pclae. The \
rset can be a toatl mses and you can sitll raed it wouthit a \
porbelm. Tihs is bcuseae hmuan biegns do not raed ervey lteter by itslef \
but the wrod as a wlohe! Nxet \
To! Etu? Brute? A! to! etu? brute? a! Aslo'''

    passage2 = """\
I cnat blveiee taht I can aulaclty uesdnatnrd tihs! The \
phaonmneel pweor of the hmuan mnid is qiute remarkable. I awlyas thought taht \
slpeling was ipmorantt too, but apparnelty tihs is not so. Hevower \
wihle it is rieealtvly esay to raed sroht wrods, it is not so esay wehn ridaneg \
legonr wdros. Aslo, msot wdros in Esinglh are sveen leertts lnog or leognr, and \
the mroe leretts terhe are in a wrod, the mroe dulcifift it bmeecos to cletrorcy \
infietdy them wehn the ltrtees are ragnearerd. \
Mroe cmoomn wrdos lkie blal \
and baer raimen mltsoy ungnchead and esay to rizocenge, whereas, lgneor and less \
cmoomn wdros, like pltuonuim and soulamitunes caghne saillbattunsy scuh taht \
rnciooitegn is srclceay pbsslioe. This atiibly smtes form a garet deal \
of enpicerexe rindaeg cretolcry slelepd wdros and only plopee who can \
adrealy raed pelictroinfy can do this tsak. Tihs tirck does not reeavl \
mcuh aoubt the pscroes of Innreaig to raed, it only ietaindcs that hhligy \
slielkd rrdeaes can omoercve moinr informieepcts wehn dnriiveg mnnaeig!"""

    book = load_dictionary("dictionary.txt")

    mixed, good = clean_up_words(book, passage1)
    print("\n=SCRAMBLED=== =======")
    print("original:")
    wc = 1
    for w in mixed.split(' '):
        if wc % 9 == 0:
            print()
        print(w, end=" ")
        wc += 1
    print("\n=CLEANED==== ======")
    print("cleaned: ")
    wc = 1
    for w in good.split(' '):
        if wc % 9 == 0:
            print()
        print(w, end=" ")
        wc += 1

    mixed, good = clean_up_words(book, passage2)
    print("\n=SCRAMBLED=== =======")
    print("original:")
    wc = 1
    for w in mixed.split(' '):
        if wc % 9 == 0:
            print()
        print(w, end=" ")
        wc += 1
    print("\n=CLEANED==== ======")
    print("cleaned: ")
    wc = 1
    for w in good.split(' '):
        if wc % 9 == 0:
            print()
        print(w, end=" ")
        wc += 1
    print()

if __name__ == "__main__":
    main()

# The expected output is listed below.

=SCRAMBLED=== =======
original:
Aoccdrnig to rsceareh at an Elingsh uinervtisy, it deosnt mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer is at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae hmuan biegns do not raed ervey lteter by itslef but the wrod as a wlohe! Nxet To! Etu? Brute? A! to! etu? brute? a! Aslo

=CLEANED==== ======
cleaned:
According to research at an English university, it doesnt matter in what order the letters in a word are, the only important thing is that the first and last letter is at the right place.
The rest can be a total mess and you can still read it without a problem. This is because human begins do not read every letter by itself but the word as a whole! Next To! Etu? Brute? A! to! etu? brute? a! Also

=SCRAMBLED=== =======
original:
I cnat blveiee taht I can aulaclty uesdnatnrd tihs! The phaonmneel pweor of the hmuan mnid is qiute remarkable. I awlyas thought taht slpeling was ipmorantt too, but apparnelty tihs is not so. Hevower wihle it is rieealtvly esay to raed sroht wrods, it is not so esay wehn ridaneg legonr wdros. Aslo, msot wdros in Esinglh are sveen leertts lnog or leognr, and the mroe leretts terhe are in a wrod, the mroe dulcifift it bmeecos to cletrorcy infietdy them wehn the ltrtees are ragnearerd. Mroe cmoomn wrdos lkie blal and baer raimen mltsoy ungnchead and esay to rizocenge, whereas, lgneor and less cmoomn wdros, like pltuonuim and soulamitunes caghne saillbattunsy scuh taht rnciooitegn is srclceay pbsslioe. This atiibly smtes form a garet deal of enpicerexe rindaeg cretolcry slelepd wdros and only plopee who can adrealy raed pelictroinfy can do this tsak. Tihs tirck does not reeavl mcuh aoubt the pscroes of Innreaig to raed, it only ietaindcs that hhligy slielkd rrdeaes can omoercve moinr informieepcts wehn dnriiveg mnnaeig!

bsearch.txt (19/02/2024, 03:44):
https://mediatb.blob.core.windows.net/media//65d26e1c3e5c9b9de4dd263e/questions/binary_search_1708290033278.txt

fh = open("dictionary.txt")
data = fh.read()
fh.close()
words = data.split('\n')
word = "aardvark"
found = False
# print(len(words))
# high is the last valid index, so words[mid] can never run past the end of the list
high, low, mid = len(words) - 1, 0, len(words) // 2
searches = 1
while not found and low <= high:
    print("mid:", mid, " word:", words[mid], " low: ", low, " high: ", high)
    input("press enter to continue...")
    if word == words[mid]:
        print("found correct starting point at index:", mid, " after:", searches, "searches")
        print("word: ", words[mid])
        found = True
        # search by first letter of word
        #pos = mid
        #while word[0] == words[pos][0]:
        #    pos -= 1
        #print("first word starting letter:", word[0], " has index:", pos + 1, " word is:", words[pos + 1])
    elif word > words[mid]:
        low = mid + 1      # target sorts after the middle word: search the upper half
    elif word < words[mid]:
        high = mid - 1     # target sorts before the middle word: search the lower half
    mid = (low + high) // 2
    searches += 1
# end while
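
One possible way to put the pieces together is sketched below. It is only an illustration of the rules listed in the question, not the instructor's solution. The helper names first_index_of_letter and fix_word are made up for this sketch, returning an empty list when the file cannot be read is an assumption, and so is the choice to preserve case position by position. It uses only the string module, which is on the allowed list.

```python
import string


def load_dictionary(file_name):
    """Return the words in file_name (one per line) as a list.

    Assumption: on an I/O error an empty list is returned after printing a message.
    """
    try:
        with open(file_name, "r") as fh:
            return [line.strip() for line in fh if line.strip()]
    except OSError as err:
        print("could not read", file_name, "-", err)
        return []


def first_index_of_letter(word_list, letter):
    """Binary search: first index whose word starts with `letter` (or a later letter)."""
    low, high = 0, len(word_list)
    while low < high:
        mid = (low + high) // 2
        if word_list[mid][0] < letter:
            low = mid + 1          # everything up to mid starts with an earlier letter
        else:
            high = mid             # mid may be the first match, keep it in range
    return low


def fix_word(word_list, token):
    """Correct one space-delimited token, keeping its case and end punctuation."""
    tail = ""
    if token and token[-1] in string.punctuation:
        token, tail = token[:-1], token[-1]      # rule 3: set the punctuation aside
    if len(token) <= 3:                          # rule 2: short words are correct
        return token + tail
    lower = token.lower()
    letters = sorted(lower)
    pos = first_index_of_letter(word_list, lower[0])
    # scan only the block of words sharing the first letter (rule 1)
    while pos < len(word_list) and word_list[pos][0] == lower[0]:
        cand = word_list[pos]
        if (len(cand) == len(lower) and cand[-1] == lower[-1]
                and sorted(cand) == letters):    # rule 5: same letters
            # rule 6: first match wins; copy the token's case position by position
            fixed = "".join(c.upper() if t.isupper() else c
                            for c, t in zip(cand, token))
            return fixed + tail
        pos += 1
    return token + tail                          # no match: leave the word as it is


def clean_up_words(word_list, text):
    """Return BOTH the original (scrambled) text and the corrected text."""
    cleaned = " ".join(fix_word(word_list, tok) for tok in text.split(" "))
    return text, cleaned
```

With the real dictionary.txt loaded, main() from the question should be able to call these two functions unchanged. Because the first matching word is accepted, "biegns" comes back as "begins" rather than "beings", which agrees with the expected output shown in the question.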