Question

Part III: Maximum parsimony methods

1. What is the key assumption of maximum parsimony methods?
2. How does this differ from distance matrix methods?
3. What are the advantages of maximum parsimony methods?
4. What are the disadvantages of maximum parsimony methods?
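As an illustration of what "most parsimonious" means in practice, here is a minimal Python sketch of Fitch's small-parsimony count, which scores a fixed tree by the minimum number of character changes it requires. The function names, the nested-tuple tree encoding, and the toy alignment are assumptions made for this example; they are not taken from the course material.

```python
def fitch_changes(tree, leaf_states):
    """Minimum number of state changes one alignment column needs on a tree.

    tree: nested 2-tuples of leaf names, e.g. (("A", "B"), ("C", "D"))
    leaf_states: dict mapping each leaf name to its observed character
    """
    changes = 0

    def post_order(node):
        nonlocal changes
        if isinstance(node, str):            # leaf: its state set is the observed character
            return {leaf_states[node]}
        left_set = post_order(node[0])
        right_set = post_order(node[1])
        common = left_set & right_set
        if common:                           # children agree: no change needed here
            return common
        changes += 1                         # children disagree: count one change
        return left_set | right_set

    post_order(tree)
    return changes


def parsimony_length(tree, alignment):
    """Sum the Fitch count over every column of a gap-free toy alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_changes(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )


# Hypothetical four-taxon alignment, used only for illustration.
alignment = {"A": "AAG", "B": "AAG", "C": "ATC", "D": "ATC"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(parsimony_length(tree_1, alignment))  # 2
print(parsimony_length(tree_2, alignment))  # 4
```

Under the parsimony criterion, tree_1 would be preferred because it explains the same data with fewer changes. Real programs also have to search over tree topologies, which this sketch does not attempt.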

Fig: 1


Most Viewed Questions Of Bioinformatics

Part I: Learning about molecular phylogenies

1. What is the basic assumption underlying a molecular phylogeny?
2. Why must we distinguish between gene trees and species trees?
3. Why don't genes always evolve by a series of bifurcations (i.e., by a series of single base changes)?
4. What are the four steps to constructing a molecular phylogeny?
5. What is an orthologous sequence?
6. What is a paralogous sequence?
7. What is a xenologous sequence?
8. Which type of sequences should you use for a species phylogeny?
9. What is the difference between multiple sequence alignments to discover motifs, etc., vs for constructing phylogenies?
10. Why is Clustal W not a very good choice for constructing species phylogenies?
11. Please use the supplemental material on the links page to answer the following questions.
   • What is a phylogenetic tree composed of?
   • What is the difference between rooted and unrooted phylogenetic trees?
   • What are the two major groups of analyses used to examine phylogenetic relationships?
   • What is a paraphyletic grouping?
   • What happens if a multiple alignment is poor?
   • What is the best way to deal with parts of an alignment that are uncertain due to gaps?
   • What sorts of phylogenies are best constructed using DNA sequence alignments?
   • What sorts of phylogenies are best constructed using protein alignments?
   • What sorts of phylogenies are best constructed using ribosomal RNA sequence alignments?
   • What is a homoplasy?
   • Why can't we simply construct all possible trees, score each one, then pick the one with the best score?
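The last question above (why not score every possible tree) can be made concrete by counting trees. The number of distinct unrooted, strictly bifurcating trees for n labeled taxa is (2n - 5)!!, the product of the odd numbers up to 2n - 5. The short Python sketch below just evaluates that formula; the function name is an assumption made for this example.

```python
def num_unrooted_trees(n_taxa):
    """Count unrooted bifurcating trees for n labeled taxa: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (empty product for n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, num_unrooted_trees(n))
```

The count grows faster than exponentially, so exhaustive scoring quickly becomes impractical as taxa are added.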




Part 1: Smith-Waterman Algorithm

Instructions:

1. Copy the "STS protein query sequence" from the week 3 links page. (This is the one-letter code for the protein encoded by the resveratrol synthase gene; this enzyme catalyzes the final step in the resveratrol synthesis pathway.) Be sure to include the ">STS query" on the first line, or the program won't accept it.
2. Go to https://www.ebi.ac.uk/Tools/sss/ and choose SSEARCH, then "protein". This will do a Smith-Waterman local alignment on your protein sequence.
3. Choose the UniProtKB/TrEMBL database (under Step 1 on the page).
4. Paste the STS protein query sequence into the paste window.
5. Choose SSEARCH under step 3, then click "More options". Here you will find a number of parameters, including the substitution matrix. Search UniProt using three different substitution matrices:
   • BLOSUM 50
   • PAM 120
   • PAM 250
   Be patient, the calculation will take a while.

Questions:

1. What is the name and score of the best hit for each matrix?
2. What is the e-value of the best hit for each matrix?
3. Why do you think the results may have changed?
4. What do e-values mean and how do we interpret them?
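For orientation, here is a minimal Python sketch of the Smith-Waterman recurrence that a local alignment search is built on. It is deliberately simplified and is not how SSEARCH is implemented: it assumes a flat match/mismatch score in place of a BLOSUM or PAM matrix, a linear gap penalty, and it returns only the best local score. The function name and default scores are assumptions made for this example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between strings a and b (score only)."""
    prev = [0] * (len(b) + 1)    # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # local alignment: restart instead of going negative
                prev[j - 1] + s,     # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("GGACGTGG", "TTACGTTT"))  # 8, from the shared "ACGT"
```

Swapping the flat match/mismatch score for a lookup in a substitution matrix is the point at which the choice of BLOSUM 50, PAM 120, or PAM 250 would change the scores.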


Part II: Distance matrix methods

1. Answer the following questions:
   a. What is the general approach used by distance matrix methods to construct a phylogeny?
   b. What are the main differences between UPGMA and neighbor-joining methods?
2. Take your protein sequences from the links page and import them into MEGA.
3. Align them via Clustal W and save the alignment.
4. Use your alignment to construct UPGMA and neighbor-joining trees.
   a. What are the differences and similarities between the trees?
5. Repeat the above with an alignment based on MUSCLE instead of Clustal W.
   a. How does this change the results?
   b. Why do you think the results are different?
6. Include labeled screenshots of your different trees.
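Distance matrix methods start by reducing an alignment to a table of pairwise distances, which a clustering step such as UPGMA or neighbor-joining then turns into a tree. The Python sketch below shows only that first step, using the uncorrected p-distance (the proportion of differing sites). The function names, the choice to skip gapped columns, and the toy alignment are assumptions for illustration; MEGA offers several corrected distance models that are not reproduced here.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Symmetric nested-dict matrix of pairwise p-distances."""
    names = list(alignment)
    dist = {name: {name: 0.0} for name in names}
    for a, b in combinations(names, 2):
        d = p_distance(alignment[a], alignment[b])
        dist[a][b] = dist[b][a] = d
    return dist


# Hypothetical aligned sequences, used only for illustration.
toy_alignment = {"s1": "ACGT", "s2": "ACGA", "s3": "A-GA"}
toy_dist = distance_matrix(toy_alignment)
print(toy_dist["s1"]["s2"])  # 0.25
```

Because the tree is built from this table alone, any change in the alignment (for example Clustal W versus MUSCLE) that changes the distances can change the tree.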


Part V: Tree evaluation

1. What are the three basic ways to resample the data for tree-building?
2. What is jackknife resampling?
3. What is bootstrap resampling?
4. How does it differ from jackknife resampling?
5. Recreate one of your ML trees except use Bootstrap Resampling as a method of tree evaluation.
6. How did your tree change?
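To make the two resampling schemes concrete, here is a minimal Python sketch that resamples the columns of a toy alignment. Bootstrap draws as many columns as the original alignment, with replacement; jackknife keeps a subset of columns without replacement. The function names and the 50% keep fraction for the jackknife are assumptions made for this example, and the tree-building step that would follow each replicate is left out.

```python
import random


def _take_columns(alignment, cols):
    """Build a new alignment from the given column indices."""
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


def bootstrap_replicate(alignment, rng):
    """Sample n columns with replacement (same length as the original)."""
    n = len(next(iter(alignment.values())))
    cols = [rng.randrange(n) for _ in range(n)]
    return _take_columns(alignment, cols)


def jackknife_replicate(alignment, rng, keep_fraction=0.5):
    """Keep a random subset of columns, without replacement (shorter alignment)."""
    n = len(next(iter(alignment.values())))
    k = max(1, int(n * keep_fraction))
    cols = sorted(rng.sample(range(n), k))
    return _take_columns(alignment, cols)


# Hypothetical aligned sequences, used only for illustration.
toy = {"s1": "ACGTACGT", "s2": "ACGTTCGA"}
rng = random.Random(0)
print(bootstrap_replicate(toy, rng))
print(jackknife_replicate(toy, rng))
```

In a full analysis, a tree would be built from each replicate, and the support for a branch would be the fraction of replicate trees that contain it.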


3. RNAfold also uses partition-function methods (AKA thermodynamic ensemble methods). What is a partition function and how is it related to free energy? What are the advantages and disadvantages of calculating partition functions?
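For reference, the standard statistical-mechanics relations behind this question can be written as follows. The notation is an assumption made here, not taken from the course material: s ranges over all secondary structures the sequence can form, \Delta G(s) is the free energy of structure s, R is the gas constant, and T is the absolute temperature.

```latex
Z = \sum_{s} e^{-\Delta G(s)/RT},
\qquad
G_{\text{ensemble}} = -RT \ln Z,
\qquad
P(s) = \frac{e^{-\Delta G(s)/RT}}{Z}
```

The first expression is the partition function, the second gives the free energy of the whole ensemble, and the third gives the Boltzmann probability of any single structure, including the MFE structure.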


1. What is the difference between energy and free energy and what does it have to do with RNA structure? What structural elements are used in MFE algorithms to compute the free energy of an RNA secondary structure?


2. An MFE algorithm tries to find the secondary structure with the lowest free energy. What is the point of doing this? What is the significance of the MFE structure?


Part II: Needleman-Wunsch Algorithm

Instructions:

1. Go to https://www.ebi.ac.uk/Tools/sss/ and select GGSEARCH, then "protein". This will do a Needleman-Wunsch global alignment on your protein sequence.
2. Choose UniProtKB/TrEMBL as your database (step 1).
3. Paste in your STS protein query sequence (step 2).
4. Click "More options..." on step 3, choose BLOSUM50 as your scoring matrix, and then click Submit.

Questions:

1. What is the name and score of your top hit?
2. How do these results differ from what you got with the Smith-Waterman algorithm (SSEARCH)?
3. Why do you think the results differed?
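For orientation, here is a minimal Python sketch of the Needleman-Wunsch recurrence for global alignment. It is a simplification and not how GGSEARCH is implemented: it assumes a flat match/mismatch score in place of BLOSUM50, a linear gap penalty, and it returns only the final score. The function name and default scores are assumptions made for this example.

```python
def needleman_wunsch_score(a, b, match=2, mismatch=-1, gap=-2):
    """Global alignment score between strings a and b (score only)."""
    # First row: aligning an empty prefix of a against b costs one gap per residue.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)   # first column: gaps against a's prefix
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + s,     # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
        prev = curr
    return prev[-1]                   # both sequences must be aligned end to end


print(needleman_wunsch_score("ACGT", "ACG"))   # 4: three matches and one gap
print(needleman_wunsch_score("AAAA", "TTTT"))  # -4: four mismatches
```

Unlike a local alignment, the score is never reset to zero and is read from the final cell, so unrelated flanking regions lower the score instead of being ignored.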


Recombinant DNA

You have received a PCR product from an unknown organism that caused fever and a strange rash in a young child. The family of the child has two kittens, and the physician treating the child suspects an infection with Bartonella. Strangely, the growth characteristics do not match those of known Bartonella species. The clinical lab has amplified an 825 base pair fragment of the unknown organism's genome and has asked you to use the PCR product to investigate the identity of the unknown bacterium. Once you received the DNA, you decided to send the PCR product out for sequencing.

Bioinformatics

1) Using a BLASTN search, determine the top 3 most similar DNA sequences, and using BLASTX, determine the top 3 most similar proteins.
• What gene did the clinical lab PCR amplify?
• What is the function of the protein?
• Based on the results from this gene, what organism did you isolate? And what species is your organism most similar to?