Using the sequence below, perform blastn searches against the 'Core nucleotide database (core_nt)' database. For the 'Algorithm Parameters,' increase the 'Max target sequences' to 1000, uncheck the 'Short queries' adjustment and choose 'Word sizes' from 16 to 64 (a sparse sampling will be ok). For each query, note the number of 'Blast Hits' or matched sequences given in the first line of the 'Graphic Summary' on the results page. Explain the trend in your results based on what you know about the BLAST algorithm. Why didn't you get any matches with a word size of 64?
Results are presented in the figure below. The NCBI databases are updated daily, so don't pay much attention to the actual numbers; your results will vary. To limit the search space, BLAST seeds candidate regions by looking for words (short stretches of the query) that match a database sequence exactly. The shorter the word, the more likely it is to find an exact match, so smaller word sizes produce more seeds and more reported hits. Short word sizes therefore make the algorithm more sensitive, at the cost of longer run times. The query sequence is sixty letters long. With a word size of 64, which is greater than the query length, no complete word can be formed, so there are no initial seeds and thus no alignments.
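As a rough illustration of the seeding idea, the Python sketch below counts how many words of a given size from a query occur exactly in a subject sequence. The function name count_seed_words and both toy sequences are made up for this example, and real BLAST does much more than this (database indexing, seed extension, scoring), so treat it as a simplified model and not as the actual algorithm.

```python
def count_seed_words(query: str, subject: str, word_size: int) -> int:
    """Count query words of length word_size that occur exactly in subject.

    Hypothetical helper for illustration only; not part of any BLAST package.
    """
    # A word longer than either sequence cannot exist, so there are no seeds.
    if word_size > len(query) or word_size > len(subject):
        return 0

    # Collect every word of the requested size found in the subject.
    subject_words = {
        subject[i:i + word_size]
        for i in range(len(subject) - word_size + 1)
    }

    # Slide a window along the query and count words with an exact match.
    return sum(
        1
        for i in range(len(query) - word_size + 1)
        if query[i:i + word_size] in subject_words
    )


# Toy data (assumed): a 20-letter query and a subject that contains the
# query with a single mismatch in the middle, plus short flanking sequence.
query = "ACGTTGCAAGCTTAGGCTAA"
subject = "TTT" + "ACGTTGCAAG" + "G" + "TTAGGCTAA" + "CCC"

for word_size in (4, 8, 11, 21):
    seeds = count_seed_words(query, subject, word_size)
    print(f"word size {word_size:2d}: {seeds} seed word(s)")
```

With these toy sequences the count falls from 13 seed words at word size 4 to 5 at word size 8. It is zero at word size 11, because no exactly matching stretch that long survives the single mismatch, and zero at word size 21, because the word is longer than the query itself. The last case mirrors the word size of 64 against the sixty-letter query.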
A blank alignment matrix is available to help you lay out the solution.
Earlier this semester, fearing a too-liberal interpretation of Acts chapter 10, your roommate adopted a kosher diet. This has been challenging in the cafeteria, but you are beginning to become comfortable with the large supply of Hebrew National hot dogs in your dorm refrigerator. Recently, your roommate has been frequenting a particular local fast food restaurant. When you challenge this behavior, your roommate assures you that the restaurant's cheeseburgers are indeed kosher. Having access to the biochemistry laboratory, you decide to do a little investigation. From a used wrapper, you are able to isolate a small amount of protein, and from that you generate a small section of peptide sequence.
Is this sequence from an animal with a cloven hoof that chews its cud? Can you be sure?
Without reading every line of the results, it looks as though your roommate is correct. The sequence appears to be from the light chain of skeletal myosin (a reasonable source for hamburger). Many organisms have this sequence; Bos taurus (cow, clean) is in the list, but Sus scrofa (pig, unclean) is not. This alignment is produced by using blastp to query the non-redundant (nr) protein database with the peptide sequence. The default query parameters will be just fine for this search.