Free Republic
Browse · Search
News/Activism
Topics · Post Article

To: jennyp
Together, the two deleted segments harbour 1,243 non-coding sequences conserved between humans and rodents (more than 100 base pairs, 70% identity).

The point is that 1,243 sequences, each over 100 base pairs with 70% identity, is a humongous conserved "area". There is something called expectation when sequences are compared by BLAST. I'm pretty sure the expectation for this is close to zero. This is what I got when I took a 110-base sequence from Mus musculus cytochrome and compared it to Homo sapiens.

 Score = 38.2 bits (19), Expect = 3.9
 Identities = 19/19 (100%)
 Strand = Plus / Plus

                                 
Query: 86     atgggccttcttgctcagt 104
              |||||||||||||||||||
Sbjct: 223143 atgggccttcttgctcagt 223161

The expectation is low but the percent identities are 100%

68 posted on 02/24/2005 2:03:39 AM PST by AndrewC (Darwinian logic -- It is just-so if it is just-so)


To: AndrewC; Nebullis; Right Wing Professor
The point is that 1,243 sequences, each over 100 base pairs with 70% identity, is a humongous conserved "area". There is something called expectation when sequences are compared by BLAST. I'm pretty sure the expectation for this is close to zero. This is what I got when I took a 110-base sequence from Mus musculus cytochrome and compared it to Homo sapiens. ... The expectation is low but the percent identities are 100%
There are several errors with your logic.

1) From the BLAST FAQ page here's what they say about what "Expect" means:

Q: What is the Expect (E) value?

The Expect value (E) is a parameter that describes the number of hits one can "expect" to see just by chance when searching a database of a particular size. It decreases exponentially with the Score (S) that is assigned to a match between two sequences. Essentially, the E value describes the random background noise that exists for matches between sequences. For example, an E value of 1 assigned to a hit can be interpreted as meaning that in a database of the current size one might expect to see 1 match with a similar score simply by chance. This means that the lower the E value, or the closer it is to "0", the more "significant" the match is. However, keep in mind that searches with short sequences can be virtually identical and have a relatively high E value. This is because the calculation of the E value also takes into account the length of the Query sequence. This is because shorter sequences have a high probability of occurring in the database purely by chance. For more details please see the calculations in the BLAST Course.

The Expect value can also be used as a convenient way to create a significance threshold for reporting results. You can change the Expect value threshold on most main BLAST search pages. When the Expect value is increased from the default value of 10, a larger list with more low-scoring hits can be reported.

The E value has nothing at all to do with which species are being compared! It makes no judgements about how close two species' sequences "should" be to each other.
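
To make the FAQ's point concrete, here is a rough sketch of how E scales with score and search-space size. It assumes the standard bit-score relation E = m * n * 2^(-S'), where m is the query length, n is the database length, and S' is the bit score. The function name and the database size DB_LEN are made up for illustration (DB_LEN is not the size behind the search quoted above), and real BLAST uses "effective" lengths with edge corrections, so treat the numbers as ballpark only.

```python
def expect_value(bit_score, query_len, db_len):
    """Approximate E value from a bit score: E = m * n * 2**(-S').

    query_len (m) and db_len (n) are raw lengths here; BLAST itself
    uses slightly smaller "effective" lengths, which this sketch ignores.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical round-number database size, chosen only for illustration.
DB_LEN = 3_000_000_000

if __name__ == "__main__":
    # A short perfect match (low bit score) versus longer, higher-scoring hits.
    for bits in (38.2, 60.0, 100.0):
        e = expect_value(bits, query_len=110, db_len=DB_LEN)
        print(f"bit score {bits:6.1f} -> E ~ {e:.3g}")
```

Note that nothing in the formula refers to which species are involved: only the score and the sizes of the query and the database enter. Each extra bit of score halves E, and doubling the database doubles it, which is why a 19-base perfect match can still come out with an unimpressive E value.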

2) The cytochrome c gene, being an essential gene, should be highly conserved. I would expect a non-functional DNA stretch to be less conserved than the cytochrome c gene! And in fact it is: The knocked-out sequences were 70% identical - a full 30 percentage points less than your example! Your example contradicts your argument.

3) Your claim was that the mouse gene deserts were much more highly conserved WRT the homologous human gene deserts than we should expect if they were truly junk. You're making a judgement based on the average overall genetic distance between mice & man, not just one gene (I hope). And we still don't know what the "official" overall % figure is for that.

Pinging the only two people I know of who might know the real figures...

69 posted on 02/24/2005 1:33:30 PM PST by jennyp (WHAT I'M READING NOW: Debugging Windows Programs by McKay & Woodring)


FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson