Posted on 06/04/2004 8:08:18 AM PDT by Michael_Michaelangelo
It is not often that the audience at a scientific meeting gasps in amazement during a talk. But that is what happened recently when researchers revealed that they had deleted huge chunks of the genome of mice without it making any discernable difference to the animals.
The result is totally unexpected because the deleted sequences included so-called "conserved regions" thought to have important functions.
All DNA tends to acquire random mutations, but if these occur in a region that has an important function, individuals will not survive. Key sequences should thus remain virtually unchanged, even between species. So by comparing the genomes of different species and looking for regions that are conserved, geneticists hope to pick out those that have an important function.
It was assumed that most conserved sequences would consist of genes coding for proteins. But an unexpected finding when the human and mouse genomes were compared was that there are actually more conserved sequences within the deserts of junk DNA, which does not code for proteins.
The thinking has been that these conserved, non-coding sequences must, like genes, be there for a reason. And indeed, one group has shown that some conserved regions seem to affect the expression of nearby genes.
To find out the function of some of these highly conserved non-protein-coding regions in mammals, Edward Rubin's team at the Lawrence Berkeley National Laboratory in California deleted from mice two huge regions of junk DNA containing nearly 1000 highly conserved sequences shared between humans and mice.
One of the chunks was 1.6 million DNA bases long, the other one was over 800,000 bases long. The researchers expected the mice to exhibit various problems as a result of the deletions.
Yet the mice were virtually indistinguishable from normal mice in every characteristic they measured, including growth, metabolic functions, lifespan and overall development. "We were quite amazed," says Rubin, who presented the findings at a recent meeting of the Cold Spring Harbor Laboratory in New York.
He thinks it is pretty clear that these sequences have no major role in growth and development. "There has been a circular argument that if it's conserved it has activity."
(Excerpt) Read more at newscientist.com ...
Well first, they are not genes, since they do not code for proteins. Secondly, although there may be no pressure to be deleted, there should certainly be no pressure to correct any mutations occurring on them. They are pristine. No changes. And the analogy is "good" in that programs don't mutate on their own. Darwinian evolution requires mutation.
> They are pristine. No changes.
I do not see that in the article.
> And the analogy is "good" in that programs don't mutate on their own.
Some do. Those meant to emulate the genetic process mutate quite nicely on their own.
Evolution doesn't force conclusions; it directs speculation, which in turn suggests lines of research.
The question is, what kind of research does ID suggest, since by definition anything at all can be fit into the paradigm of design? If a feature is adaptive, it must have been designed; if it is neutral, it is part of the designer's toolbox; if it is maladaptive, it is the result of The Fall.
What questions could ID possibly ask that would lead to different lines of research from mainstream science?
And mutation most certainly does occur. But you are drawing conclusions from a tiny fraction of a percentage of "neutral" code which has not been affected by replication errors.
In the absence of a testable hypothesis, you cannot draw conclusions.
In order to make even a wild guess about the probability of this being adventitious, you would need to draw up a table of segment lengths of conserved code and see if the lengths can be placed in a normal distribution.
They are talking about these regions:
Analysis Uncovers Critical Stretches of Human Genome
Hundreds of stretches of DNA may be so critical to life's machinery that they have been ultra-conserved throughout hundreds of millions of years of evolution. Researchers have found precisely the same sequences in the genomes of humans, rats, and mice; sequences that are 95 to 99 percent identical to these can be found in the chicken and dog genomes, as well.
> Those meant to emulate the genetic process mutate quite nicely on their own.
No, the data mutates, the actual programs remain the same.
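To make the distinction in this exchange concrete, here is a minimal sketch of a toy genetic algorithm. It is an illustration only, not any particular program either poster had in mind: the bit-string genome, the "count the ones" fitness function and all parameter values are assumptions. In this common design the genome is data that gets mutated and selected, while the functions doing the mutating and selecting stay fixed.

```python
import random


def fitness(genome):
    # Toy objective (an assumption for illustration): count the 1 bits.
    return sum(genome)


def mutate(genome, rate, rng):
    # Flip each bit independently with probability `rate`.
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def evolve(length=32, generations=200, rate=0.05, seed=0):
    rng = random.Random(seed)
    genome = [rng.randint(0, 1) for _ in range(length)]
    history = [fitness(genome)]
    for _ in range(generations):
        child = mutate(genome, rate, rng)
        # Selection: keep the child only if it is at least as fit as the parent.
        if fitness(child) >= fitness(genome):
            genome = child
        history.append(fitness(genome))
    return genome, history


final_genome, history = evolve()
print("fitness:", history[0], "->", history[-1])
```

Systems that mutate the program text itself (genetic programming, for example) also exist, so which description fits depends on the system being discussed.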
That has been done. Have you ever used BLAST?
PSI-BLAST is an iterative program to search a database for proteins with distant similarity to a query sequence. We investigated over a dozen modifications to the methods used in PSI-BLAST, with the goal of improving accuracy in finding true positive matches. To evaluate performance we used a set of 103 queries for which the true positives in yeast had been annotated by human experts, and a popular measure of retrieval accuracy (ROC) that can be normalized to take on values between 0 (worst) and 1 (best). The modifications we consider novel improve the ROC score from 0.758 +/- 0.005 to 0.895 +/- 0.003. This does not include the benefits from four modifications we included in the 'baseline' version, even though they were not implemented in PSI-BLAST version 2.0. The improvement in accuracy was confirmed on a small second test set. This test involved analyzing three protein families with curated lists of true positives from the non-redundant protein database. The modification that accounts for the majority of the improvement is the use, for each database sequence, of a position-specific scoring system tuned to that sequence's amino acid composition. The use of composition-based statistics is particularly beneficial for large-scale automated applications of PSI-BLAST.
Could you translate that into English and show me where to find the spread of conserved code lengths in simple chart form?
Click on the type of query you would like to do. Then put in the sequence you would like to check. After the process is completed, a listing of matches found in the searched databases will be given to you. In that data is a number describing the probability of finding a random sequence in the database. Here is one for a 300 base string.
The probability is 10^-167 with 0 mutations.
>gi|5729841|ref|NM_006708.1| Homo sapiens glyoxalase I (GLO1), mRNA
Length = 1993
Score = 595 bits (300), Expect = e-167
Identities = 300/300 (100%)
Strand = Plus / Plus

Query: 1   ctagttaaggcggcacagggccgaggcgtagtgtgggtgactcctccgttccttgggtcc 60
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1   ctagttaaggcggcacagggccgaggcgtagtgtgggtgactcctccgttccttgggtcc 60

Query: 61  cgtcgtctgtgatactgcagttcagccatggcagaaccgcagcccccgtccggcggcctc 120
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 61  cgtcgtctgtgatactgcagttcagccatggcagaaccgcagcccccgtccggcggcctc 120

Query: 121 acggacgaggccgccctcagttgctgctccgacgcggaccccagtaccaaggattttcta 180
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 121 acggacgaggccgccctcagttgctgctccgacgcggaccccagtaccaaggattttcta 180

Query: 181 ttgcagcagaccatgctacgagtgaaggatcctaagaagtcactggatttttatactaga 240
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 181 ttgcagcagaccatgctacgagtgaaggatcctaagaagtcactggatttttatactaga 240

Query: 241 gttcttggaatgacgctaatccaaaaatgtgattttcccattatgaagttttcactctac 300
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 241 gttcttggaatgacgctaatccaaaaatgtgattttcccattatgaagttttcactctac 300
Here is the result for the mouse compared to the human: 10^-26 with 27 mutations.
>gi|26327652|dbj|AK031832.1| Mus musculus adult male medulla oblongata cDNA, RIKEN full-length enriched library, clone:6330414G20 product:GLYOXALASE I homolog [Homo sapiens], full insert sequence
Length = 959
Score = 127 bits (64), Expect = 1e-26
Identities = 145/172 (84%)
Strand = Plus / Plus

Query: 83  cagccatggcagaaccgcagcccccgtccggcggcctcacggacgaggccgccctcagtt 142
           ||||||||||||| || |||||  ||||| | |||||||| || ||| ||||  |||| |
Sbjct: 46  cagccatggcagagccacagccggcgtccagtggcctcactgatgagaccgctttcagct 105

Query: 143 gctgctccgacgcggaccccagtaccaaggattttctattgcagcagaccatgctacgag 202
           ||||||||||  | ||||| || ||||||||||||||| ||||||| || |||||| ||
Sbjct: 106 gctgctccgatccagaccctagcaccaaggattttctactgcagcaaacgatgctaagaa 165

Query: 203 tgaaggatcctaagaagtcactggatttttatactagagttcttggaatgac 254
           | ||||||||||||||||| |||||||||||||| || ||||||||| ||||
Sbjct: 166 ttaaggatcctaagaagtccctggatttttatacgagggttcttggactgac 217
I don't think you are responding to my question.
Here's my question in another form:
The article speaks of "One of the chunks was 1.6 million DNA bases long, the other one was over 800,000 bases long."
There is an implication that there are other chunks of varying length. Presumably there are chunks of length 1, 2, 3, 4, 5, and so forth. Perhaps there are constraints limiting the lengths to multiples of two or four or whatever, but there must be conserved chunks of various lengths.
So what I am asking is, what is the distribution of lengths? How many 1s, how many twos, and so forth.
I'm responding, but don't seem to be hitting a "sweet" spot. Those are probabilities for finding matches of that length with those characteristics. In other words, pristine DNA chunks in man and mouse of length 300 with no mutations would be expected to "happen" in one out of 10^167 "times" (a 1 followed by 167 zeroes).
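For anyone following the numbers: the Expect value BLAST reports is the number of hits of at least that score expected by chance when searching random sequences, so it is computed under a random-sequence model. A minimal sketch of the standard relation E = m * n * 2^(-S'), with bit score S', query length m and database length n, is below. The database length is an assumption (the thread does not state it), and real BLAST uses edge-corrected effective lengths, so this only approximates the posted values.

```python
import math

# ASSUMPTION: total database length in bases. The thread does not say which
# database or size was searched; 1e10 is a hypothetical round figure.
ASSUMED_DB_LEN = 1e10


def log10_expect(bit_score, query_len, db_len):
    # E = m * n * 2**(-S'); computed in log10 because E underflows a float
    # only for far larger scores, but logs are easier to read and compare.
    return math.log10(query_len) + math.log10(db_len) - bit_score * math.log10(2)


# Bit scores and query length taken from the two BLAST results in the thread.
human_self = log10_expect(595, 300, ASSUMED_DB_LEN)
mouse_hit = log10_expect(127, 300, ASSUMED_DB_LEN)

# For comparison: log10 of the chance that 300 independent, uniformly random
# bases reproduce one specific 300-base string (4 choices per position).
naive_exact_match = -300 * math.log10(4)

print(f"log10(E), 595 bits: {human_self:.1f}")
print(f"log10(E), 127 bits: {mouse_hit:.1f}")
print(f"log10(4**-300):     {naive_exact_match:.1f}")
```

Both figures describe chance under independent random draws. They say how surprising a match would be with no conserving process at work, which is the assumption being argued over in the following posts.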
Your method of calculating probabilities makes hidden assumptions about the mechanisms involved. Specifically it assumes there are no mechanisms for conserving code, or that there are no currently unknown processes involved.
Again, I am asking for data, not presuppositions. What is the actual distribution of conserved code lengths?
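As a small-scale illustration of the kind of table being asked for, the sketch below tabulates the lengths of perfectly identical runs in the 172-base human/mouse alignment posted earlier in the thread. This is an assumption-laden toy: it covers one short coding-region alignment, not the genome-wide conserved non-coding sequences from the article, and the function name `identical_run_lengths` is made up for this example.

```python
from collections import Counter
from itertools import groupby

# Aligned bases copied from the BLAST output in the thread:
# human query positions 83-254, mouse subject positions 46-217 (no gaps).
HUMAN = (
    "cagccatggcagaaccgcagcccccgtccggcggcctcacggacgaggccgccctcagtt"
    "gctgctccgacgcggaccccagtaccaaggattttctattgcagcagaccatgctacgag"
    "tgaaggatcctaagaagtcactggatttttatactagagttcttggaatgac"
)
MOUSE = (
    "cagccatggcagagccacagccggcgtccagtggcctcactgatgagaccgctttcagct"
    "gctgctccgatccagaccctagcaccaaggattttctactgcagcaaacgatgctaagaa"
    "ttaaggatcctaagaagtccctggatttttatacgagggttcttggactgac"
)


def identical_run_lengths(a, b):
    """Return the lengths of maximal runs where two aligned strings agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = [x == y for x, y in zip(a, b)]
    # groupby yields consecutive stretches of equal values; keep the True ones.
    return [sum(1 for _ in grp) for same, grp in groupby(matches) if same]


runs = identical_run_lengths(HUMAN, MOUSE)
table = Counter(runs)
for length in sorted(table):
    print(f"{length:3d} bases: {table[length]} run(s)")
```

Running the same tabulation over whole-genome alignments would give the distribution of conserved lengths directly, which could then be compared against whatever a given model predicts.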
It makes no such assumptions. I did not dream this method up. Biologists throughout the world use it to make judgements, even about evolution.
This is simply insane. When you use the word "happen" you are making assumptions about mechanisms, i.e., you are assuming that every position and every copy error is a roll of the dice. You are, in effect, asserting that there are no mechanisms for conserving sequences.
This assumption could be put to the test by looking at actual data. Now since we know that several long conserved sequences exist, your "no mechanism" hypothesis is highly improbable. An honest researcher would begin devoting energy towards finding one or more mechanisms.
No, I am making no assumptions about mechanisms. I am merely describing the fact that the sequences are conserved. They are pristine. Got it? Pristine!!!! That means they are "preserved" 100%. How can we describe the chances of them being conserved by "accident"? We use tools such as BLAST. The results show that "accident" is not a mechanism.
The reason I use the word insane is that things that are not accidental are necessarily the result of a regular process or mechanism, yet you deny assuming a mechanism. The fact that a mechanism is unknown or not understood does not make it nonexistent. It makes it an opportunity for science.
Well, we're getting somewhere. I have only ruled out "accident", which is what you seem to say I couldn't do:
> In order to make even a wild guess about the probability of this being adventitious,
RMNS (random mutation and natural selection) is ruled out as a mechanism, because removing all of the genome in question had no discernable effect. There is nothing to select (thus the gasps).