Posted on 03/05/2002 9:45:44 PM PST by Southack
This is part two of the famous "Million Monkeys Typing On Keyboards for a Million Years Could Produce The Works of Shakespeare" - Debunked Mathematically.
For the Thread that inadvertently kick-started these mathematical discussions, Click Here
For the Original math thread, Click Here
Discarding a failed sequence is a courtesy to Evolution, in order to err on the side of probability. Otherwise, if the first character out of any of the data sources was in error (or as soon as the first error occurred in any source), then the rest of the entire output string would be invalid.
Rather than discard an entire source of data, the math presumes to restart searching for the valid sequence at the first point after an error.
Further, the math in this article deals with sequencing. At no point does it presume that the data all "forms at once".
It does not matter if groups of coins are flipped together, or if one coin is flipped multiple times. What does matter is the sequence of the resulting output, and for that, the math in this thread calculates it perfectly.
I refuted this in post 373. Your suggestion would be a waste of time because unless the error was in the first link of the chain, adding a single link to the end will not matter at all. According to Jay L. Devore on page 92 of Probability and Statistics for Engineering and the Sciences, 4th ed. (1995, Wadsworth, Inc.),
What does matter is the sequence of the resulting output, and for that, the math in this thread calculates it perfectly.
You are correct on the second part. Your analogy fails on the first. Who has ever suggested discussing an unending chain of base pairs? That would be silly. The math is valid only for discrete trials. The author says so. I have shown you where. And any statistician will tell you so. Not that it really matters because
Period, the end. Until you recognize that I'm wasting my time.
What part of sequencing do you not understand?
The author's math is for a sequence of data. In this case, the first sentence of Shakespeare's Hamlet: "To be or not to be, that is the question."
What produces our chaotic output? Metaphorical monkeys banging on keyboards.
Where do we search for the desired first sentence above? We search in every sequential part of every output string produced in our example.
Are there intermediate steps? In the creation of this data, yes. The monkeys bang out characters one after another rather than all at once.
Our math is literally looking at every linear string of 41 characters in all of our output. If the first character isn't the desired first character, then we look at the second. If the second character isn't the desired character, then we start looking at the third. If the first 41 characters don't contain the desired sequence, then our math is looking at the next 41 and so on for all possible linear sequential combinations.
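The search described above can be sketched as a sliding window that tests every 41-character stretch of the output, one start position at a time. This is only an illustration of the idea, not the author's own procedure; the function name `find_all` is made up for the example.

```python
# The target sentence quoted in the thread (41 characters).
TARGET = "To be or not to be, that is the question."


def find_all(output: str, target: str = TARGET) -> list[int]:
    """Return every start index at which `target` appears in `output`.

    Every window of len(target) characters is examined, so a match that
    straddles what would otherwise be two separate "blocks" is not missed.
    """
    n = len(target)
    # range() is empty when the output is shorter than the target.
    return [i for i in range(len(output) - n + 1) if output[i:i + n] == target]


if __name__ == "__main__":
    sample = "xx" + TARGET + "yy"
    print(len(TARGET), find_all(sample))
```

Stepping the window by one character, rather than by 41, is what makes the search cover every linear sequence in the output.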
Does that miss anything? No.
You don't skip intermediate steps in that math. All intermediate steps are accounted for, mathematically.
That is a fundamental misreading of the author's work. The monkeys type letters after letters, in intermediate steps over time. They do not type all of their output in a single smash of their keyboards.
Are there intermediate steps? In the creation of this data, yes. The monkeys bang out characters one after another rather than all at once.
You are being deliberately obtuse. Whether it takes a monkey a week to type 41 characters or 5 seconds, the data is examined only as a unit. But that is not the only way that your sequence can form.
What if there was another way? Say, one monkey types "question." right now. Two days later, another monkey types "To be or n" while a third monkey manages to get the sequence "ot to be, tha". A week down the road, the first monkey types out "t is the ".
These fragments are cast out into the big wide world and wander around and bang into each other. When the ends align, the fragments link up and your sentence is complete.
Those are the intermediate steps that are not accounted for. The author assumes that each trial is exactly the right length every time because the calculations cannot work for anything else. That is why the application of these calculations is incorrect, and your article proves exactly nothing.
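To make the fragment scenario concrete, here is a minimal sketch that takes the four fragments quoted above and checks which orderings of them spell the sentence. It only shows that the pieces fit together in one particular order; it says nothing about how likely such fragments are to form or meet. The function name `orderings_that_spell` is hypothetical.

```python
from itertools import permutations

TARGET = "To be or not to be, that is the question."

# The four fragments from the post, in the order they were "typed".
FRAGMENTS = ["question.", "To be or n", "ot to be, tha", "t is the "]


def orderings_that_spell(fragments, target):
    """Return every ordering of `fragments` whose concatenation equals `target`."""
    return [order for order in permutations(fragments) if "".join(order) == target]


matches = orderings_that_spell(FRAGMENTS, TARGET)

if __name__ == "__main__":
    print(len(list(permutations(FRAGMENTS))), "orderings tried,", len(matches), "match")
```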
Then you don't understand that the mathematical probability remains the same whether you flip ten coins in one group, 15 coins in another, and 16 coins in still another group for a total of 3 intermediate steps, or one coin 41 times.
The final probability for randomly hitting a pre-determined outcome of 41 units is precisely the same either way.
That's what the math addresses: the final probability of data in a chaotic, unintelligent environment correctly sequencing itself. Except in this case, we're dealing with much lower probabilities of getting each correct character out of far more possibilities than a mere two sided coin toss...
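The coin claim can be checked with exact arithmetic. The sketch below assumes fair coins and independent flips (assumptions of the example, and the very point under dispute in this thread). Under those assumptions, groups of 10, 15, and 16 flips give the same probability of one pre-determined 41-flip outcome as 41 single flips.

```python
from fractions import Fraction


def p_exact_sequence(n_flips: int, p_each: Fraction = Fraction(1, 2)) -> Fraction:
    """Probability of one specific outcome of n independent flips.

    `p_each` is the chance of getting each individual flip right;
    1/2 models a fair coin.
    """
    return p_each ** n_flips


# One coin flipped 41 times.
one_at_a_time = p_exact_sequence(41)

# Three groups of 10, 15, and 16 coins (10 + 15 + 16 = 41).
grouped = p_exact_sequence(10) * p_exact_sequence(15) * p_exact_sequence(16)

if __name__ == "__main__":
    print(one_at_a_time == grouped, one_at_a_time)
```

The equality is just the exponent rule: (1/2)^10 × (1/2)^15 × (1/2)^16 = (1/2)^41. It depends on every flip being independent of the others.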
Why does the author assume that there can be only a single monkey generating each fragment, and that that fragment cannot persist beyond a single trial?
Probably because that's an easy way to visualize an abstract concept. Remember, however, that when we are talking about random data linking up in sequence to form a useful output from a chaotic environment, the mathematical odds of a link happening to any existing stream of data, valid or not, remain the same.
So even though the author doesn't describe it in your manner, the mathematical probability remains the same for any given sequence of data self-forming even if we are considering the output of various sources interacting with each other.
The odds of a valid segment from one source combining with either the same source's future output or another source's current/past output remain identical.
This is because the author's math is actually taking into account the ENTIRE sum of output from all sources over the entire 17 Billion years, peering into that data pool, and accurately calculating the probability of our desired sequence existing somewhere therein. Because there is no intelligence in this theoretical system to decide to keep "valid" sub-parts from linking with invalid sub-strings of data, the mathematical odds remain precisely the same for finding the final desired output sequence no matter how much the unintelligent data mixes with the output from any and all sources.
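One way to put a number on "somewhere in the pool" is to count the expected matches across every window of the pooled output. The sketch below assumes each character is uniform and independent, and the pool size and alphabet size in the usage lines are hypothetical placeholders, not figures from the article. The expected count is also an upper bound on the probability of at least one match (the union bound). It works in base-10 logarithms because the raw numbers underflow ordinary floats.

```python
import math


def log10_expected_hits(total_chars: int, target_len: int, alphabet_size: int) -> float:
    """log10 of the expected number of times a fixed target appears.

    Assumes every character is drawn uniformly and independently from
    `alphabet_size` symbols, and that total_chars >= target_len.
    Expected hits = (number of windows) * (1 / alphabet_size) ** target_len.
    """
    windows = total_chars - target_len + 1
    return math.log10(windows) - target_len * math.log10(alphabet_size)


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    pool_size = 10 ** 30      # total characters typed by all sources
    keys = 32                 # distinct keys on the keyboard
    print(log10_expected_hits(pool_size, 41, keys))
```

A result far below zero means the expected count, and therefore the chance of any match, is far below one under these assumptions.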
Hey, it's just math.
Clearly you haven't given me credit for admitting my mistakes and praising my opponents' good points on numerous FR threads.
Perhaps a more intellectually honest way to phrase your above attack could have been "Math cannot be reasoned with..."
Which is true in the sense that math cares not about politics or feelings. Math has a truth all its own, and you can't sing a sad song to get math to "reason" with you or compromise with you.
In this thread, the math clearly outlines a distinct probability/improbability. Numerous posters have flailed about on this thread pretending that the math was "refuted" or wrong or whatever, but no one has been able to challenge the math with math (a sure sign that the arguments against said math are political or religious in nature, not mathematical).
Demonstrate with math (and not with poor man's humor) where the flaw resides in this thread, and you'll find a ready applause from me.
Fail to do so with math, and expect to receive my continued skepticism of your claims.
Okay, so I lied. I don't know why, but I'm going to offer one more response.
The math has not been challenged. No one has tried. The math in the article is correct insofar as the formulas given are legit and the results of the calculations are properly reported. What the posters who stand opposed to the article have argued (successfully in most cases) is that the math is misapplied-- that the math correctly describes the probability of a system almost, but not quite, entirely unlike those systems found in nature. The math is correct IF chemical reactions are entirely random and IF each chain is exactly as likely as every other chain and IF each chain is examined as a discrete, independent unit and IF said chains are absolutely unaffected by every single previous event.
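The weight those IFs carry can be seen in a toy comparison. The sketch below computes the probability of one short sequence twice: once with every symbol uniform and independent, and once with each symbol depending on the one before it (a first-order Markov chain). The two-letter alphabet and the transition numbers are invented for illustration and are not a model of any real chemistry.

```python
from fractions import Fraction


def p_iid_uniform(seq: str, alphabet_size: int) -> Fraction:
    """Probability of `seq` when every symbol is uniform and independent."""
    return Fraction(1, alphabet_size) ** len(seq)


def p_markov(seq: str, start: dict, trans: dict) -> Fraction:
    """Probability of a non-empty `seq` when each symbol depends on the previous one."""
    p = start[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= trans[prev][cur]
    return p


# Made-up numbers: after an "A" a "B" is strongly favored, and vice versa.
START = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
TRANS = {
    "A": {"A": Fraction(1, 10), "B": Fraction(9, 10)},
    "B": {"A": Fraction(9, 10), "B": Fraction(1, 10)},
}

if __name__ == "__main__":
    print(p_iid_uniform("ABAB", 2), p_markov("ABAB", START, TRANS))
```

With these invented numbers the same four-symbol sequence is several times more likely under the dependent model than under the uniform one, which is why the independence assumption matters to the result.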
The author must make these assumptions (and several others, many of which have already been identified elsewhere on this thread) in order to claim that his calculations apply. But his model is far too simplistic and simply cannot account for a continuous sequence of data. He says "Look at how big these numbers are, it could never happen," and provides some fairly standard discrete probability formulas, ignoring entirely the fact that we do not live in a discrete world.
You have been more gracious than most, and I will readily admit that. But for now we are talking past each other. My posts, at least, are for the lurkers, anyway. Neither of us is going to persuade the other, but I respect your willingness to stay above personal attack and insult; I have attempted (mostly successfully, I hope) to offer you the same courtesy.
Best,
Condorman
And that is the point upon which we disagree. You are arguing two sides. On the one hand, you say that the math doesn't apply to chemical reactions because said reactions aren't random (and to that point I generally agree), but then you turn around and say that the author's math doesn't account or apply to data (and on that point, I certainly disagree).
Data is different from chemical reactions. We can have chemicals react all day long, but that doesn't mean that they store data.
Of all the chemical structures in the world, DNA stood alone in storing data until Man came along and created paintings and later writing.
But it wasn't the chemicals that comprise DNA that were unique. Those chemicals are found in plenty of other compounds in which data is NOT stored, in fact. Nor is it the fact that those particular acids and bases reacted or linked with each other, as they do that in other compounds as well.
No, what makes DNA so intriguing is that those chemicals are sequenced in a manner that accurately stores data (and then going beyond the math in this thread, that DNA processes said stored data as well as replicates itself).
And the author's math is entirely valid for calculating the probability of data managing to sequence itself without intelligent intervention.
Whether we are calculating the probability / improbability of useful data sequencing on your hard drive after a lightning strike (or two or 17 Billion), or calculating the odds of data sequencing itself into DNA that's capable of creating a sustainable life form, or even calculating the odds of monkeys typing out Shakespeare's Hamlet, the math in this thread as well as the First Math Proof thread is equally valid.
This is because in every event mentioned above, it is the sequence of data, not the mere odds of valid chemical reactions occurring, that we are calculating.
Are you VERY sure of that? What right do you have to limit the travels of Jesus?
Okay, maybe you will not find a sandal print on the moon, but remember, this is a Guy that could walk on water.
There are two types of mathematical arguments: right ones and wrong ones. Which category do you think mathematical arguments "of a sort" fall into?
There is a third type of mathematical argument... not proven.
Such is the case with random mutagenesis over time as filtered by natural selection.
Lurker to Condorman: Thank You.
He was Russian, Gore's American, and claiming to know what the Internet is and what runs it. Imagine hiring for a network engineer (American!) and he keeps referring to a "TCP/PI Stick."
I'll send him the link and see if he has time.
Now you're catching on. There is MORE than just a mispronunciation needed in order to show a lack of knowledge of a technical subject. Clearly semantics alone won't do it.
That was my point all along...
Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.