Posted on 06/15/2003 10:36:08 AM PDT by Alamo-Girl
1. The dependence on U, as there is no unique way of specifying this.
Yes, that is one problem that has been bothering me about it. The rules are algorithms, and the 'shortness' of an algorithm and its ability to produce a result depend heavily on the abilities of the programmer, and yes, even on the intuitive abilities of the person writing it. It seems there is something highly subjective in the formula.
2. Random sequences have maximum complexity
Yes, that bothers me the most as I pointed out before regarding the content of information. Someone may spend his life studying how complex the grains of sand on a beach are, but that will be of little use. On the other hand spending time studying the DNA code can have great benefits.
BTW - we discovered DNA some 50 years ago and we still have barely begun to uncover its secrets.
You really are funny! Could you please freepmail me my next week's postings, my fingers are getting tired.
FWIW I do think it's possible albeit rare for human beings to lay aside their prejudices and deal with reality as it is. We are, I think, in the end stages of a great clash in worldviews with the mechanistic losing. I think our principal problem is myopia; i.e. it's hard to know reality from the perspective of a whirling speck of dust in a vast cosmos and as a conscious "animal" thereon. I won't belabor the point beyond saying that our best scientific explanations are vastly deficient as to qualia and that I believe we will ultimately have to "go there" if we are to move forward in our understanding of reality. The physicists have been very helpful but have turned a blind or too timid eye toward the implications of their findings. As I said to bb privately, I don't see how any physicist with a modicum of understanding could be an atheist. As my very close friend, Owen Fairfield, once crudely put it: "Quantity is just another quality".
Indeed, I found those two problems with Kolmogorov Complexity to be quite illuminating. I also found it interesting that the author leaned to Shannon. In fact, I found that so frequently I decided to boldface it (LOL!)
I do hope you get a chance to read over the 101 page pdf file (or at least the first 30 pages or so.) It is fascinating and shows an attempt to address many of our natural objections, i.e. that the genetic code is more like language than meaningless bit streams, that phrases carry meaning (intelligence), and that this ought to be the basis for determining complexity/entropy.
It seems to me that gore3000's objection is much the same as mine.
Indeed, an infinite binary set will contain all possible combinations. That goes with the concept of infinity. But Kolmogorov Complexity et al cannot be computed unless the Turing Machine halts. IOW, the infinity of chance tells us nothing useful, in particular nothing useful concerning evolutionary biology.
Because there was a beginning, the timeline to now is finite. Likewise, the environment was finite over time. Moreover, the fossil record gives us some benchmarks on the timeline.
So the question remains as to what is actually possible, from a mathematics/information theory point of view.
Yockey says no to abiogenesis based on Shannon. A prize is offered for a theory of abiogenesis that would work.
And beyond abiogenesis, all speculations of autonomous self-organizing complexity must survive the same rigor.
This, I believe, is the same point gore3000 is making, i.e. infinity cannot answer the question of to what extent evolution is mathematically possible.
BTW, I've briefly scanned the Grandpierre article betty boop offered to get the flavor of it and will now go over it in more detail. It does look rather interesting as he puts emphasis on consciousness and emotion - as a force with regard to the universe/reality.
Also included is a little help (emphasis mine) for determining an information content for an infinite bit string (as was suggested by Doctor Stochastic.)
Ergodic theory can be described as the statistical and qualitative behavior of measurable group and semigroup actions on measure spaces. The group is most commonly N, R, R+, and Z
A means of describing how one state develops into another state over the course of time. Technically, a dynamical system is a smooth action of the reals or the integers on another object (usually a manifold)...
The deterministically chaotic systems constitute a true subset of the set of untreatable problems. Moreover, unique connections between dynamical system properties--e.g., nonlinearity, non-integrability, and dynamical instability on the one hand, and algorithmic complexity and limited long-time computability on the other hand--cannot be established. There are effectively treatable (i.e., algorithmically computable) systems which do not admit of a closed-form solution as a function of time (e.g., transcendental equations); there are nonlinear systems which are integrable (e.g., solitons); there are linear systems which are uncomputable, or untreatable (Leiber, 1996b, pp. 38-40); in the framework of mathematical ergodic theory it has been shown that algorithmic Kolmogorov complexity is not synonymous with dynamical instability (or "deterministic randomness"; Batterman, 1996); also, exponential instability is compatible with well-posedness, i.e., with the existence of a closed-form solution. In summary, for the instability hierarchy of dynamical systems there is no comprehensive characterization available in terms of computability concepts.
Juergen Schmidhuber's home page - Artificial Intelligence - New AI - Recurrent neural networks
Kolmogorov's complexity K(x) of a bitstring x is the length of the shortest program that computes x and halts. Solomonoff's algorithmic probability of x is the probability of guessing a program for x. Chaitin's Omega is the halting probability of a Turing machine with random input (Omega is known as the "number of wisdom" because it compactly encodes all mathematical truth). All of this I generalized to non-halting but converging programs. This led to the shortest possible formal descriptions and to non-enumerable but limit-computable measures and Super Omegas, and even has consequences for computable universes and optimal inductive inference.
Generalized Algorithmic Information - Generalized Algorithmic Probability - Super Omegas
The traditional theory of Kolmogorov complexity and algorithmic probability focuses on monotone Turing machines with one-way write-only output tape. This naturally leads to the universal enumerable Solomonoff-Levin measure. Here we introduce more general, nonenumerable but cumulatively enumerable measures (CEMs) derived from Turing machines with lexicographically nondecreasing output and random input, and even more general approximable measures and distributions computable in the limit. We obtain a natural hierarchy of generalizations of algorithmic probability and Kolmogorov complexity, suggesting that the "true" information content of some (possibly infinite) bitstring x is the size of the shortest nonhalting program that converges to x and nothing but x on a Turing machine that can edit its previous outputs. Among other things we show that there are 'Super Omegas' computable in the limit yet more random than Chaitin's "number of wisdom" Omega, that any approximable measure of x is small for any x lacking a short description, that there is no universal approximable distribution, that there is a universal CEM, and that any nonenumerable CEM of x is small for any x lacking a short enumerating program. We briefly mention consequences for universes sampled from such priors.
I have started reading it, but it will take a few days. I did find it interesting that an analysis of Shakespeare's works breaks all the rules which apply so well in other works! Genius always breaks the rules.
It should be noted that the only examples of self-organization given by evolutionists are of chemical bonding. Now there is a difference between this and the organization of DNA. First of all, unlike chemical bonding, which does not occur between all materials, DNA is arranged in all possible combinations. Second, the sequences in chemical bonding are not varied, but repeat themselves over and over. DNA, to do its work of coding, by necessity cannot be a series of endlessly repeating bonds. It needs to be malleable in order to code successfully (and in fact it is not an endlessly repeating chain but a varied one). Further, it is a scientific fact that the base pairs of DNA have no bonds between each other, but are joined by sugar-phosphate bonds on the outside of the chain.
No. The set (0,0,0,0,0....) will not. It's not as easy as it seems at first glance to actually list a set that has all combinations. Gore3000 made the claim that simple rules cannot generate complex objects. I merely gave an example of a simple rule that does so. Gore3000 seems not to think that the works of Shakespeare are contained in the sequence generated by the rules that I gave. In that, Gore3000 is mistaken.
There are other simple rules that generate complicated objects. Example:
Begin
Take any equilateral triangle.
Pick an arbitrary point in its plane.
Loop
Pick one of the vertices at random.
Go halfway from the point to that vertex.
Mark the resulting point.
Pool
Nigeb
The above generates a complicated object only with random choices. It's instructive to program it and watch.
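For anyone who wants to try it, here is a minimal Python sketch of that rule. The vertex coordinates, the starting point, the 50,000 iterations and the matplotlib plotting are my own choices for illustration, not part of the rule itself:

import random
import matplotlib.pyplot as plt

# The three vertices of an equilateral triangle
vertices = [(0.0, 0.0), (1.0, 0.0), (0.5, 3 ** 0.5 / 2)]

# Pick an arbitrary point in the plane
x, y = 0.3, 0.2

xs, ys = [], []
for _ in range(50000):
    vx, vy = random.choice(vertices)    # pick one of the vertices at random
    x, y = (x + vx) / 2, (y + vy) / 2   # go halfway from the point to that vertex
    xs.append(x)                        # mark the resulting point
    ys.append(y)

plt.scatter(xs, ys, s=0.1)
plt.show()

With the random vertex choices the marked points trace out the Sierpinski triangle; pick the vertices in a fixed repeating order instead and the points settle onto a small cycle, which is why the randomness matters here.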
IMHO, self-organization in DNA has more in common with language than anything else. For that very reason, in all this research I've been doing this weekend, these authors particularly stand out: S. Naranan and V.K. Balasubrahmanyan.
In the abstract I linked at 595 (and repeat below) - they have arrived at some very important conclusions - namely, that the genetic code is (a) optimized, (b) redundant to accommodate errors, and (c) language-like, and (d) they speculate that it had to have been of variable length in the past. That last observation is a stunner because it is what would be necessary for evolution to work.
To put it another way, the entropy (disorder) of the information content should never decrease over time, but here they evidently allow for truncation. That leaves me wondering if evolution theory demands a violation of entropy...
The genetic code is a mapping of 64 possible triplet codons from a 4-letter alphabet (A, C, G, U) into 20 amino acids and one STOP signal for protein synthesis. The pattern of degeneracies in codon assignments to amino acids suggests that there is an underlying variable length code, which meets all the optimal coding criteria dictated by Shannon's Information theory. The genetic code can be viewed as an instantaneous, prefix code with unique decipherability and compactness. Optimal codon assignments and average code lengths for 20 species from 10 groups in the phylogenetic tree have been determined from the available data on codon usage, using the Huffman algorithm.
The average binary code length of the genetic code exceeds the optimal average length of the Huffman code only by 2 to 5%, showing that the genetic code is close to optimal. But functionally, the genetic code is a fixed length code of 6 binary bits (3 bases). This provides the needed redundancy (≈ 25%) for tolerance of errors in the DNA sequences due to mutations.
This hybrid character of the genetic code, combining the advantages of variable and fixed length codes, supports a speculation that in the past the genetic code could have been a variable length code which has evolved to its modern version.
The DNA sequence bears striking similarities to linguistic discourses from an Information Theoretic view. Both are complex adaptive systems with intrinsic elements of order and randomness. The complexity parameter, which we had defined earlier for linguistic discourses, is close to maximal in both DNA and natural languages. In this article, the first of two parts, we have focused on the variable length genetic code. In Part II, we deal with DNA sequence treated as a complex adaptive system and some possible correlation of certain parameters with evolution.
I find the last statement somewhat stunning also, especially after the claim that it is optimized. That there is some redundancy in the reading is obvious, since there are 64 possible combinations and only 21 things needed to be coded, so it is possible for a mutation to have no effect at all. Now as to the variable length part, one can say that it is variable length now if one means by it that different species have different size genomes, but again, if the code is optimized, I don't see how such a claim can be made. In actuality also, I do not see how one can 'chop off' part of a genome and end up with the same creature or even a viable one.
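Since the abstract leans on the Huffman algorithm, here is a rough Python sketch of the kind of comparison it describes. The frequencies below are placeholders I made up just to have something to run - they are not the codon-usage data the authors used - so only the shape of the calculation is meant to carry over:

import heapq

# Hypothetical amino-acid (plus STOP) frequencies -- placeholders, not real codon-usage data
freqs = {"Leu": 0.10, "Ser": 0.08, "Ala": 0.07, "Gly": 0.07, "Val": 0.06,
         "Arg": 0.06, "Lys": 0.06, "Glu": 0.06, "Thr": 0.05, "Ile": 0.05,
         "Asp": 0.05, "Pro": 0.05, "Asn": 0.04, "Phe": 0.04, "Gln": 0.04,
         "Tyr": 0.03, "Met": 0.02, "His": 0.02, "Cys": 0.02, "STP": 0.02,
         "Trp": 0.01}

# Build a Huffman code: repeatedly merge the two least frequent symbols;
# each merge adds one bit to the code length of every symbol involved.
heap = [(p, [name]) for name, p in freqs.items()]
heapq.heapify(heap)
length = {name: 0 for name in freqs}
while len(heap) > 1:
    p1, names1 = heapq.heappop(heap)
    p2, names2 = heapq.heappop(heap)
    for n in names1 + names2:
        length[n] += 1
    heapq.heappush(heap, (p1 + p2, names1 + names2))

avg = sum(freqs[n] * length[n] for n in freqs)
print("Huffman average code length: %.2f bits" % avg)
print("Fixed-length genetic code:   6 bits (3 bases x 2 bits each)")
print("Slack vs. the fixed code:    %.0f%%" % (100 * (6 - avg) / 6))

The paper does this with real codon-usage tables for 20 species, which is where its 2 to 5% and roughly 25% figures come from.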
You said: No. The set (0,0,0,0,0....) will not.
I give up trying to simplify and help your algorithm! So going back to your statement at 560:
If it were truly random and infinite, I'd agree that Shakespeare would eventually show up. With this algorithm, no, it would not.
Moreover, Kolmogorov Complexity could not be calculated unless the program halts.
On post 613 you said:
The complexity is also the same up to an additive constant.
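If I'm reading that right, it is the standard invariance theorem: for any two universal machines U and V there is a constant c(U,V), depending only on the machines and not on the string, such that

K_U(x) ≤ K_V(x) + c(U,V) for every bitstring x

so switching reference machines shifts the complexity by at most a bounded amount - that is the sense in which it is "the same up to an additive constant."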
Of course, language-like functions, optimization, adaptability, redundancy as protection against mutations, and low entropy would be expected from an intelligent designer!
The zeroes complement (all 1's) is unprintable, but so are the zeroes themselves (all 0's).
In fact, of the 256 possible combinations (zero relative) in an 8-bit ASCII representation, 166 are unprintable or not likely to appear in any Shakespeare manuscript (such as the square root sign). Included among the 90 which are printable are such improbable (for Shakespeare) characters as the backslash.
So although one can see from the progression of the powers of two in the above list that the last binary representation in a "row" will always be "all ones" ... when collected in groups of 8, that is only 1 of the 166 possible non-Shakespearean characters.
IOW, to get Shakespeare - for the full character length of the manuscript - none of those 166 may appear and the combination of the other 90 must be in the exact order.
But the output pattern of the algorithm is so structured that the greater the power of two, the longer the high-order bits will remain "1" in the count-up for the row, i.e. the variation occurs to the right until the next lowest "1" registers on the left. When it reaches all ones, the number of bits increases and a new row starts.
Thus we would see frequent occurrences of the zeroes complement throughout the output bitstream.
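Just to make sure we are looking at the same output, here is how I read the rule from 560, written as a short Python sketch: count up in binary, string the representations together, and read the stream off in 8-bit groups. This is only my reading of the rule, and the 'printable' test uses the standard ASCII 32-126 range, which is close to but not exactly the 90/166 split above:

# My reading of the counting rule: concatenate the binary forms of 1, 2, 3, ...
def bitstream(limit):
    return "".join(format(n, "b") for n in range(1, limit + 1))

stream = bitstream(2000)
groups = [stream[i:i + 8] for i in range(0, len(stream) - 7, 8)]

# Count the 8-bit groups that land in the printable ASCII range (32-126)
printable = [chr(int(g, 2)) for g in groups if 32 <= int(g, 2) <= 126]
print(len(groups), "groups,", len(printable), "printable, e.g.", "".join(printable[:40]))

# The end of every "row" (2^k - 1) is all ones, as described above:
print(format(255, "b"), format(1023, "b"))

Watching the grouped output makes the count-up pattern, and the recurring all-ones blocks, easy to see.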
Most of the types of complexity listed in that appendix appear to be limited or special cases of AIC, and therefore fall under its umbrella. There has been some mathematical effort in the last three years or so to unify all the concepts of "complexity" found in various places. I believe there are now quite a few proofs that strongly suggest AIC is the grand-daddy of all "complexity", and the more work that is done, the stronger this assertion is. I made this conjecture back in the late 90s (and I wasn't the first), but it seems that others have actually taken the effort to prove it.
As a technical nit, the real value for the compressibility of Shakespeare is probably around 40x rather than 4x, this based on real compressibility data points. Not terribly important, except that this would be an order of magnitude reduction of the exponent.
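For what it's worth, anyone can put a crude number on this with an off-the-shelf compressor. The sketch below uses Python's zlib with a placeholder file name (shakespeare.txt) standing in for whatever plain-text edition of the complete works is at hand; a general-purpose compressor only gives an upper bound on the true algorithmic compressibility, and dedicated text models compress considerably further:

import zlib

# "shakespeare.txt" is a placeholder -- substitute any plain-text edition of the complete works
with open("shakespeare.txt", "rb") as f:
    data = f.read()

compressed = zlib.compress(data, 9)   # maximum zlib compression level
print("original:  ", len(data), "bytes")
print("compressed:", len(compressed), "bytes")
print("ratio: %.1fx" % (len(data) / len(compressed)))

Whatever ratio comes out, the compressed size is only an upper bound on the Kolmogorov complexity, so better modeling can push the ratio higher.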