Except,
1. Nobody suggests that DNA emerged fully formed - it was preceded by an age of RNA-based chemistry.
2. Pre-RNA molecules/structures would still have to be self-replicating, thus introducing a variety of selection pressures that would accelerate the rate of information retention for re-use in a subsequent iteration.
3. How much "data" is contained in a self-replicating compound? Is it more or less than in a sentence of Hamlet? Your Hamlet string has behind it a whole language, with idiom and abstract meaning, embedded in a complex cultural context. The compound only needs to specify how to make a copy of itself.
The language and its development are quite irrelevant here. It is purely a mathematical problem. The monkey does not have any language base, knows not what he is typing, and the keys could be random numbers. The problem would not change. A string of DNA has many times more information in its structure than does an alphabet.
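To put rough numbers on "purely a mathematical problem", here is a minimal sketch of the arithmetic. It assumes every keystroke is independent and uniformly random, a 27-key keyboard (26 letters plus space) and a 4-letter DNA alphabet. Those sizes, the function names and the sample strings are my own illustrative choices, not figures from the original example.

```python
import math

def log10_chance_of_match(length: int, alphabet_size: int) -> float:
    """log10 of the chance that one random string of `length` symbols,
    each drawn uniformly from `alphabet_size` symbols, equals a given target.
    The chance itself is alphabet_size ** -length."""
    return -length * math.log10(alphabet_size)

def max_information_bits(length: int, alphabet_size: int) -> float:
    """Upper bound on the information a string can carry, in bits,
    assuming every symbol is equally likely and independent."""
    return length * math.log2(alphabet_size)

# Hypothetical inputs, for illustration only.
sentence = "to be or not to be that is the question"
keyboard_keys = 27   # assumed: 26 letters plus the space bar
dna_letters = 4      # A, C, G, T
gene_length = 300    # assumed length of a short stretch of DNA, in bases

print("sentence, log10(chance):", log10_chance_of_match(len(sentence), keyboard_keys))
print("sentence, max bits:     ", max_information_bits(len(sentence), keyboard_keys))
print("DNA, log10(chance):     ", log10_chance_of_match(gene_length, dna_letters))
print("DNA, max bits:          ", max_information_bits(gene_length, dna_letters))
```

Note that this only counts the odds of hitting one exact target string in one attempt. It says nothing about how many different strings would work equally well, or about selection between attempts, which is what points 2 and 3 above are getting at.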
The reason the example was chosen is that it does give the odds of creating a new gene of smaller than average size. If the gene were merely to copy itself, it would give absolutely no benefit to the individual possessing it. The reason why new genes are needed for higher species is that they need new, more complex functions. These functions are provided by genes which are not present in simpler species such as bacteria. A duplicated gene would therefore be of no help.