About 40% of the genome is indeed made up of retroelements, though how they got there is not always clear. These retroelements are not strictly viruses, since they do not need to leave the cell in order to replicate. A retroelement is a piece of DNA that is transcribed into RNA, converted back into DNA, and then reinserted elsewhere in the genome. It is essentially a parasitic part of your genome, and its sequence is very recognizable as a retroelement. However, the retrotransposition cycle I just described (DNA -> RNA -> DNA) is not the only way copies accumulate. Where many repeated retroelements sit close together in one region of the genome, the cellular machinery can misalign them and make mistakes during recombination, and the number of retroelements can increase that way as well.
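The copy-and-paste cycle (DNA -> RNA -> DNA) can be sketched as a toy simulation. This is only an illustration under loud assumptions: the element `ACGTTGCA`, the starting genome, and the function names are all made up, transcription is reduced to swapping T for U, and the insertion site is picked uniformly at random, which real retrotransposition does not do.

```python
import random

# Made-up toy element; a real Alu element is roughly 300 bases long.
ELEMENT = "ACGTTGCA"

def transcribe(dna):
    """DNA -> RNA, modeled here as simply writing U in place of T."""
    return dna.replace("T", "U")

def reverse_transcribe(rna):
    """RNA -> DNA, the reverse of the toy transcription step."""
    return rna.replace("U", "T")

def retrotranspose(genome, element, rng):
    """Copy `element` through an RNA intermediate and paste the copy at a random position."""
    if element not in genome:
        raise ValueError("element is not present in the genome")
    rna = transcribe(element)            # DNA -> RNA
    new_copy = reverse_transcribe(rna)   # RNA -> DNA
    pos = rng.randrange(len(genome) + 1) # hypothetical uniform choice of insertion site
    return genome[:pos] + new_copy + genome[pos:]

rng = random.Random(0)  # fixed seed so the run is repeatable
genome = "GGGGGGGG" + ELEMENT + "CCCCCCCC"
for _ in range(3):
    genome = retrotranspose(genome, ELEMENT, rng)
print(len(genome), genome)
```

The original copy is never removed, so the genome grows by one element length per round. That is the sense in which the element behaves like a parasite: it spreads by copying itself rather than by moving.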
Scientists know that at least 40% of the genome consists of these retroelements (of which Alu elements are the most numerous) because their sequences are so recognizable. These elements may have played a role in generating genomic diversity during evolution, but on an individual level they offer no benefit and can be harmful.
I realize that this explanation may not have been entirely clear, but answering your questions properly would take pages. Suffice it to say that you can test this yourself. None of these Alu retroelements makes any protein at all (which is the typical understanding of what a gene does). But if you look up the Alu sequence, go to ncbi.nlm.nih.gov, and paste that sequence into BLAST, you will generate more "hits" than you will with practically any other sequence you can come up with.
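To see why a repeated element produces so many hits, here is a rough sketch of the idea. It is not BLAST and does not use any NCBI interface: it just slides a window along a made-up toy genome and counts near matches. The sequences, the name `count_hits`, and the mismatch tolerance are all assumptions chosen for illustration.

```python
# Made-up toy sequences, not real Alu or human DNA.
ELEMENT = "ACGTTGCA"   # stands in for the repeated element
CONTROL = "GATTACAG"   # an arbitrary sequence that is not repeated in the toy genome

# Three copies of the element (the middle one carries a single substitution),
# separated by filler sequence.
TOY_GENOME = (
    "TTTTT" + "ACGTTGCA" +
    "GGGGG" + "ACGTAGCA" +
    "CCCCC" + "ACGTTGCA" +
    "AAAAA"
)

def count_hits(genome, query, max_mismatches=1):
    """Count genome windows that differ from `query` by at most `max_mismatches` substitutions."""
    n = len(query)
    hits = 0
    for i in range(len(genome) - n + 1):
        window = genome[i:i + n]
        mismatches = sum(a != b for a, b in zip(window, query))
        if mismatches <= max_mismatches:
            hits += 1
    return hits

print("element hits:", count_hits(TOY_GENOME, ELEMENT))
print("control hits:", count_hits(TOY_GENOME, CONTROL, max_mismatches=0))
```

Allowing a mismatch matters because old copies of a retroelement pick up mutations over time, so a search that tolerates small differences finds the diverged copies along with the exact ones.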
Thank you.