These guys probably are NOT deliberately fudging the data. Although I only have the math necessary to finish a degree in chemistry, and not the PhD in statistics a lot of these people have, I DO have some experience with this type of programming, having mucked around in the financial markets for years, and having worked as an environmental engineer.
Normalization of data (or "curve fitting") is also a real problem when you are trying to analyze data from the financial markets or environmental trends. You are looking for regression patterns (is there market price behavior there that repeats itself with enough regularity that I should buy/sell, say, soybean futures at THIS point in the graph?). I call it the technological equivalent of reading chicken entrails to determine the future.
Anyway, the problem comes from the following.
1) The data is random
2) Within that random data are clear patterns
These seemingly contradictory statements have led some of the best minds and fastest computers and biggest players in the markets to crunch MOUNTAINS of data looking for a way to disprove item #1 above.
The lure is so strong that one is always tempted to look for a mathematical formula that will explain and predict the future. The problem is that the formula usually only explains PAST patterns and is totally worthless for predicting future behavior.
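That trap is easy to demonstrate with a quick simulation. This is purely illustrative (the "prices" below are a simulated random walk, and the degree-10 polynomial stands in for any over-flexible model): the fit explains the past beautifully and falls apart the moment you ask it about the future.

```python
import numpy as np

rng = np.random.default_rng(0)
# A random walk "price" series: every step is pure noise, no real pattern.
prices = 100 + np.cumsum(rng.normal(0, 1, 200))

x = np.arange(200) / 200.0          # scaled time axis (better conditioning)
train, test = slice(0, 150), slice(150, 200)

# Curve-fit the PAST: a high-degree polynomial over the first 150 points.
coeffs = np.polyfit(x[train], prices[train], deg=10)
fit = np.polyval(coeffs, x)

in_sample_err = np.sqrt(np.mean((fit[train] - prices[train]) ** 2))
out_sample_err = np.sqrt(np.mean((fit[test] - prices[test]) ** 2))

print(f"in-sample RMSE:     {in_sample_err:.2f}")   # small: explains the past
print(f"out-of-sample RMSE: {out_sample_err:.2f}")  # large: worthless on the future
```

The pattern the fit "finds" truly is there in the training data; it just carries no information about what comes next.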
Though some folks who hawk environmental models have the same level of sleazy technoshysterism as the hacks bawling out their wares in Technical Analysis magazine, most of 'em are just blind to their own prejudiced conclusions. Therefore, they find patterns that TRULY ARE THERE; the patterns are just curve-fitted and not very useful.
Peer review is a great way to shut this stuff down as noted here.
As a fellow chemist, I must point out that "the data ARE random" is the correct usage.
You said it very well. When people want to find a particular result within random data, they will tend to do so.
I totally agree with your assertions about patterns in random data (they can be expected). Where I disagree is with the idea that PhD professional statisticians could, by chance and without intent, make errors that support their case. If you and I know about this potential, bet your butt they do. There is one area where lesser PhD-level statisticians make genuine mistakes, IMO: not in data normalization, but in applying a priori rules to a posteriori questions (testing a hypothesis you chose AFTER looking at the data as if it had been specified beforehand). So my starting point is malice on their part, because their mistakes systematically supported their political agenda. Regards,
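The a priori vs a posteriori trap is easy to show with a sketch simulation (Python here purely as an illustration; the group sizes and cutoffs are arbitrary choices of mine). Twenty groups are all drawn from the SAME distribution, so there is no real effect anywhere; but if you pick the best-looking group after seeing the data and then "test" it with a single-comparison 5% rule, you declare a finding most of the time.

```python
import random
import statistics

random.seed(1)
N_TRIALS, N_GROUPS, N_OBS = 2000, 20, 30
Z_CRIT = 1.645  # one-sided 5% critical value for a z-test on N(0,1) data

false_positives = 0
for _ in range(N_TRIALS):
    # All 20 group means come from identical N(0,1) noise: no effect exists.
    means = [statistics.fmean(random.gauss(0, 1) for _ in range(N_OBS))
             for _ in range(N_GROUPS)]
    # A posteriori sin: look at the data FIRST, then test the best group
    # as if it had been the pre-registered hypothesis all along.
    z = max(means) * N_OBS ** 0.5
    if z > Z_CRIT:
        false_positives += 1

rate = false_positives / N_TRIALS
print(f"false-positive rate testing the best of {N_GROUPS} groups: {rate:.2f}")
```

An honest a priori test of one pre-chosen group would be wrong only about 5% of the time; cherry-picking the maximum of 20 and applying the same cutoff pushes the error rate above 60%. The mistake is genuine, but it always lands on the side the researcher was hoping for.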