Oh yeah! UTF-8 always and often!
I checked out your markup theory. On the main page, the smart quotes related to this thread:
http://www.freerepublic.com/focus/f-news/3362771/posts
are *not* &# or &something markup. They’re straight UTF-8, and work.
[redacted]-imac:~ [redacted]$ hexdump freep2.txt
0000000 e2 80 9d 0a
0000004
But when you click the link, all the quotes are messed up.
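For anyone who wants to double-check what those four bytes are, here is a minimal Python sketch. The helper name `describe` is made up for illustration, and the bytes are typed in from the hexdump above rather than read from freep2.txt.

```python
import unicodedata

# Bytes copied from the hexdump output above: e2 80 9d 0a
raw = bytes.fromhex("e2809d0a")

def describe(data):
    """Decode bytes as UTF-8 and label each resulting character."""
    text = data.decode("utf-8")
    # Control characters such as the newline have no Unicode name,
    # so a fallback label is supplied for them.
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '(control character)')}"
            for ch in text]

for line in describe(raw):
    print(line)
```

The first three bytes decode to a single character, the right double quotation mark, and the trailing 0a is just the newline.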
I’ve a soft spot for 8859-1, but once you go UTF-8, you never go back.
It's slippery. E.g., View Page Source will give different results from View Selection Source. The actual page, downloaded using a non-browser such as wget, may show entities, whereas the likes of View Selection Source or cut and paste into your favorite hex dumper will show clean UTF-8.
If I copy from the browser window, and paste through xxd, I see UTF-8. But, if I look at the actual HTML, I see entities. That is the key to the problem.
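Here is a rough Python sketch of why the two views disagree. It assumes the served HTML uses one of the usual entity spellings for the closing smart quote. The exact spelling on the site is a guess, and `rendered_bytes` is a made-up helper name.

```python
import html

# Possible entity spellings for the right double quotation mark (assumed,
# not taken from the actual page source).
entity_forms = ["&#8221;", "&#x201D;", "&rdquo;"]

def rendered_bytes(markup):
    """Resolve entities the way a browser does when rendering, then encode as UTF-8."""
    return html.unescape(markup).encode("utf-8")

for form in entity_forms:
    # The raw file holds the ASCII entity text; a copy/paste of the rendered
    # page holds the resolved character.
    print(form, "->", rendered_bytes(form).hex(" "))
```

All three spellings come out as e2 80 9d once resolved, which matches the hexdump of the pasted text, while wget only ever sees the plain ASCII entity.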
Is it possible that magic quotes or something similar is interfering with the code that stores the text as UTF-8, and that some sort of caching then causes the conversion outside of the database? I seem to remember WordPress having a similar issue with a security-related update somewhere in the 2.0 series.
My text is UTF-8 by default.
But then, how’s this happening?
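One common way a curly apostrophe turns into a stray "â" is that correct UTF-8 bytes get decoded somewhere along the line as a single-byte encoding. This Python sketch shows the effect. Whether that is what actually happens on this site is only a guess, and `misread` is a made-up helper name.

```python
# "how’s" written with a curly apostrophe (U+2019)
original = "how\u2019s"
utf8_bytes = original.encode("utf-8")  # the apostrophe becomes e2 80 99

def misread(data, wrong_encoding):
    """Decode UTF-8 bytes with the wrong single-byte encoding."""
    return data.decode(wrong_encoding)

as_cp1252 = misread(utf8_bytes, "cp1252")   # three visible junk characters
as_latin1 = misread(utf8_bytes, "latin-1")  # 'â' plus two invisible C1 controls

# Many displays silently drop the C1 control range (U+0080 to U+009F),
# leaving only the 'â' behind.
visible = "".join(ch for ch in as_latin1 if not 0x80 <= ord(ch) <= 0x9F)

print(as_cp1252)
print(visible)
```

The text itself can be perfectly good UTF-8 and still come out mangled if a single step in the chain labels or decodes it as 8859-1 or Windows-1252.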