Free Republic
Browse · Search
General/Chat
Topics · Post Article

To: some tech guy
Were you viewing the actual content of the HTML page or your browser's rendition of it?

It's slippery. E.g., View Page Source will give different results from View Selection Source. The actual page, downloaded using a non-browser such as wget, may show entities, whereas the likes of View Selection Source or cut and paste into your favorite hex dumper will show clean UTF-8.

If I copy from the browser window, and paste through xxd, I see UTF-8. But, if I look at the actual HTML, I see entities. That is the key to the problem.
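A rough sketch of that difference in Python, for anyone who wants to reproduce it without a browser. The `raw_source` string is a made-up fragment standing in for what wget would save, not a line taken from the actual page, and `hex_dump` is just a stand-in for piping text through xxd.

```python
import html

# Hypothetical fragment of raw page source, as wget or curl might save it.
raw_source = "it&#8217;s UTF-8"

# Roughly what the browser renders, and so what copy/paste hands to xxd.
rendered = html.unescape(raw_source)

def hex_dump(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated hex pairs."""
    return text.encode("utf-8").hex(" ")

print(hex_dump(raw_source))  # the entity is spelled out in plain ASCII bytes
print(hex_dump(rendered))    # the apostrophe is now a multi-byte UTF-8 sequence
```

The first dump contains the bytes for `&#8217;` character by character; the second contains `e2 80 99`, the UTF-8 encoding of the right single quote.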

26 posted on 11/19/2015 11:58:40 PM PST by cynwoody
[ Post Reply | Private Reply | To 22 | View Replies ]


To: cynwoody

On the front page, straight-up curl. I also tried wget. Same - it’s UTF-8 all the way.

I’m not seeing entities at all.

No, wait, you have nailed this. Some dumbass thing is marking up UTF-8 to entities. I see entities when I view the thread. The code trying to make safe HTML is messing up because it doesn’t understand UTF-8.

Nice work, sir or madam.

That’s the diagnosis.
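One way a filter that doesn't understand UTF-8 could do this, sketched in Python. This is a guess at the failure mode, not the site's actual code: `naive_escape` is a hypothetical filter that walks the input one byte at a time and turns every byte at or above 0x80 into its own numeric entity.

```python
import html

def naive_escape(data: bytes) -> str:
    """Hypothetical byte-at-a-time filter: each non-ASCII byte becomes an entity."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))      # plain ASCII passes through untouched
        else:
            out.append(f"&#{b};")   # one entity per byte, not per character
    return "".join(out)

text = "it\u2019s"                  # "it's" with a right single quote (U+2019)
broken = naive_escape(text.encode("utf-8"))
print(broken)                       # three entities where one character was

# An HTML5-style decoder then renders three separate characters.
print(html.unescape(broken))
```

The single apostrophe goes in as three UTF-8 bytes and comes out as three entities, so the page source shows entities even though the stored text was clean UTF-8.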


29 posted on 11/20/2015 12:05:00 AM PST by some tech guy (Stop trying to help, Obama)
[ Post Reply | Private Reply | To 26 | View Replies ]

To: cynwoody

And the filter code only runs when comments are displayed, which makes sense, because on the main page everything is pre-filtered. For comments, you want to run a filter again in case some commenter is trying to be fancy with JS or something.

That’s why and what it is.
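For comparison, a minimal sketch of a display-time filter that does understand UTF-8. `render_comment` is a hypothetical name, and the assumption is that comments are stored as UTF-8 bytes; the idea is to decode first and then escape only the characters that matter to HTML.

```python
import html

def render_comment(raw: bytes) -> str:
    """Decode stored UTF-8, then escape only HTML-significant characters."""
    text = raw.decode("utf-8")
    # html.escape rewrites &, <, > and (with quote=True) both quote marks;
    # non-ASCII characters such as U+2019 are left alone.
    return html.escape(text, quote=True)

comment = "it\u2019s <script>alert(1)</script>".encode("utf-8")
safe = render_comment(comment)
print(safe)
```

The script tag is neutralized and the apostrophe stays a single UTF-8 character, so there are no stray entities in the page source.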

I feel like one of Dr. House’s interns. Great analysis.


30 posted on 11/20/2015 12:17:54 AM PST by some tech guy (Stop trying to help, Obama)
[ Post Reply | Private Reply | To 26 | View Replies ]



FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson