Replies

Test 2 was built by copying the raw HTML &xxx markup from the result of test 1. It didn’t suffer from the snowball effect that others see.

Something in the software is not understanding UTF-8.

Test 3: &#E996AA;

37 posted on 01/07/2016 10:49:33 PM PST by some tech guy (Stop trying to help, Obama)

To: some tech guy

Something in the software is not understanding UTF-8.

The server is mangling UTF-8 sequences for characters outside the 7-bit ASCII range. Each such character is encoded as multiple bytes, all in the 80-FF range. The server looks at each byte of the multi-byte sequence on its own and substitutes an HTML entity for it. The browser then renders the result as several characters of garbage instead of the one character intended.

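E.g., E2 80 9C is UTF-8 for the left curly double quote (U+201C). Treating each byte as a Windows-1252 character, the server substitutes an entity for lowercase a with circumflex (â) for the E2, the euro sign (€) for 80, and the lowercase oe ligature (œ) for 9C. (The trademark symbol is 99; that one shows up when the right curly single quote, E2 80 99, gets mangled.)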
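
A rough Python sketch of that byte-by-byte substitution, for anyone who wants to reproduce it. This is a guess at the behavior, not the site's actual code: the function name mangle_like_server is made up, Windows-1252 is assumed as the single-byte encoding, and numeric entities are assumed where the real server may emit named ones.

```python
import html


def mangle_like_server(text: str) -> str:
    """Hypothetical reconstruction: treat each UTF-8 byte as its own character."""
    out = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            # Plain 7-bit ASCII passes through untouched.
            out.append(chr(byte))
        else:
            # Assumption: the byte is read as Windows-1252. The few bytes that
            # encoding leaves undefined become U+FFFD via errors="replace".
            ch = bytes([byte]).decode("cp1252", errors="replace")
            out.append(f"&#{ord(ch)};")
    return "".join(out)


left_quote = "\u201c"  # UTF-8 bytes E2 80 9C
entities = mangle_like_server(left_quote)
print(entities)                 # &#226;&#8364;&#339;
print(html.unescape(entities))  # what the browser ends up showing: three characters
```

If the three garbage characters are then copied out of the browser and posted again, each of them is itself non-ASCII and gets the same treatment, which would account for the snowball effect mentioned earlier in the thread.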

43 posted on 01/07/2016 11:00:08 PM PST by cynwoody

FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794