Check for valid UTF-8 is too restrictive

cmb
Posts: 14225
Joined: Tue Jun 21, 2011 11:04 am
Location: Bingen, RLP, DE

Post by cmb » Thu Sep 11, 2014 11:43 am

Hello Community,

Yesterday I got a support request from a user who was not able to access a freshly installed CMSimple_XH 1.6.2; he got "Malformed UTF-8 detected". It turned out that the following variable was set in ISO-8859-1 encoding:

$_SERVER['GEOIP_CITY'] = 'Görlitz';
This variable, as well as some others, is most likely set by mod_geoip, and it contains the city of the visitor. :shock:
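
To illustrate why such a value trips a UTF-8 check, here is a minimal sketch. The function name `is_valid_utf8()` is hypothetical and this is not the actual check used by CMSimple_XH; it only assumes PHP 7 or later with a PCRE build that supports the `u` modifier. In ISO-8859-1 the "ö" is the single byte 0xF6, which is not a valid UTF-8 sequence.

```php
<?php
// Hypothetical helper for illustration; NOT the actual CMSimple_XH check.
function is_valid_utf8(string $value): bool
{
    // With the /u modifier, preg_match() returns false (not 0 or 1)
    // when the subject is malformed UTF-8.
    return preg_match('//u', $value) === 1;
}

$latin1 = "G\xF6rlitz";     // 'Görlitz' in ISO-8859-1: ö is the single byte 0xF6
$utf8   = "G\xC3\xB6rlitz"; // 'Görlitz' in UTF-8: ö is the byte pair 0xC3 0xB6

var_dump(is_valid_utf8($latin1)); // bool(false)
var_dump(is_valid_utf8($utf8));   // bool(true)
```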

In the likely case that the webmaster lives in a city whose name contains only ASCII characters, everything would look fine to him, but some of the visitors won't be able to access the site, which is a severe bug. Hopefully mod_geoip is not installed on too many servers.

I had a closer look at the $_SERVER variable, and there may be similar issues with regard to other array members. For instance, all paths might contain non-ASCII characters, and the encoding of the file system is not guaranteed to be UTF-8. And I can imagine that there are other server modules that introduce yet more entries that are not UTF-8 encoded...

To keep it simple, I suggest that we remove the UTF-8 check for $_SERVER completely (at least for now)[1], and handle the few entries that CMSimple_XH is using on a case-by-case basis. Plugins would have to do the same.
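
Handling a single entry on a case-by-case basis could look roughly like the following sketch. The helper `server_value_as_utf8()` and its ISO-8859-1 fallback are assumptions made for illustration; nothing like it is part of CMSimple_XH. It assumes PHP 7.1 or later with the mbstring extension, and it takes the array as a parameter so it can be tried without touching the real $_SERVER.

```php
<?php
// Hypothetical helper for illustration; not part of CMSimple_XH.
function server_value_as_utf8(array $server, string $key, string $fallback = 'ISO-8859-1'): ?string
{
    if (!isset($server[$key]) || !is_string($server[$key])) {
        return null; // entry is missing or not a string
    }
    $value = $server[$key];
    if (preg_match('//u', $value) === 1) {
        return $value; // already valid UTF-8, use as is
    }
    // Assumption: anything that is not valid UTF-8 is in the fallback encoding.
    return mb_convert_encoding($value, 'UTF-8', $fallback);
}

// Usage with the value from the support request (ISO-8859-1 bytes):
$server = ['GEOIP_CITY' => "G\xF6rlitz"];
$city = server_value_as_utf8($server, 'GEOIP_CITY');
var_dump($city === "G\xC3\xB6rlitz"); // bool(true)
```

The fallback encoding is a guess; a server module or file system may use a different one, so each entry would need its own decision.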

Christoph
Christoph M. Becker – Plugins for CMSimple_XH

Re: Check for valid UTF-8 is too restrictive

Post by cmb » Sat Sep 20, 2014 10:59 am

cmb wrote: I suggest that we remove the UTF-8 check for $_SERVER completely (at least for now)[1], and handle the few entries that CMSimple_XH is using on a case-by-case basis.
It seems that none of the uses of $_SERVER or sv() needs any special handling, so I have simply removed the check (r1372).