Yesterday I got a support request from a user who was not able to access a freshly installed CMSimple_XH 1.6.2; he got "Malformed UTF-8 detected". It turned out that the following variable was set in ISO-8859-1 encoding:
Code:
$_SERVER['GEOIP_CITY'] = 'Görlitz';
This variable, as well as some others, is most likely set by mod_geoip, and it contains the city of the visitor.
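To illustrate why such a value trips a UTF-8 check, here is a minimal sketch. The helper name isValidUtf8() is made up for this post and is not the check CMSimple_XH actually uses; it simply relies on preg_match() rejecting malformed subjects when the /u modifier is set.

```php
<?php
/**
 * Hypothetical helper (not part of CMSimple_XH): reports whether
 * a string is well-formed UTF-8.
 *
 * @param string $value The string to check.
 *
 * @return bool
 */
function isValidUtf8($value)
{
    // With the /u modifier, preg_match() returns false instead of 1
    // if the subject is not valid UTF-8.
    return preg_match('//u', $value) === 1;
}

// "Görlitz" as ISO-8859-1 bytes: the "ö" is the single byte 0xF6.
$latin1City = "G\xF6rlitz";
// The same name as UTF-8 bytes: the "ö" is the sequence 0xC3 0xB6.
$utf8City = "G\xC3\xB6rlitz";

var_dump(isValidUtf8($latin1City)); // bool(false)
var_dump(isValidUtf8($utf8City));   // bool(true)
```

The byte 0xF6 can never appear in well-formed UTF-8, so any check of this kind will reject the ISO-8859-1 value.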
If the webmaster lives in a city whose name contains only ASCII characters, which is the likely case, everything works fine for them, but some visitors won't be able to access the site, which is a severe bug. Hopefully mod_geoip is not installed on too many servers.
I had a closer look at the $_SERVER variable, and there may be similar issues with other array members. For instance, any of the paths might contain non-ASCII characters, and the encoding of the file system is not guaranteed to be UTF-8. I can also imagine that other server modules introduce yet more entries that are not UTF-8 encoded.
To keep it simple, I suggest that we remove the UTF-8 check for $_SERVER completely (at least for now), and handle the few entries that CMSimple_XH actually uses on a case-by-case basis. Plugins would have to do the same.
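Just to sketch what "case-by-case" could look like: the function name serverEntryOrDefault(), the sample array and the choice of REQUEST_URI as an entry that is used are all assumptions for illustration, not existing CMSimple_XH code. The idea is that only entries that are actually read get validated, so an unrelated entry such as GEOIP_CITY can no longer block the whole request.

```php
<?php
/**
 * Hypothetical helper (not part of CMSimple_XH): returns a single
 * $_SERVER-style entry if it is a well-formed UTF-8 string, and a
 * fallback value otherwise.
 *
 * @param array  $server  The array to read from, e.g. $_SERVER.
 * @param string $key     The name of the entry.
 * @param string $default The fallback for missing or malformed entries.
 *
 * @return string
 */
function serverEntryOrDefault(array $server, $key, $default = '')
{
    if (!isset($server[$key]) || !is_string($server[$key])) {
        return $default;
    }
    // With the /u modifier, preg_match() returns false instead of 1
    // if the subject is not valid UTF-8.
    return preg_match('//u', $server[$key]) === 1
        ? $server[$key]
        : $default;
}

// Sample data standing in for $_SERVER (assumed values).
$server = array(
    'GEOIP_CITY' => "G\xF6rlitz", // ISO-8859-1, set by a server module
    'REQUEST_URI' => '/index.php?Welcome',
);

// Only the entry that is needed gets checked; GEOIP_CITY is never read.
$uri = serverEntryOrDefault($server, 'REQUEST_URI', '/');
```

Whether a malformed entry should fall back to a default, be converted, or trigger an error would still have to be decided per entry.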