[Moved to Wiki] Correct Encoding of CMSimple Content

Connie
Posts: 282
Joined: Thu May 22, 2008 10:11 am
Location: Hamburg
Contact:

Post by Connie » Tue Nov 25, 2008 6:39 pm

Topic closed at the author's request because it has been moved to the CMSimpleWiki:
http://www.cmsimplewiki.com/doku.php/ed ... uwiki__top
Thu Dec 11, 2008, Holger




Presenting a website correctly requires a harmonious configuration, because many elements contribute.

Whenever you see text in a browser, a symphony of settings and configurations is playing:

- server
- browser
- CMSimple configuration
- templates
- language files
- online editor
- ...

As a website visitor you have only one problem: your browser must be set to detect the encoding of each web page automatically, rather than applying one default encoding to every website in the world.

As a website producer you unfortunately have far more problems.

But there is help: knowing the relevant settings and configuration will let you fix them.

So here starts a short series of postings, each one describing one facet of the problem. Please comment and contribute. At the end we will have a real "how to..." which we can add to the wiki.

So here we go.

Numbers 1 - 6 are important for all installations, not only for FCKeditor; number 7 describes the configuration with FCKeditor.

Content:

#1 adjust your browser correctly: http://www.cmsimpleforum.com/viewtopic. ... 2123#p2114
#2 find out if the editor is the problem http://www.cmsimpleforum.com/viewtopic. ... =404#p2116
#3 Check the server configuration http://www.cmsimpleforum.com/viewtopic. ... =404#p2117
#4 CMSimple-Settings http://www.cmsimpleforum.com/viewtopic. ... =404#p2118
#5 other template-related actions http://www.cmsimpleforum.com/viewtopic. ... =404#p2119
#6 CMSimple Files: language file, content, plugin files etc. http://www.cmsimpleforum.com/viewtopic. ... =404#p2120
#7 FCKeditor configuration http://www.cmsimpleforum.com/viewtopic. ... =404#p2122
#8 additional information and weblinks http://www.cmsimpleforum.com/viewtopic. ... =404#p2123
Last edited by Connie on Wed Nov 26, 2008 2:47 pm, edited 5 times in total.
|---
Connie Müller-Gödecke, http://www.webdeerns.de


1) adjust your browser correctly

Post by Connie » Tue Nov 25, 2008 6:45 pm

Whenever you see funny question marks or other unexpected characters instead of special characters such as the German umlauts üäöÜÄÖß,

check this:

Your browser should detect the character encoding of the website automatically and not force a fixed encoding itself.

In Firefox, for example, you can set this under View / Character Encoding / Automatically.
Other browsers have similar functions.
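
To illustrate where the funny characters come from, here is a minimal Python sketch (not part of CMSimple; the helper name `misread` is made up for this example). It encodes a text with one character set and reads the bytes back with another, which is what happens when the browser is forced to the wrong encoding.

```python
def misread(text: str, saved_as: str, read_as: str) -> str:
    """Encode text with one charset, then decode the bytes with another."""
    return text.encode(saved_as).decode(read_as, errors="replace")


# A UTF-8 page read as ISO-8859-1: the umlaut turns into two odd characters.
utf8_seen_as_latin1 = misread("ü", "utf-8", "iso-8859-1")

# An ISO-8859-1 page read as UTF-8: the umlaut turns into the
# replacement character, which browsers usually draw as a question mark.
latin1_seen_as_utf8 = misread("ü", "iso-8859-1", "utf-8")

print(utf8_seen_as_latin1, latin1_seen_as_utf8)
```

When both sides agree on the encoding, the text comes back unchanged; that agreement is the goal of all the following steps.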

2) find out if the editor is the problem

Post by Connie » Tue Nov 25, 2008 7:59 pm

As a CMSimple user and administrator, you certainly have to check far more things than a website visitor.

First you must find out whether the editor is the reason for the broken character encoding or not.

To do this,

1) disable JavaScript in your browser
2) open one page in edit mode. You will get only a plain textarea to enter text, no editor at all
3) enter some text with special characters
4) submit and check the page in view mode.

If the special characters are still wrong, it is definitely not the editor that causes the problem.

3) Check the server configuration

Post by Connie » Tue Nov 25, 2008 8:04 pm

This information relates to the Apache web server, as I unfortunately have no knowledge of other servers.

1) Check the HTTP_ACCEPT_CHARSET value shown by Apache:

Write the following PHP script, name it info.php for example, upload it to your server and call it in the browser.

Do not forget to delete this file after the check, for security reasons.

Code:

<?php
phpinfo();
?>

If you find the following entry in the output, UTF-8 is accepted (for our purposes):
HTTP_ACCEPT_CHARSET UTF-8,*
This means that UTF-8 plus possibly other encodings are accepted. Note that this value is taken from the Accept-Charset header which your browser sends with its request, so it can differ from browser to browser.

If you do not see UTF-8 there,
- you might ask your hosting provider to make UTF-8 the default charset in your configuration
- you can place a .htaccess file in the root of your web space or in the folder where you installed CMSimple,
with the following content:

Code:

AddDefaultCharset UTF-8

AddDefaultCharset makes Apache add charset=UTF-8 to the Content-Type header of the text/html and text/plain responses it sends.

If you see that UTF-8 is accepted, but not as the first character set, maybe like this:
HTTP_ACCEPT_CHARSET iso-8859-1,UTF-8,*
then some browsers will not use UTF-8, but iso-8859-1 instead.

In that case ask your hoster to declare UTF-8 as the default charset, or use a separate .htaccess file for your CMSimple directory (see above).
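
To see which charset the server actually declares, you can look at the Content-Type header of the response. The following is only a sketch in Python using the standard library; both function names are made up for this example, and the URL you pass to `declared_charset` is your own site.

```python
from email.message import Message
from typing import Optional
from urllib.request import urlopen


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()  # lower-cased, None if not declared


def declared_charset(url: str) -> Optional[str]:
    """Fetch a page and return the charset declared in its HTTP header."""
    with urlopen(url) as response:
        return response.headers.get_content_charset()
```

For example, `charset_from_content_type("text/html; charset=UTF-8")` gives `utf-8`, and a header without a charset parameter gives `None`, which means the browser has to guess.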

4) CMSimple-Settings

Post by Connie » Tue Nov 25, 2008 8:14 pm

The CMSimple-Settings and the active template must be synchronized:


a) Log in to CMSimple.

b) Go to Settings / edit language / META and set the meta information like this:

Code:

meta_codepage: UTF-8

c) Check that your active template adds no other codepage information.
Go to Settings / edit template and check that the template contains this string:

Code:

<?php echo head(); ?>

but no other meta information. A statement like the following one is absolutely forbidden:

Code:

<meta http-equiv="Content-Type" content="text/html;charset=iso-8859-1" />

because CMSimple adds the correct information to the output itself.
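
If you have several templates, the check in c) can be done with a small script on downloaded copies. This is a rough Python sketch, not a CMSimple tool; the function name `hardcoded_charsets` and the regular expression are my own assumptions and only catch the usual `<meta ... charset=...>` form.

```python
import re
from typing import List

# Matches a charset value inside a <meta ...> tag, for example
# <meta http-equiv="Content-Type" content="text/html;charset=iso-8859-1" />
META_CHARSET = re.compile(r"<meta[^>]*charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE)


def hardcoded_charsets(template_source: str) -> List[str]:
    """Return every charset that the template itself declares in a meta tag."""
    return META_CHARSET.findall(template_source)
```

An empty list means the template leaves the charset to head(); any entry in the list is a statement that should be removed from the template.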

5) other template-related actions

Post by Connie » Tue Nov 25, 2008 8:21 pm

1) There might be a conflict between the file encoding of your template and the related CSS stylesheet if they are saved in ANSI rather than in UTF-8 format.

So it is suggested that you download both files, template.htm and stylesheet.css,
open them in a UTF-8-capable editor,
save them in UTF-8 format and
re-upload them to the template directory on your server.

Remember: you need an editor, not a word processor, for this step!

If you are on Windows, I suggest Notepad++, which is a really excellent editor.

Open the file with that editor and use "Format / convert to UTF-8 without BOM" <= "without BOM" is important!

2) If you wish, you can add the following directive to your stylesheet.css as the very first line:

Code:

@charset "utf-8";
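
The conversion from step 1 can also be scripted instead of clicking through an editor. Here is a minimal Python sketch under one loud assumption: that "ANSI" on your system means Windows-1252 (`cp1252`); if your files were saved in another codepage, pass it as `fallback`. The function names are made up for this example. Work on downloaded copies and keep a backup.

```python
import codecs
from pathlib import Path


def to_utf8_without_bom(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return the content as UTF-8 bytes without a byte order mark.

    Assumption: content that is not valid UTF-8 was saved in `fallback`.
    Bytes that are undefined in the fallback codepage raise UnicodeDecodeError.
    """
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]  # drop the BOM
    try:
        text = raw.decode("utf-8")        # already UTF-8 (or plain ASCII)
    except UnicodeDecodeError:
        text = raw.decode(fallback)       # treat as "ANSI"
    return text.encode("utf-8")


def convert_file(path: str, fallback: str = "cp1252") -> None:
    """Rewrite one file in place as UTF-8 without BOM."""
    target = Path(path)
    target.write_bytes(to_utf8_without_bom(target.read_bytes(), fallback))
```

A call like `convert_file("template.htm")` followed by `convert_file("stylesheet.css")` would then replace the manual step; files that are already UTF-8 without BOM are written back unchanged.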
Last edited by Connie on Tue Nov 25, 2008 8:29 pm, edited 1 time in total.

6) CMSimple Files: language file, content, plugin files etc.

Post by Connie » Tue Nov 25, 2008 8:28 pm

It is helpful to continue converting the remaining files to UTF-8.

To do so, download the files of your CMSimple installation listed below, open them in the UTF-8 editor (Notepad++ for example), save them in "UTF-8 without BOM" format and re-upload them to your CMSimple directory on the server.

These files are essential:

- content/content.htm
- otherlanguagedirectory/content/content.htm
- the language-files in the folder cmsimple/languages
- check your plugins for relevant files, as they should also be saved in UTF-8 format:
- - language files
- - stylesheets
- - templates
- - include files
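
To find out which of these files still need converting, a downloaded copy of the installation can be scanned. The following Python sketch is only an illustration; the function names and the list of file extensions are my assumptions, so adapt them to your installation.

```python
import codecs
from pathlib import Path
from typing import Dict, Iterable


def encoding_status(raw: bytes) -> str:
    """Classify file content as 'utf-8', 'utf-8 with BOM' or 'not utf-8'."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return "not utf-8"
    return "utf-8"  # pure ASCII files also end up here


def scan(root: str,
         suffixes: Iterable[str] = (".htm", ".php", ".css", ".js", ".xml")) -> Dict[str, str]:
    """Report the encoding status of all matching files below root."""
    wanted = tuple(suffixes)
    report = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in wanted:
            report[str(path)] = encoding_status(path.read_bytes())
    return report
```

Every file reported as "not utf-8" or "utf-8 with BOM" is a candidate for the conversion described above.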

7) FCKeditor configuration

Post by Connie » Tue Nov 25, 2008 8:50 pm

The FCKeditor, as we distribute it in the FCKeditor4CMSimple package, uses UTF-8 by default.
Older versions of FCKeditor do not support UTF-8 completely, but you should certainly not use those old versions.

The FCKeditor website at http://docs.fckeditor.net/FCKeditor_2.x ... calization says:
"An important thing is to save the files in the UTF-8 encoded text format.
Otherwise, some strange characters could appear instead of any special characters used by different languages, like accented letters, symbols, etc."

The editor uses some configuration files which you might edit.
If so, check that they are all saved in UTF-8 file format.
These files and their locations are defined in the editor configuration:

Code:

FCKConfig.StylesXmlPath = '../fckstyles.xml' ;
FCKConfig.TemplatesXmlPath = '/mytemplates.xml' ;

There are also language files for the editor in the directory "editor/lang"; check whether they are in the correct format as well.

If you use editor plugins with the FCKeditor, also check whether these plugins contain language files etc. which might be stored in a wrong format.

FCKeditor4CMSimple keeps the relevant configuration files in the folder "custom_configurations":

- custom_fck_editorarea.css
- custom_fckstyles.xml
- custom_fcktemplates.xml
- fckconfig_cmsimple.js

Check whether they are stored in the correct file format.

8) additional information and weblinks

Post by Connie » Tue Nov 25, 2008 8:54 pm

a)
Somebody suggested adding a charset directive to the forms you use with CMSimple, so that user input is submitted in UTF-8 in all cases.
To achieve this, define the forms in your pages and articles like this:

Code:

<form action="...." method="...." accept-charset="UTF-8">

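
Why the form charset matters can be shown with a small Python sketch using only the standard library. It imitates, in a simplified way, what a browser sends for a form field and what a script that expects UTF-8 then reads; the field name `name` and its value are just examples.

```python
from urllib.parse import parse_qs, urlencode

fields = {"name": "Müller"}  # example form input

# What is sent when the form is submitted as UTF-8 or as ISO-8859-1.
sent_as_utf8 = urlencode(fields, encoding="utf-8")
sent_as_latin1 = urlencode(fields, encoding="iso-8859-1")

# What a script that expects UTF-8 makes of both submissions.
read_ok = parse_qs(sent_as_utf8, encoding="utf-8")
read_broken = parse_qs(sent_as_latin1, encoding="utf-8")

print(sent_as_utf8, read_ok)
print(sent_as_latin1, read_broken)
```

The UTF-8 submission comes back intact, while the ISO-8859-1 submission loses the umlaut, which is the kind of damage the accept-charset attribute is meant to prevent.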
b)
There is a lot of information about UTF-8 out there on the web; a very interesting source is this:

Can I use UTF-8 on the Web? at http://www.cl.cam.ac.uk/~mgk25/unicode.html#web