Problem with special characters in EZ 4, æøå

H. K.

Wednesday 30 January 2008 7:03:20 am

After upgrading to eZ Publish 4, special characters like æøå turn into garbage like Ã¥.

When text is pasted into the online editor and then published or saved, all the text in the section containing special characters is removed.

Is this a bug, or must I activate some special feature in the siteaccess settings?

Piotrek Karaś

Wednesday 30 January 2008 7:40:27 am

Was your previous database UTF-8 encoded? Does your database support UTF-8 right now?

--
Company: mediaSELF Sp. z o.o., http://www.mediaself.pl
eZ references: http://ez.no/partners/worldwide_partners/mediaself
eZ certified developer: http://ez.no/certification/verify/272585
eZ blog: http://ez.ryba.eu

H. K.

Wednesday 30 January 2008 10:01:24 am

I am not sure about my database encoding. I am using the same system as I did under 3.8, 3.9 and 3.10, except for PHP 5.2 and a newer version of eAccelerator.

Piotrek Karaś

Wednesday 30 January 2008 5:24:59 pm

You should definitely check that, as UTF-8 is required for eZ Publish 4.0:
http://ez.no/ezpublish/requirements

If you're still using a database with no UTF-8 support, then that's most likely what is going to happen. If you moved to a new database without properly converting your existing one, you would probably lose or destroy string-like data. I hope you properly backed up your site before the upgrade. Unless your site was previously UTF-8 based, I'd expect your upgrade to require additional data operations (although I might be wrong here).

I also once had a problem with a tested hosting provider that automatically called 'SET NAMES utf8' for the MySQL connection, which resulted in something like a double connection encoding and similar problems for all native characters. As long as you haven't changed the server, I don't think that's the issue here.
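As a rough illustration of what such a double encoding does to text, here is a minimal Python sketch. It assumes the stored bytes are UTF-8 and that some layer reads them as ISO-8859-1 (latin-1); the function name `misread_as_latin1` is made up for this example, and this is not what MySQL or eZ Publish runs internally.

```python
# Illustrative only: UTF-8 bytes misread as latin-1 turn into mojibake.

def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode those same bytes as latin-1."""
    return text.encode("utf-8").decode("latin-1")


original = "æøå"
garbled = misread_as_latin1(original)

# Each non-ASCII letter occupies two bytes in UTF-8, so it shows up
# as two latin-1 characters after the misreading.
print(garbled)
print(len(original), len(garbled))
```

The garbled result has the same "Ã" prefix as the characters reported at the top of this thread, which is why that symptom usually points to an encoding mismatch rather than lost data.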

Good luck,
Piotrek

H. K.

Thursday 31 January 2008 8:36:31 am

OK, I discovered that my database is encoded as ISO-8859-1. I have recent backups of both eZ Publish 3.9.2 and 3.10.

I found the following information online. What do you recommend?
http://ez.no/developer/forum/general/convert_from_iso_8859_1_encoding_to_utf_8
http://climbtothestars.org/archives/2004/07/18/converting-mysql-database-contents-to-utf-8/

Is it possible to configure eZ Publish 4 to work with ISO-8859-1?

Piotrek Karaś

Thursday 31 January 2008 9:15:50 am

Sorry, I have no practical experience with that. I would just follow those posts/articles with a backup in mind ;) Also, you don't have to apply changes globally; test your solutions on one table first, for example. Also remember that if you manipulate data with tools such as phpMyAdmin, the connection encoding affects how data gets imported or exported.

I'm not sure whether it is possible to use previous encodings with eZ 4, but I wouldn't recommend it. It may be a headache when it comes to translations, updates, multibyte string operations, etc.

H. K.

Friday 01 February 2008 2:47:59 am

I followed some instructions on converting the database to utf8, and now my site "seems to work."

Old text encoded in ISO-8859-1 is still not displayed correctly (I should have converted the database before moving to eZ Publish 4), but when I add new text to the site, special characters are displayed correctly.

For the record, this process works:

- Issue the following commands against the eZ Publish database:

ALTER TABLE ezapprove_items CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezbasket CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezbinaryfile CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcollab_group CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcollab_item CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcollab_item_group_link CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcollab_item_message_link CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcollab_item_participant_link CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcollab_item_status CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcollab_notification_rule CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcollab_profile CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcollab_simple_message CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontentbrowsebookmark CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontentbrowserecent CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontentclass CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontentclassgroup CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontentclass_attribute CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontentclass_classgroup CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontentobject CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontentobject_attribute CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontentobject_link CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontentobject_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontentobject_tree CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontentobject_version CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcontent_language CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezcurrencydata CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezdiscountrule CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezdiscountsubrule CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezdiscountsubrule_value CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezenumobjectvalue CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezenumvalue CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezforgot_password CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezgeneral_digest_user_settings CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezimage CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezimagefile CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezimagevariation CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezinfocollection CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezinfocollection_attribute CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezkeyword CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezkeyword_attribute_link CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezmedia CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezmessage CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezmodule_run CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezmultipricedata CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE eznode_assignment CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE eznotificationcollection CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE eznotificationcollection_item CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE eznotificationevent CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezoperation_memento CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezorder CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezorder_item CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezorder_status CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezorder_status_history CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezpackage CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezpaymentobject CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezpdf_export CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezpending_actions CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezpolicy CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezpolicy_limitation CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezpolicy_limitation_value CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezpreferences CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezproductcategory CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezproductcollection CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezproductcollection_item CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezproductcollection_item_opt CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezrole CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezrss_export CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezrss_export_item CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezrss_import CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezsearch_object_word_link CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezsearch_return_count CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezsearch_search_phrase CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezsearch_word CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezsection CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezsession CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezsite_data CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezsubtree_notification_rule CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE eztipafriend_counter CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE eztipafriend_request CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE eztrigger CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezurl CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezurlalias CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezurl_object_link CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezuser CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezuservisit CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezuser_accountkey CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezuser_discountrule CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezuser_role CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezuser_setting CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezvatrule CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezvatrule_product_category CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezvattype CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezview_counter CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezwaituntildatevalue CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezwishlist CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezworkflow CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezworkflow_assign CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezworkflow_event CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezworkflow_group CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezworkflow_group_link CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE ezworkflow_process CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

UPDATE ezcontentobject_attribute SET data_text = REPLACE(data_text, 'xml version="1.0" encoding="iso-8859-1"', 'xml version="1.0" encoding="UTF-8"');

- Then in override/i18n.ini.append, configure the right charset:

[CharacterSettings]
Charset=utf8

- And make sure that in site.ini.append, the Charset setting under the DatabaseSettings group is left empty (then the charset from i18n will be used):

[DatabaseSettings]
Charset=

- Clear cache
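
Typing one ALTER TABLE statement per table is error-prone. As a rough sketch, the statements could instead be generated from a list of table names. The helper name `build_convert_statements` and the short table list below are assumptions for illustration only; review the output and try it against a backup copy first.

```python
# Hypothetical helper: build the ALTER TABLE statements from table names.

def build_convert_statements(tables, charset="utf8", collation="utf8_unicode_ci"):
    """Return one ALTER TABLE ... CONVERT TO statement per table name."""
    statements = []
    for name in tables:
        # Backticks guard against table names that clash with SQL keywords.
        statements.append(
            f"ALTER TABLE `{name}` CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation};"
        )
    return statements


# A few of the tables from the list above; extend with your own.
tables = ["ezcontentobject", "ezcontentobject_attribute", "ezuser"]

for statement in build_convert_statements(tables):
    print(statement)
```

The printed statements can be saved to a file and fed to the MySQL client, which keeps the conversion repeatable if it has to be rerun after restoring a backup.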

Piotrek Karaś

Saturday 02 February 2008 12:15:59 am

Great!
I think there are some encoding-conversion tricks to bring multibyte characters stored by a particular old system back to normal (as long as they were not cleared). I have saved my databases this way once or twice, but that required experimentation and I wouldn't be able to tell you exactly how it was done ;)
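
One commonly used trick of this kind, sketched in Python: re-encode the damaged text as latin-1 and decode the result as UTF-8. This assumes the text is UTF-8 that was misread as ISO-8859-1 exactly once; the function name `repair_mojibake` is made up for this example, and text damaged in some other way is returned unchanged.

```python
def repair_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 bytes read as latin-1' damage, if possible."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of damage (or already clean): leave it alone.
        return text


print(repair_mojibake("Ã¥"))      # prints å
print(repair_mojibake("blåbær"))  # already clean, printed unchanged
```

Run something like this on a copy of the data and compare a sample of rows by eye before writing anything back, since a wrong guess about how the text was damaged can make it worse.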
