Will eZ3.1 be supporting mySQL 4.1 unicode (utf-8)?

Tony Wood

Friday 02 May 2003 2:50:56 am

Will eZ 3.1 be supporting the new Unicode support in MySQL 4.1, which is now in alpha? Or should we use PostgreSQL?

tia

Tony

Tony Wood : twitter.com/tonywood
Vision with Technology
Experts in eZ Publish consulting & development

Power to the Editor!

Free eZ Training : http://www.VisionWT.com/training
eZ Future Podcast : http://www.VisionWT.com/eZ-Future

Bård Farstad

Friday 02 May 2003 3:20:16 am

I will do some tests with MySQL 4.1. I haven't tested it yet, but I think it should be no problem to use it with eZ publish.

--bård

Documentation: http://ez.no/doc

Tony Wood

Friday 02 May 2003 3:30:17 am

Thanks. We are reviewing the need to go to PostgreSQL for multi-language sites that do not fit into a single character set. I'd prefer to stick with MySQL for Unicode (UTF-8) if possible.

I look forward to your results.

tia

Tony

Bård Farstad

Friday 02 May 2003 4:20:58 am

I just had to test it right away. I upgraded my development machine to Ver 13.5 Distrib 4.1.0-alpha, changed the charset on all tables to utf-8, and configured eZ publish to use UTF-8 in the database, templates, etc.

At first glance it seems to work. I wrote an article with Norwegian, Russian and Chinese text in it, and it was stored and displayed correctly.

The only problem I found was that when we index Chinese text we only split words by spaces, which does not make sense for Chinese.
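A minimal sketch of that indexing problem, written in Python rather than eZ publish's own PHP. `split_by_spaces` is a hypothetical stand-in for a whitespace-only tokenizer, not the actual eZ publish indexer, and the sample sentences are made up for illustration.

```python
def split_by_spaces(text):
    """Naive tokenizer: split on whitespace only (hypothetical stand-in)."""
    return text.split()


norwegian = "jeg liker å lese bøker"
chinese = "我喜欢读书"  # "I like reading books", written without spaces

# The Norwegian sentence yields one token per word.
print(split_by_spaces(norwegian))

# The Chinese sentence comes back as a single token holding the whole
# sentence, so a search for any individual word in it finds nothing.
print(split_by_spaces(chinese))
```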

Of course it needs more testing, but I don't think we need to do much to fully support Unicode (UTF-8) with MySQL 4.1.

Please set up a test installation and report any problems you might have with this setup.

--bård

Tony Wood

Friday 02 May 2003 5:35:24 am

Will do

cfa cfa

Thursday 14 August 2003 11:07:51 pm

How do you plan on indexing Chinese? It sounds kind of tricky to me because there are no spaces between the characters.

Any idea when this feature will be implemented?

... and does the search function (Unicode) work OK with PostgreSQL now?

liu spider

Sunday 17 August 2003 5:35:23 am

I just found out eZ's trick for indexing Chinese in the search tables: it splits all Chinese text into single characters.

But I do not think that works for Chinese, as no one will search using single characters rather than words (Chinese words are composed of 2 or more Chinese characters).
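To illustrate the difference, here is a small Python sketch comparing single-character tokens with overlapping two-character tokens (bigrams). Bigram indexing is a common workaround for text without spaces; it is not what eZ publish does, and the function names and sample text are made up for this example.

```python
def unigrams(text):
    """One token per character, similar in spirit to the splitting described above."""
    return [ch for ch in text if not ch.isspace()]


def bigrams(text):
    """Overlapping two-character tokens, a rough approximation of word-level indexing."""
    chars = unigrams(text)
    if len(chars) < 2:
        return chars
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


document = "我喜欢读书"
query = "读书"  # a two-character word

print(unigrams(document))          # the query word never appears as a token
print(bigrams(document))           # the query word appears as one token
print(query in bigrams(document))
```

With single characters, a query for a two-character word has to be reassembled from separate matches. With bigrams, the word is matched directly, at the cost of also indexing pairs that are not real words.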

http://liucougar.scim-im.org
SCIM Input Method Platform
http://scim.sf.net
SJSD Online Editor
http://sf.net/projects/sjsd

Tony Wood

Thursday 28 August 2003 9:20:13 am

OK,

MySQL 4.1 is installed and running eZ publish 3.1 from svn. I followed the instructions at http://ez.no/developer/ez_publish_3/documentation/installation_and_configuration/configuration/language_and_charset/unicode_with_ez_publish. The content is converted and exists in the db, but XML fields cannot be read by the admin interface; they just appear blank.

This happens even after running the XML fix: php -C update/common/scripts/updatexmltext.php

Are there any further instructions for converting the content?

Tony

Tony Wood

Thursday 28 August 2003 1:48:05 pm

It also appears that the conversion as documented only converts the index to utf-8. The fields in the tables still have the charset they were created with.
Is there a script to change all the fields to charset utf-8?
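The thread does not name a ready-made script for this. A minimal sketch in Python, assuming your MySQL release supports `ALTER TABLE ... CONVERT TO CHARACTER SET` (check the manual for your exact 4.1 version): it only prints the statements so they can be reviewed first. The helper name and the table list are hypothetical; take the real list from `SHOW TABLES`.

```python
def charset_statements(tables, charset="utf8"):
    """Build one ALTER TABLE statement per table; nothing is executed here."""
    return [
        "ALTER TABLE `{}` CONVERT TO CHARACTER SET {};".format(name, charset)
        for name in tables
    ]


# Example table names only; replace with the output of SHOW TABLES.
tables = ["ezcontentobject_attribute", "ezsearch_word"]

for statement in charset_statements(tables):
    print(statement)
```

Because this kind of statement rewrites the stored data as well as the column definitions, try it on a backup copy of the database before touching a live site.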

tony

Tony Wood : twitter.com/tonywood
Vision with Technology
Experts in eZ Publish consulting & development

Power to the Editor!

Free eZ Training : http://www.VisionWT.com/training
eZ Future Podcast : http://www.VisionWT.com/eZ-Future

Tony Wood

Wednesday 03 September 2003 8:58:06 am

OK, sussed it.

In my examples (maybe all older v3 sites from v3 beta, don't know), some of the XML field definitions are incorrectly defined as <?xml version="1.0" encoding=""?> when they should be <?xml version="1.0" encoding="utf-8"?>.
If you change each affected record then you'll get your content back.
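A minimal sketch of that fix in Python, assuming the stored XML has already been read out of each affected record as a string. The function name and the sample markup are made up for illustration; how you read and write the records is up to your own setup.

```python
import re

# Matches an XML declaration whose encoding attribute is empty.
EMPTY_ENCODING = re.compile(r'(<\?xml\s+version="1\.0"\s+encoding=)""(\s*\?>)')


def fix_xml_declaration(xml_text, encoding="utf-8"):
    """Fill in an empty encoding attribute; leave other declarations untouched."""
    replacement = r'\g<1>"{}"\g<2>'.format(encoding)
    return EMPTY_ENCODING.sub(replacement, xml_text, count=1)


# Sample record content (hypothetical).
broken = '<?xml version="1.0" encoding=""?><section><paragraph>Hei</paragraph></section>'
print(fix_xml_declaration(broken))
```

Records whose declaration already names an encoding pass through unchanged, so the function can be run over every record without checking each one first. Back up the table before writing anything back.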

Note: take care when using updatexmltext.php; it deleted the XML data in my entries. Again, this may be just me.

I hope this helps someone

Tony
