What is happening in "Time accumulators: String conversion in mysql"?
Jan Borsodi
Thursday 12 August 2004 8:14:06 am
utf-8 is OK, although it would be faster if the data were stored using the current internal charset (which removes the need for conversion). We already do this for the XML datatype, but it hasn't been implemented for the other datatypes yet.
Also, unicode is a character set, not an encoding, so it cannot be used for storage directly; however, unicode has several encodings defined.
utf-8: The most common encoding for stored media. It uses 1 to 4 bytes per character (the original definition allowed up to 6), i.e. it is variable-width, and it works seamlessly with existing 8-bit string code. However, it is a bit slow due to the variable size.
ucs-2: Stores each character as two bytes. Much faster, since lookup is constant-time, and quite often used internally in programs. Unfortunately, doing this in PHP using PHP code only could quite easily be troublesome.
ucs-4: Similar to ucs-2 but uses four bytes (since the initial 2 bytes were not enough for all languages in the world; something like 21 bits are needed, I believe).
There are also other encodings (like the non-standard utf-7.5), but they are hardly used.
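To make the size differences between these encodings concrete, here is a minimal Python sketch that encodes the same text three ways and counts the bytes. Python and the sample strings are illustrative choices, not something from this thread. It also assumes that the utf-16-le and utf-32-le codecs can stand in for the two-byte and four-byte encodings: Python has no dedicated two-byte codec, and utf-16 only matches it for characters that fit in two bytes.

```python
# Compare how many bytes the same text needs under different Unicode encodings.
# Assumption: "utf-16-le" and "utf-32-le" stand in for the fixed two-byte and
# four-byte encodings described above.

def encoded_sizes(text):
    """Return the number of bytes needed to store text in each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16-le": len(text.encode("utf-16-le")),
        "utf-32-le": len(text.encode("utf-32-le")),
    }

# Plain ASCII, Latin letters with diacritics, and CJK characters.
samples = ["abc", "æøå", "日本語"]
for sample in samples:
    print(sample, len(sample), encoded_sizes(sample))
```

For three-character samples like these, the utf-8 size grows from 3 to 6 to 9 bytes, while the fixed-width encodings stay at 6 and 12 bytes regardless of the characters used.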
So storing utf-8 in 8bit only databases is OK as long as you don't try to do text operations on them in the database.
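A rough Python sketch of why this holds: the bytes survive a round trip untouched, but any operation that counts or cuts bytes instead of characters gives wrong results. The helper names store, load and byte_left are hypothetical and only model a byte-oriented column; they are not taken from MySQL or eZ Publish.

```python
# Model an 8-bit column as a plain byte string holding utf-8 data.
# store, load and byte_left are hypothetical helpers for illustration only.

def store(text):
    """Encode text as utf-8 bytes, as it would sit in an 8-bit column."""
    return text.encode("utf-8")

def load(raw):
    """Decode the stored bytes back into text."""
    return raw.decode("utf-8")

def byte_left(raw, count):
    """Take the first count bytes, the way a byte-oriented substring would."""
    return raw[:count]

text = "blåbær"
raw = store(text)

# Storing and loading without touching the bytes is lossless.
print(load(raw) == text)

# A byte-based length overcounts: 6 characters, but 8 bytes.
print(len(text), len(raw))

# A byte-based substring can cut a character in half.
try:
    load(byte_left(raw, 3))
except UnicodeDecodeError as error:
    print("cut in the middle of a character:", error)
```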
-- Amos Documentation: http://ez.no/ez_publish/documentation FAQ: http://ez.no/ez_publish/documentation/faq
Tony Wood
Thursday 12 August 2004 8:49:45 am
Thanks Jan,
I did not know about being able to store utf-8 in an 8bit db... interesting. I noticed various utf-8 formats in MySQL 4.1.3. I used utf8-general; is this what you would advise?
--tony
Tony Wood : twitter.com/tonywood Vision with Technology Experts in eZ Publish consulting & development Power to the Editor! Free eZ Training : http://www.VisionWT.com/training eZ Future Podcast : http://www.VisionWT.com/eZ-Future
Friday 13 August 2004 1:20:19 am
I'm not entirely sure what they mean by utf8-general; I have never heard of it before.
I found this page on mysql.com which explains the different collations (sorting) based on language: http://dev.mysql.com/doc/mysql/en/Charset-Unicode-sets.html
It could be related to that. Do you know of a page on mysql.com that explains utf8-general?
Another interesting property of UTF-8 is that when you are only using characters from ASCII (7bit, 0-127) it will only store one byte and is fully compatible with older ASCII based programs.
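This ASCII compatibility is easy to check. The following is a small Python sketch (the function name utf8_matches_ascii and the sample strings are made up for illustration) that compares the utf-8 bytes of a string with its plain ASCII bytes.

```python
# Check whether the utf-8 form of a string is byte-for-byte identical to ASCII.
# utf8_matches_ascii is a hypothetical helper name used only for this example.

def utf8_matches_ascii(text):
    """Return True when the utf-8 bytes equal the plain ASCII bytes."""
    try:
        return text.encode("utf-8") == text.encode("ascii")
    except UnicodeEncodeError:
        # The text contains characters outside the 0-127 range.
        return False

for sample in ["eZ Publish", "blåbær"]:
    raw = sample.encode("utf-8")
    print(sample, len(sample), len(raw), utf8_matches_ascii(sample))

# Bytes produced for a non-ASCII character all have the high bit set,
# so they can never be mistaken for ASCII characters.
print(all(byte >= 128 for byte in "å".encode("utf-8")))
```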
Georg Franz
Friday 13 August 2004 4:41:25 am
-> Bard: thanx for that fix! It works perfectly and speeds up my installation a lot! (Please close my related bug report.)
Kind regards, Emil.
Best wishes, Georg. -- http://www.schicksal.com Horoskop website which uses eZ Publish since 2004