On Monday 12 September 2005 02:51, Francesco R wrote:
> Jason Stubbs wrote:
> >On Monday 12 September 2005 00:05, Francesco R wrote:
> >>http://dev.gentoo.org/~vivo/doc/mysql-update.html
> >
> >With step 2, you should probably mention the issues that can arise with
> >non-ASCII data in char fields. The character set really needs to be
> >specified in the dump. After the upgrade to 4.1, the default charset of
> >the server should be set to something compatible and then the charset
> >of the data should be specified to mysql when re-importing the backup.
>
> --default-character-set=charset
> should be that of the my.cnf config file; mysqldump doesn't permit an
> atomic setting of this variable.
> The only option for this kind of user is to atomically dump the tables
> and then concat the results.
>
> Importing into mysql-4.1 is ok, provided your default character set is
> utf8.
>
> Russian, Asian, whatever person has experience with this, please speak
> now to correct what is affirmed here.

I had a 4.0 database with strings mostly stored in SJIS that I upgraded to
4.1 a while back. 4.1 uses the "connection character set" to translate on
the fly between the database encoding and the connection encoding. This also
happens when importing data, so if you haven't got the character set of the
data correct, it'll get corrupted on the way in.

Related to the automatic conversion, some fields in the DB contain raw URLs
with parameters that are not URL-encoded and can be in any character set.
These fields had to be set to BINARY for them to be usable.

Another gotcha related to this is that PHP's mysql support defaults to
latin-1 encoding (at least on my current installation) and has no setting
for it. The only solution there was to execute "SET NAMES ujis" on every
connection.

--
Jason Stubbs
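
P.S. To make the import corruption described above concrete, here is a rough Python sketch. It is not MySQL itself, only an imitation of the "decode with whatever charset was declared" step. The import_as helper and the sample string are made up for the illustration.

```python
# Illustration only: imitates a server that trusts the declared connection
# character set when converting incoming bytes. Not MySQL code.

def import_as(data: bytes, declared: str) -> str:
    """Hypothetical helper: decode bytes with the charset the client declared."""
    return data.decode(declared)

original = "日本語"                          # sample text, assumed for the demo
sjis_bytes = original.encode("shift_jis")   # what an SJIS dump would contain

# Charset declared correctly: the text survives the conversion.
right = import_as(sjis_bytes, "shift_jis")

# Charset wrongly declared as latin-1: each byte becomes its own character,
# so the stored string no longer matches the original text.
wrong = import_as(sjis_bytes, "latin-1")

print(right == original)   # the correctly declared import round-trips
print(wrong == original)   # the mis-declared import does not
```

The same reasoning is why the URL fields had to be BINARY: raw bytes in an unknown encoding only survive if nothing tries to convert them.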