> I don't know if this has improved over the years, but my initial
> experience with unicode was rather negative. The fact that text
> files were twice as large wasn't a major problem in itself. The
> real showstopper was that importing text files into spreadsheets
> and text-editors and word processors failed miserably.
>
> I looked at a unicode text file with a binary viewer. It turns out
> that a simple text string like "1234" was actually...
> "1" binary-zero "2" binary-zero "3" binary-zero "4" binary-zero, etc.

That's (as someone has already pointed out) UTF-16, which is the default for some Windows tools (but understood on Linux too). UTF-32 also exists, where every character is 4 bytes wide, but I've never seen it in the wild.

UTF-8 is normally used on Linux, and ASCII characters are encoded there as exactly the same single bytes as in plain ASCII. Even with "long characters" outside the ASCII range, spreadsheets and word processors should not have a problem anymore.

--
Andreas K. Hüttel
dilfridge@gentoo.org
Gentoo Linux developer
(council, qa, toolchain, base-system, perl, libreoffice)
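
The byte layouts described above can be checked with a few lines of Python. This is only a minimal sketch for illustration: the variable names are made up, and the little-endian codecs ("utf-16-le", "utf-32-le") are an assumption chosen because they match the "1" binary-zero "2" binary-zero pattern seen in the binary viewer.

```python
# Encode the same string in the three encodings mentioned above.
text = "1234"

# Assumption: little-endian byte order, no byte order mark (BOM).
utf16 = text.encode("utf-16-le")  # 2 bytes per character for this string
utf32 = text.encode("utf-32-le")  # always 4 bytes per character
utf8 = text.encode("utf-8")       # 1 byte per ASCII character

print(utf16)       # b'1\x002\x003\x004\x00'
print(len(utf32))  # 16
print(utf8)        # b'1234'

# Outside the ASCII range, UTF-8 needs more than one byte per character.
print("ü".encode("utf-8"))  # b'\xc3\xbc'
```

Files written by real tools often start with a byte order mark, so a binary viewer may show two or three extra bytes in front of the text.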