public inbox for gentoo-user@lists.gentoo.org
From: "Florian v. Savigny" <lorian@fsavigny.de>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Kernel update messed up console encoding
Date: Sat, 28 Feb 2009 18:38:56 +0100
Message-ID: <0ML2xA-1LdT9M2LrY-0006L9@mrelayeu.kundenserver.de>
In-Reply-To: <20090228142648.GB20212@marvin.heimnetz.local> (message from Sebastian Günther on Sat, 28 Feb 2009 15:26:48 +0100)


Hi Sebastian,

  > > But Emacs displays the lower-case umlauts followed by a space
  > > etc. etc. ...

  > what does file say about the offending files?

I was not actually talking about files when I mentioned Emacs, but
what I see when I *type* into Emacs (such as in this mail
message). But in case you mean what that produces when I save the
result of what I typed into a file, I ran a few tests, and the results
were mixed:

For the three lower-case umlauts, file reports UTF-8, consistent with
the number of bytes (i.e. the file length): 3 characters, 6 bytes. The
hex representation of the 6 bytes is: c3 a4 c3 b6 c3 bc.

For the three upper-case umlauts and for the eszett, file reports
iso-8859, also consistent with the number of bytes: 3 characters, 3
bytes. The code position is, however, definitely wrong: it is always
hex c3 (which would be the upper-case A tilde in iso-8859-1, and four
different letters can hardly have the same code position.)
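
A quick way to check both halves of that observation, sketched in
Python (this is an illustration, not something that was run in the
original session): the single byte c3 decodes to A tilde in
iso-8859-1, and on its own it is not a complete UTF-8 sequence, which
would be consistent with file not reporting UTF-8 for those files.

```python
import unicodedata

# The single byte that ends up in the buffer for the upper-case umlauts.
lone = bytes([0xC3])

# Interpreted as ISO-8859-1, every byte maps to exactly one character.
as_latin1 = lone.decode("iso-8859-1")
latin1_name = unicodedata.name(as_latin1)
print(f"U+{ord(as_latin1):04X} {latin1_name}")

# Interpreted as UTF-8, c3 is only a lead byte and needs a continuation byte.
try:
    lone.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print("valid UTF-8 on its own:", valid_utf8)
```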

To me this looks as if Emacs puts the first half of the byte sequences
(always the hex c3) into the buffer, while trying to interpret the
other half (see list below) as a command: it will say something like
"\204 is undefined". I am quite certain \nnn is an octal number.

eszett: \237 (hex 9f, dec 159)
A uml: \204 (hex 84, dec 132)
O uml: \226 (hex 96, dec 150)
U uml: \234 (hex 9c, dec 156)
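
The octal reading of the \nnn codes can be double-checked with a few
lines of Python. This is only a sketch; the dictionary labels are
chosen here for illustration and simply mirror the list above.

```python
# Octal codes as Emacs reports them in "\nnn is undefined".
codes = {"eszett": "237", "A uml": "204", "O uml": "226", "U uml": "234"}

# int(text, 8) parses the digits as a base-8 number.
converted = {name: int(octal, 8) for name, octal in codes.items()}

for name, value in converted.items():
    print(f"{name}: \\{codes[name]} = hex {value:02x}, dec {value}")
```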

If I am right, the keys thus send:

eszett: c3 9f
A uml: c3 84
O uml: c3 96
U uml: c3 9c
a uml: c3 a4
o uml: c3 b6
u uml: c3 bc

I would assume that these sequences are the UTF-8 representation of
the respective characters (but I don't have a table to figure that
out).
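
In the absence of a table, the UTF-8 byte sequences can be generated
directly. A minimal Python sketch (Python 3.8 or later is assumed for
bytes.hex with a separator; the labels are the ones used in this
mail):

```python
# The seven characters in question, written as Unicode escapes so the
# script does not depend on the encoding of the terminal or editor.
chars = {
    "eszett": "\u00df",
    "A uml": "\u00c4",
    "O uml": "\u00d6",
    "U uml": "\u00dc",
    "a uml": "\u00e4",
    "o uml": "\u00f6",
    "u uml": "\u00fc",
}

# Encode each character as UTF-8 and show the bytes as spaced hex.
utf8_hex = {name: ch.encode("utf-8").hex(" ") for name, ch in chars.items()}

for name, hexbytes in utf8_hex.items():
    print(f"{name}: {hexbytes}")
```

Every sequence should begin with c3, because all seven characters lie
in the range U+00C0 to U+00FF.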

Sorry if the whole thing was difficult to follow. I should perhaps have
mentioned that for the upper-case umlauts and the eszett, Emacs not
only complains, but also inputs an "unknown" character into the
buffer, represented by a '?' in reverse video. That's apparently the
hex c3 byte.

  > Emacs always uses the encoding of the file, whereas a redirect
  > uses the locale, iirc.

I know; normally it can figure it out - I think this ability is not
compromised in any way (I can e.g. open an XML file encoded in utf-8,
and will see "11u" in the mode line). Also, please note that under X,
Emacs behaves completely as before.

By "redirect", you mean shell redirection?  Does that do any character
conversion?

  > I assume you know the options->mule menu in emacs, there is a lot to
  > help with encoding issues...

Yes, I know, but I don't see how set-input-method would fix this. Do you?

  > > As to the locale, where can I look that up ... ?
  > .bashrc

Neither ~/.bashrc nor /etc/bash/bashrc contains any locale setting
... hmm.
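
Since the locale may be set somewhere other than the bashrc files (on
Gentoo it is commonly a file under /etc/env.d, though that is an
assumption about this system), it can be easier to ask the running
environment. The sketch below follows the usual POSIX precedence for
the character-type category: LC_ALL, then LC_CTYPE, then LANG, with
empty values ignored. The function name effective_ctype is made up
for this example.

```python
import os

def effective_ctype(env):
    """Return (variable, value) that decides the character encoding locale."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:  # unset or empty variables are skipped
            return var, value
    # Nothing set: the default "C" locale applies.
    return None, "C"

print(effective_ctype(os.environ))
```

Running this once on the console and once under X would show whether
the two environments differ at all in their locale.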

But very frankly, would the solution not focus on the kernel, at least
partly? As I said, I can reverse the phenomenon by simply booting the
old kernel!

Does nobody know where the kernel controls what the keys of the
console keyboard send when pressed?

(BTW, KEYMAP="de-latin1-nodeadkeys", in /etc/conf.d/keymaps.)

Regards, Florian




