Numbered SGML entities in header addresses
felixs
besteck455 at gmail.com
Tue Apr 9 11:11:59 UTC 2019
On Mon, Apr 08, 2019 at 08:40:09PM -0700, Ian Zimmerman wrote:
> On 2019-04-07 23:13, felixs wrote:
>
> > > From: "Foo Bariì" <foo-baric at gmail.com>
> > >
> > > where the entity refers to the character U0107 in Unicode code point
> > > space. I would like to automatically see the correct glyph at least
> > > when it is in one of the visible headers. Is there a display filtering
> > > feature in mutt that would allow me to do that (I don't mind if it
> > > requires a bit of configuration)?
> >
> > And if you add
> >
> > set charset="utf-8"
> >
> > to your muttrc conf file?
>
> That doesn't look at all plausible to me. For one thing, UTF-8 is the
> systemwide default, meaning it ends up in my LANG and LC_ variables. I
> am as sure as I can be about anything that mutt picks up those if the
> "charset" mutt variable is not set.
Yes, you are right: mutt reads the LC_* variables and is usually able to
display characters in UTF-8 if they select it. But in case of problems,
which I thought you might have, it may help to set charset explicitly.
> For another thing, why should it help? Those ASCII characters are
> perfectly valid in the name part of a From header, and normally I expect
> mutt to show them to me as they are. It is only in this case where some
> HTML-addled MUA decided to use them together to encode a ISO 8859-2
> character (_not_ UTF-8 or anything related) that I want a way to see the
> character really intended by the sender.
You asked for a "display filter" setting in mutt that would let you see
the "real" character, that is, a character defined in the Unicode
Character Database. Even if the message you received was encoded in
ISO-8859-2, mutt would convert it to Unicode (usually UTF-8) when opening
the message, provided your LC_* variables are set correctly to use it.
Or, see above, set charset explicitly to be sure.
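To illustrate what such a charset conversion does, here is a minimal Python
sketch. It only mimics the idea and is not mutt's actual code; the sample
name and the entity spelling "&#x107;" are assumptions for illustration.

```python
# Sketch only: mimics a charset conversion, not mutt's implementation.

# A raw ISO-8859-2 byte: 0xE6 is "c with acute" (U+0107) in that charset.
raw = b"Foo Bari\xe6"
text = raw.decode("iso-8859-2")   # bytes -> Unicode string
utf8 = text.encode("utf-8")       # Unicode string -> UTF-8 bytes

# A numeric entity spelled out in plain ASCII (hypothetical sample).
# Charset conversion has nothing to translate here, so it stays as typed.
entity = b"Foo Bari&#x107;"
entity_text = entity.decode("iso-8859-2")

print(text)
print(utf8)
print(entity_text)
```

The raw byte becomes the intended character, while the ASCII entity passes
through the conversion unchanged.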
Please take note that I did not reproduce your issue. So I actually do
not know why this happens in your case. Do you have some more information?
To find out by other means which character was intended, you could use
Python's chr() function. Since chr() takes an integer, you first have to
convert the hexadecimal value into one:
chr(int('0x0107', base=16))
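Going one step further, Python's standard html module can decode numeric
character references inside a whole string. This is only a sketch: the
helper name decode_entities and the sample header values are made up for
illustration, and how to hook something like this into mutt is left open.

```python
import html


def decode_entities(header_value: str) -> str:
    """Replace HTML/SGML character references with the characters they name."""
    # html.unescape handles decimal (&#263;) and hexadecimal (&#x107;) forms.
    return html.unescape(header_value)


# Hypothetical display names, written once with a hexadecimal and once
# with a decimal reference to U+0107.
print(decode_entities("Foo Bari&#x107;"))
print(decode_entities("Foo Bari&#263;"))
```

Both calls should print the name with the intended character U+0107 at the
end, assuming the terminal can display it.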
Maybe I can find some other way using just mutt's conf options.
Patience, please. :-)
> Nevertheless, your suggestion should do no harm, so I'll try it and
> report back.
Ok.
Cheers,
felixs