Composing in utf8 from latin1 terminal

nunojsilva at ist.utl.pt
Thu Oct 25 11:53:43 UTC 2018


On 2018-10-24, Derek Martin wrote:

> On Tue, Oct 23, 2018 at 10:31:45PM +0100, Nuno Silva wrote:
>> On 2017-10-12, nunojsilva at ist.utl.pt wrote:
>> 
>> > Recently, I have tried to use mutt on a non-utf8 terminal.  Everything
>> > works as expected in a utf8 environment, but when I compose new e-mails
>> > in a latin1/ISO-8859-1 terminal, mutt will expect the file to be in the
>> > same encoding as the terminal, while my text editor will save the file
>> > in utf8. The result is that non-ASCII characters get misinterpreted,
>> > which can affect the message headers as well (e.g. real names in To: and
>> > Cc:).
>> [...]
>> > Is there some way to configure mutt so that it always uses utf8 to read
>> > the new message after I exit the editor? Or a way to enable some
>> > encoding autodetection that can tell utf8 apart from latin1?
>
> The bottom line is that your environment is misconfigured.  If you
> want this to work, you need to have LANG set properly at every point
> along the execution path.  Your terminal, terminal font, editor, and
> Mutt all need to know that you're using latin1 instead of Unicode, by
> having been started with a latin1 LANG setting.  You may need to
> configure your terminal to use the correct font, although with many
> modern terminals (like gnome-term, kterm, etc.) this should be
> unnecessary.
>
> If you are launching the latin1 terminal from a shell that has its
> LANG set to UTF-8, it could break (an example of this is starting
> hanterm, a terminal program expressly for Korean input with EUC-KR,
> with a UTF-8 locale--won't work).  If the shell running inside the
> terminal has LANG set to UTF-8, both Mutt and your editor could break.
> If you have manual settings on any of these programs to override the
> locale defined by the environment, it could break.  If you don't have
> all of these things set the same way, it could, and almost certainly
> will, break.  Sometimes the breakage is subtle, e.g. if you dump the
> right characters to a terminal (say, with the cat command) that has the
> right font, it will generally display them correctly, even if the
> locale is wrong.  But using them with programs that need to know the
> locale will still break.
>
> If you're using a Mutt setting to connect to an existing emacs
> instance (via emacsclient or similar) that's already running in a
> UTF-8 locale, that's broken.  You need to start a new instance of
> emacs whose locale is latin1.

I hadn't noticed this before, but there *is* indeed a difference when
starting a fresh Emacs instance instead of connecting to an existing
one using emacsclient: the new instance does use latin1 to read/write
files. (That is, the behaviour mutt expects.)

When I use emacsclient, the interface locale is not broken: the terminal
I/O encoding is correctly set from the locale. The only difference (that
I know of) is that Emacs will use utf8 to read/write files. If this
should match the terminal encoding, then it *is* broken.
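
To make the mismatch concrete, here is a minimal Python sketch of what I
believe is happening. It is only an illustration, not how mutt or Emacs
work internally; the sample name and guess_decode() are made up for the
example. It also shows one way to tell the two encodings apart: try
strict UTF-8 first and fall back to latin1, on the assumption that
latin1 text with non-ASCII characters is rarely valid UTF-8.

```python
# Illustration only: the sample text and guess_decode() are hypothetical.
sample = "João"

# The editor saves the file as UTF-8 ...
saved_by_editor = sample.encode("utf-8")

# ... and the reader assumes the terminal's encoding, latin1.
misread = saved_by_editor.decode("latin-1")
print(misread)  # JoÃ£o


def guess_decode(raw):
    """Return (text, encoding), trying strict UTF-8 before latin1.

    latin1 can decode any byte sequence, so it has to come last.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode("latin-1"), "latin-1"


print(guess_decode(sample.encode("utf-8")))    # ('João', 'utf-8')
print(guess_decode(sample.encode("latin-1")))  # ('João', 'latin-1')
```

The fallback order matters: every byte sequence is valid latin1, so a
detector that tried latin1 first would never report utf8.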

I might be happy with the way things are now (as my files are usually in
utf8, and mutt is the only context where I need the file encoding to
match the terminal), but I won't claim it isn't broken if it is.

> Lastly, you may need to adjust send_charset in Mutt.  It can list
> multiple character sets, and Mutt will pick the first one that your
> message can be encoded in.  For example, mine is:
>
>   set send_charset="iso-8859-1:utf-8"
>
> If my e-mail contains no characters that need UTF-8, Mutt will choose
> to send the message as iso-8859-1, but otherwise as UTF-8.
>
> If you do those things, it should "just work" and if you don't it
> won't, at least without jumping through pointless hoops to force it,
> which will most likely just break other things.

send_charset appears to be working correctly here; I checked it a
couple of days ago. It isn't even set in any configuration file, so I
suppose it is using the default setting.
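
For reference, the manual lists the default as
"us-ascii:iso-8859-1:utf-8". As I understand the selection rule, it can
be sketched in Python as below. This is my own illustration, not mutt's
code: pick_send_charset() and its fallback parameter are names I made
up, with the fallback standing in for $charset, which the manual says
mutt uses when no entry fits exactly.

```python
# Illustration only: pick_send_charset() is a hypothetical helper,
# not part of mutt.
def pick_send_charset(body, send_charset, fallback="utf-8"):
    """Return the first charset in the colon-separated list that can
    encode the whole message body, or the fallback if none can."""
    for charset in send_charset.split(":"):
        try:
            body.encode(charset)
            return charset
        except UnicodeEncodeError:
            continue
    return fallback


default = "us-ascii:iso-8859-1:utf-8"
print(pick_send_charset("hello", default))       # us-ascii
print(pick_send_charset("João", default))        # iso-8859-1
print(pick_send_charset("10 \u20ac", default))   # utf-8 (euro sign)
```

With the "iso-8859-1:utf-8" setting quoted above, a pure-ASCII message
would come out as iso-8859-1 instead, since us-ascii is not in the list.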

For now, I will leave the Emacs hack in place, as I prefer to use the
Emacs "server instance" instead of creating a new one. Everything else
is hopefully correctly set, as this has been the only encoding problem
I've had in the past months. (Now that I've said this, I will probably
discover a new one tomorrow...)

-- 
Nuno Silva


