How to send with charset=iso-8859-1 instead of unknown-8bit?

Derek Martin invalid at pizzashack.org
Sun Jun 24 03:04:27 UTC 2018


On Thu, Jun 21, 2018 at 10:42:58AM +1000, mutt at raf.org wrote:
> Ian Zimmerman wrote:
> > My guess is that mutt looks at the locale environment (LANG and LC_*) to
> > set the encoding of the source data, and tries to recode it into one
> > of the encodings in send_charset.
> > 
> > If you _know_ your data is iso-8859-1 but your LANG etc. is something
> > else, try changing LANG locally in your driver script/program.
> 
> Thanks. I'll try that. Mutt probably detects that it isn't valid utf-8,
> and so doesn't match the system locale, and so can't be converted.
> That would make sense.

Yes.  Mutt assumes that its input is correctly encoded according to
the system's configured locale.  The conversion you're not clear about
is that when it sends a message, it tries to convert from your
configured locale, whatever it is, to each of the character sets in
the send_charset variable, in order, until it finds one for which the
conversion does not fail.  It sends that converted version.  The
default setting therefore should usually guarantee, for all
English-speaking users and many non-English-speaking European users,
that the outgoing e-mail will be encoded in the simplest encoding
possible, given its contents; i.e. it will send US-ASCII if it can,
then iso-8859-1, unless full Unicode support is required to represent
the data.
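
As a rough illustration of that "first charset that works" rule, here is a
minimal Python sketch.  It is not Mutt's code (Mutt is written in C and
converts via iconv from $charset); the function name pick_send_charset is
made up for this example, and it starts from already-decoded text rather
than raw bytes.  The default value shown is Mutt's documented default for
send_charset.

```python
def pick_send_charset(text, send_charset="us-ascii:iso-8859-1:utf-8"):
    """Return (charset, encoded_bytes) for the first charset in the
    colon-separated list that can represent every character of text.

    Hypothetical helper: it only models the ordering rule described above.
    """
    for name in send_charset.split(":"):
        try:
            # A failed conversion raises UnicodeEncodeError; move on to the
            # next, more capable, charset in the list.
            return name, text.encode(name)
        except UnicodeEncodeError:
            continue
    raise ValueError("no charset in send_charset can represent the text")


if __name__ == "__main__":
    for sample in ("hello", "caf\u00e9", "\ud55c\uad6d\uc5b4"):
        charset, _ = pick_send_charset(sample)
        print(repr(sample), "->", charset)
```

Plain ASCII text stops at the first entry, text with Western European
accents stops at iso-8859-1, and anything else falls through to utf-8.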

Your problem most likely is exactly what you guessed:  Your locale is
UTF-8, but the data is not valid UTF-8, so all conversions failed.
Mutt just sends the bytes you fed it, and it appears in your case it
failed to default to a reasonable character set (rather than
$charset), for whatever reason related to your combination of locale
and Mutt settings.
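
To see why the conversion fails, it may help to check the bytes directly.
The sketch below (Python; is_valid_utf8 is a name invented for this
example) shows that text encoded as iso-8859-1 is generally not decodable
as UTF-8 once it contains a non-ASCII character, which is the situation a
UTF-8 locale runs into with such data.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same word in two encodings: one byte for the accented letter in
# iso-8859-1, two bytes in UTF-8.
latin1_bytes = "caf\u00e9".encode("iso-8859-1")
utf8_bytes = "caf\u00e9".encode("utf-8")

if __name__ == "__main__":
    print(latin1_bytes, is_valid_utf8(latin1_bytes))
    print(utf8_bytes, is_valid_utf8(utf8_bytes))
```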

> I wonder if setting charset to iso-8859-1 would also fix it.

This might work, but it's not the "right" fix...  The right fix is to
make sure that the data you're sending actually matches your system's
locale settings.  Note that given the default send_charset settings,
it should be possible for you to actually use UTF-8, and have mutt
convert the e-mail to iso-8859-1, if you really want that for some
reason, since it will try to use the first matching charset in your
send_charset to which it can successfully convert the data, as I
described above.  I used to do this for Korean that I drafted in UTF-8,
since at the time a lot of Koreans still had systems (Win98) that only
supported EUC-KR (WinXP had been out for years, but some people are
extremely slow to update their systems)...

But if you're using Unicode locally, why not just send it as UTF-8 and
be done with it?  These days it should be just about impossible to
find people using e-mail on systems that can't handle Unicode.  And if
the only reason is that the data is already in ISO-8859-1 and you
don't know how to convert it, that's easy to fix:  Just use the iconv
command (iconv is both a C library and a system command).  You can
just convert it from iso-8859-1 to UTF-8 once and stop mucking
with incompatible character set settings.  See the man page for
details, but it's pretty simple... you just specify the input
character set and the output character set.
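
On the command line that would typically look something like
"iconv -f ISO-8859-1 -t UTF-8 input.txt > output.txt" (the file names are
placeholders).  If the data is produced by a script anyway, the same
one-time conversion can be sketched in Python; the function names below
are invented for this example, and it assumes the input really is
iso-8859-1.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode iso-8859-1 bytes as UTF-8.

    Decoding as iso-8859-1 cannot fail, since every byte value maps to a
    character, so this is only correct if the input really is iso-8859-1.
    """
    return data.decode("iso-8859-1").encode("utf-8")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as iso-8859-1 bytes and write UTF-8 to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(latin1_to_utf8(data))


if __name__ == "__main__":
    print(latin1_to_utf8(b"caf\xe9"))
```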

> That's for the terminal so it's probably not wise to change that.
> Maybe assumed_charset? (maybe that's only for incoming messages).

You should really never set charset explicitly.  If your system is
configured properly, there's virtually never any need to do it, as
Mutt will correctly use your system's locale settings, which would be
the preferred way to make sure things are set up correctly.  The main
exception is if you have a large pile of pre-existing data that's in
some other charset besides the one you use, which you'll use in some
fashion other than typing it in manually, and converting it would be
prohibitively costly.

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address.  Replying to it will result in
undeliverable mail due to spam prevention.  Sorry for the inconvenience.
