Filter script to remove html, fullquotes and header lines

Cameron Simpson cs at cskk.id.au
Sun Mar 20 21:46:52 UTC 2022


On 20Mar2022 13:36, Martin Trautmann <traut at gmx.de> wrote:
>do you know about any mutt script that would go from message to message
>and
>
>1) remove a html part if a plain text part is given
>
>2) remove all trailing lines,
>   starting with a quote sign ">"
>   and at least e.g. 10 occurrences
>
>  such as (^>[.*][\r\n]){9,} before the end of the message
>
>  Maybe I could append xzxzxzx to the end of the message first, delete
>  a fullquote up to there and remove xzxzxzx again?
>
>  Bonus: Do not remove fullquotes for messages without in-reply-to or
>  references headers.
>
>3) remove header lines which are longer than 5 lines
>
>I want to shrink the size of some mailboxes for archive purposes, 
>without throwing away too much.

I think you'll have to write your own.

At minimum you need a full mail message parser so that you are not 
filtering, say, base64 or QP content incorrectly. So you want something 
which scans a mailbox and, for each message:
- decodes it completely
- applies your filters
- assembles the new message
and then writes the result out to a new mailbox (so it isn't destructive 
and can be compared to the original - you don't want to accidentally 
shred your archive).
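
The loop described above might be sketched as follows. This is a minimal 
sketch, assuming the archive is in mbox format; the function name 
copy_filtered and the idea of passing filters as a list of callables are 
illustrative assumptions, not something from mutt or from an existing 
script. The source mailbox is only read; every message goes to a 
separate output file.

```python
import mailbox


def copy_filtered(src_path, dst_path, filters):
    """Copy each message from src_path to dst_path, applying the filters.

    Each filter is a callable taking a message and returning a message.
    The source mbox is never modified, so the result can be compared
    against the original afterwards.
    """
    src = mailbox.mbox(src_path, create=False)  # fail if the archive is missing
    dst = mailbox.mbox(dst_path)                # created if it does not exist
    count = 0
    try:
        for msg in src:                # the mailbox module parses each message
            for apply_filter in filters:
                msg = apply_filter(msg)
            dst.add(msg)               # reassembled and appended to the copy
            count += 1
        dst.flush()
    finally:
        dst.close()
        src.close()
    return count
```

Called as copy_filtered("archive.mbox", "archive-small.mbox", [...]) 
(both file names hypothetical), it returns the number of messages 
copied, which gives a first sanity check against the original.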

I'd do this in Python myself - it has a good email library and you can 
do all the things you describe fairly easily with it.
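
As a rough illustration of the three requested filters using the 
standard email package: the function names and thresholds below are 
assumptions made up for this sketch, and it deliberately stops short of 
re-encoding a modified body back into a message part (that step needs 
care with the charset and Content-Transfer-Encoding).

```python
def drop_html_alternative(msg):
    """1) Inside multipart/alternative, drop text/html if text/plain exists."""
    for part in list(msg.walk()):
        if part.get_content_type() != "multipart/alternative":
            continue
        kids = part.get_payload()
        if any(k.get_content_type() == "text/plain" for k in kids):
            part.set_payload(
                [k for k in kids if k.get_content_type() != "text/html"])
    return msg


def decoded_text(part):
    """Return the body of a non-multipart part with base64/QP undone."""
    raw = part.get_payload(decode=True)           # bytes, transfer-decoded
    charset = part.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")


def has_reply_headers(msg):
    """Bonus rule: only replies should have their fullquote removed."""
    return "In-Reply-To" in msg or "References" in msg


def strip_trailing_quote(text, min_lines=10):
    """2) Remove a trailing run of at least min_lines lines starting with '>'."""
    lines = text.splitlines()
    end = len(lines)
    while end > 0 and not lines[end - 1].strip():
        end -= 1                                  # ignore trailing blank lines
    start = end
    while start > 0 and lines[start - 1].startswith(">"):
        start -= 1
    if end - start < min_lines:
        return text                               # too short to be a fullquote
    return "\n".join(lines[:start]) + "\n"


def drop_long_headers(msg, max_lines=5):
    """3) Delete headers whose folded value spans more than max_lines lines.

    Note: deleting by name removes every header with that name, so one
    long Received: line takes all other Received: lines with it.
    """
    for name, value in msg.items():
        if len(str(value).splitlines()) > max_lines:
            del msg[name]
    return msg
```

Working on the decoded text rather than the raw file is what keeps a 
pattern like ^> from matching inside base64 or quoted-printable data. 
Note that drop_html_alternative leaves a multipart/alternative wrapper 
with a single part, which is valid but not minimal.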

Cheers,
Cameron Simpson <cs at cskk.id.au>

