Re: UUencode bug. UUencoding being corrupted in 1.48

From: Bjarni R. Einarsson (
Date: Fri 08 Feb 2002 - 12:55:31 UTC

    On 2002-02-08, 11:24:57 (+1100), Peter Williams wrote:
    > G'Day,
    > I email to bugs however I figured I should email here too.
    > I've upgraded to 1.48 from 1.45. since then I've found that some
    > UUencoded attachments are being corrupted.

    I'm looking into this and a few other uuencode-related issues at the
    moment. Sorry I didn't reply to your initial bug report; I get a lot
    of mail.

    Your problem is basically that your uuencoded attachments aren't being
    recognized as uuencoded by the sanitizer, because the begin line is
    "begin 0 blabla" instead of "begin 000 blabla". Since the sanitizer
    doesn't recognize the content as being uuencoded, it proceeds to
    HTML-defang the uuencoded content, leading to the problems you've
    described.
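
    As a rough illustration of the mismatch, here is a minimal Python
    sketch. It is not the sanitizer's actual code: the pattern and the
    name is_strict_begin are assumptions, modelling only a check that
    requires a 3 or 4 digit octal mode on the begin line.

```python
import re

# Hypothetical strict pattern (an assumption, not the sanitizer's real
# regex): "begin", a 3 or 4 digit octal mode, then a file name.
STRICT_BEGIN = re.compile(r"^begin [0-7]{3,4} \S.*$")


def is_strict_begin(line: str) -> bool:
    """Return True if the line passes the strict begin-line check."""
    return STRICT_BEGIN.match(line) is not None


if __name__ == "__main__":
    for candidate in ("begin 000 blabla", "begin 0 blabla"):
        print(candidate, "->", is_strict_begin(candidate))
```

    Under such a check, "begin 0 blabla" is treated as ordinary text, so
    the lines that follow it are processed as text too.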

    This can actually be considered a feature, not a bug: if the
    sanitizer doesn't detect the uuencoded attachment, then the
    attachment may have bypassed your security policies, and corrupting
    it is the "safe" thing to do. ;-) There's always a silver lining...

    Anyway, improvements I'm considering for 1.49 to deal with this, and
    related issues:

     - Make the uuencode detection accept single digit file modes or even
       null modes (begin filename), instead of mandating 3 or 4 digit
       modes as it currently does. This makes it compatible with what
       some (all?) stupid versions of Outlook do.

     - Make the uuencode detection perform a look-ahead, to check if the
       following line really is uuencoded content, to decrease the odds of
       falsely entering uuencoded-mode. *If* a "begin mode name" is
       detected but not really followed by uuencoded content, then escape
       the line so Outlook won't think it's an attachment.

     - Possibly disable HTML defanging of unrecognized/overlong tags
       within parts which aren't clearly recognized as HTML.
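
    The first two ideas above could be sketched roughly as follows, in
    Python. This is only an illustration under stated assumptions, not
    the planned 1.49 code: RELAXED_BEGIN, looks_uuencoded and
    classify_begin are hypothetical names, the look-ahead assumes
    standard uuencode lines (a length character followed by groups of
    four characters in the range space through backtick, with no
    trailing padding stripped), and how a line would actually be escaped
    is left out.

```python
import re

# Hypothetical relaxed pattern: the mode is optional and may be 1 to 4
# octal digits, so "begin 0 name" and "begin name" are both accepted.
RELAXED_BEGIN = re.compile(r"^begin(?: ([0-7]{1,4}))? (\S.*)$")


def looks_uuencoded(line: str) -> bool:
    """Look-ahead check: does this line have the shape of uuencoded data?"""
    if not line:
        return False
    # Every character must lie in the uuencode alphabet (space to backtick).
    if any(not (32 <= ord(ch) <= 96) for ch in line):
        return False
    # The first character encodes how many raw bytes the line carries.
    nbytes = (ord(line[0]) - 32) & 0x3F
    # Each group of up to 3 raw bytes becomes exactly 4 encoded characters.
    expected = ((nbytes + 2) // 3) * 4
    return len(line) - 1 == expected


def classify_begin(lines, i):
    """Classify lines[i] as 'attachment', 'escape' or 'text'.

    'attachment': a begin line followed by plausible uuencoded data.
    'escape':     a begin line not followed by uuencoded data, which
                  should be escaped so mail clients ignore it.
    'text':       not a begin line at all.
    """
    if RELAXED_BEGIN.match(lines[i]) is None:
        return "text"
    following = lines[i + 1] if i + 1 < len(lines) else ""
    return "attachment" if looks_uuencoded(following) else "escape"


if __name__ == "__main__":
    message = ["begin 0 blabla", "#0V%T", "`", "end"]
    print(classify_begin(message, 0))
```

    In this sketch a line such as "begin the meeting at noon" followed by
    ordinary prose would be classified as "escape", while "begin 0
    blabla" followed by real uuencoded data would enter uuencoded mode.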

    An explanation of this silliness (and some other rather funny
    information as well) is here:

    Bjarni R. Einarsson                           PGP: 02764305, B7A3AB89                -><-    

    Check out my open-source email sanitizer: Spammers, please send plenty of email to: