anomy-bugs

base64 encodings with linelength % 4 != 0

From: Joerg Lenneis (16095@xyz.molar.is)
Date: Tue 06 Aug 2002 - 21:48:20 UTC


    Dear all,

    After encountering oddly corrupted attachments from the Sanitizer, I
    dug into the source and ended up at the following function in
    MIMEStream.pm:

    sub DecodeBase64
    {
            my $reader = shift;
            my $line = shift;

            # This hacks the decoder to handle mangled Base64 text properly, by
            # properly ignoring white space etc. Note that this will lose the
            # last 1-3 bytes of data if it isn't properly padded. We also record
            # the encoded line-length, so we can re-encode stuff using the same
            # length.
            #
            if (!$reader->{"DecodeBase64llen"})
            {
                    $line =~ s/[^A-Za-z0-9\/+\012=]+//gs;
                    
    > my $nlpos = int((3*(index($line, "\012") + 1)) / 4);
                    $line =~ s/\012//gs;

    > my $llen = int((3*length($line)) / 4);
                    my $t = $llen;
                    $t = $nlpos if (($nlpos < $llen) && ($nlpos > 0));

                    $reader->{"DecodeBase64llen"} = $t;
            }
            else
            {
                    $line =~ s/[^A-Za-z0-9\/+=]+//gs;
            }
            $line = $reader->{"DecodeBase64"} . $line;
    > $line =~ s/^((?:....)*)(.*?)$/$1/s;
            $reader->{"DecodeBase64"} = $2;

            return decode_base64($line);
    }

    All lines marked with > presuppose that the line length of a
    base64-encoded attachment is a multiple of four. Unfortunately, the
    mailer that caused the corruption uses lines of 70 characters (all
    other mailers I have encountered so far use a line length that is
    divisible by four). From my reading of RFC 2045 that seems to be
    legal: it limits encoded lines to at most 76 characters but does not
    require the length to be a multiple of four. I fear that the changes
    to the Sanitizer needed to accommodate this behaviour would be quite
    substantial, since there is an inherent assumption that a line never
    has anything to do with the following line. Any comments?
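
    To make the arithmetic concrete, here is a minimal standalone sketch
    (not a patch for MIMEStream.pm). It shows that a 70-character line
    does not correspond to a whole number of decoded bytes, and that a
    decoder which carries leftover characters from one line to the next
    still decodes such input correctly. The names bytes_per_line,
    decode_chunk and the "carry" key are hypothetical and exist only for
    this illustration; only MIME::Base64 is real.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64 encode_base64);

# Decoded bytes represented by one encoded line of $chars characters.
# The result is a whole number only when $chars is a multiple of four.
sub bytes_per_line {
    my ($chars) = @_;
    return 3 * $chars / 4;
}

# Hypothetical streaming decoder (illustration only): $state is a hash
# ref that carries the 0-3 leftover characters from one call to the next.
sub decode_chunk {
    my ($state, $line) = @_;
    $line =~ s{[^A-Za-z0-9/+=]+}{}g;           # drop newlines and other noise
    $line = ($state->{carry} // '') . $line;   # prepend earlier leftovers
    my $usable = length($line) - length($line) % 4;
    $state->{carry} = substr($line, $usable);  # keep the incomplete group
    return decode_base64(substr($line, 0, $usable));
}

# Build test data, encode it without line breaks, re-wrap at 70 characters.
my $data    = join '', map { chr($_ % 256) } 0 .. 299;
my $encoded = encode_base64($data, '');
my @lines   = unpack '(A70)*', $encoded;

my %state;
my $decoded = join '', map { decode_chunk(\%state, "$_\n") } @lines;

printf "76 chars -> %s bytes, 70 chars -> %s bytes\n",
    bytes_per_line(76), bytes_per_line(70);
print $decoded eq $data ? "round trip ok\n" : "round trip FAILED\n";
```

    With 70-character lines every other line leaves two characters over,
    so any per-line byte count (such as a line length recorded for
    re-encoding) cannot be exact, even if the decoded data itself is
    intact.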

    -- 
    

    Joerg Lenneis 16095@xyz.molar.is


