Re: Bug With Newlines and PDF Attachments

From: Bjarni R. Einarsson
Date: Mon 08 Sep 2003 - 20:56:27 GMT

    On 2003-09-08, 11:39:51 (-0700), Stephen J. Schmidt wrote:
    > received recently have been corrupted. I traced the problem to
    > something in the f-prot software adding an extra \r (carriage return)

    Strictly speaking, this is not caused by the F-Prot software itself,
    but by the Anomy modules, which are an open source product sponsored
    by F-Prot. Not that it makes much of a difference in this case - I
    (the Anomy author) work for FRISK and will of course do my best to
    fix any problems like this. :-)

    > character before several of the \n (newline) characters in the
    > attachment. I think that the Anomy software is incorrectly detecting
    > that the binary PDF file is a text file whose newline characters it can
    > change. If I'm right about this problem, I think it's very dangerous
    > for the F-Prot software to auto-detect the newline convention like this,
    > because it is obviously wrong in many cases.

    This is a known problem which is caused by certain mailers
    incorrectly encoding PDF files as if they were text, using
    quoted-printable encoding, which is not safe for use with binaries.

    Anomy has no choice but to "auto-detect" the newline convention in
    these cases, because quoted-printable encoding does exactly that: it
    replaces the sender's actual newlines with "semantic" newlines, which
    the receiving end has to map back to a concrete byte sequence. This
    is a "feature" of QP encoding meant to make sharing text between
    systems with different newline conventions easier. But this of
    course causes major problems when used on binary attachments.

    Anomy literally has no way to know what the newline convention of
    the incoming data is, and any assumption we make is going to be
    wrong in some cases. Such is life when dealing with broken e-mail
    clients. :-/

    A partial solution would of course be to change the Anomy code's
    behavior so that it has the same idea of what a "newline" is as the
    broken mailers causing this problem - but that would mean messages
    sent by other systems, with other ideas of what newlines are, get
    broken in the same way.
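
    To make the ambiguity concrete, here is a minimal Python sketch.
    It is not Anomy code: encode_as_text and decode are hypothetical
    helpers invented for this illustration, and the encoder ignores
    QP details such as line-length limits and trailing whitespace.
    It models two broken mailers that disagree about what a newline
    is, and shows that no single assumption on the decoding side
    recovers the original bytes from both.

```python
import re


def encode_as_text(data: bytes, sender_newline: bytes) -> bytes:
    """Hypothetical broken mailer: QP-encode binary data as if it were text.

    Every occurrence of sender_newline is treated as a line break and is
    sent as a hard CRLF line break; other unsafe bytes become =XX escapes.
    """
    wire_lines = []
    for chunk in data.split(sender_newline):
        line = bytearray()
        for byte in chunk:
            if byte == 0x3D or byte < 0x20 or byte > 0x7E:
                line += b"=%02X" % byte  # escape '=', control and 8-bit bytes
            else:
                line.append(byte)
        wire_lines.append(bytes(line))
    return b"\r\n".join(wire_lines)


def decode(wire: bytes, assumed_newline: bytes) -> bytes:
    """Decode QP, mapping each hard line break to assumed_newline."""

    def unescape(match):
        return bytes.fromhex(match.group(1).decode("ascii"))

    lines = wire.split(b"\r\n")
    return assumed_newline.join(
        re.sub(rb"=([0-9A-F]{2})", unescape, line) for line in lines
    )


# A made-up binary payload containing both a CRLF pair and a bare LF.
payload = b"%PDF\r\n\x00\xff\nend"

unix_wire = encode_as_text(payload, b"\n")    # mailer that thinks LF is a newline
dos_wire = encode_as_text(payload, b"\r\n")   # mailer that thinks CRLF is a newline

# Each message survives only if the decoder guesses the sender's convention.
print(decode(unix_wire, b"\n") == payload)    # guess matches the sender
print(decode(unix_wire, b"\r\n") == payload)  # extra \r inserted before each \n
print(decode(dos_wire, b"\r\n") == payload)   # guess matches the sender
print(decode(dos_wire, b"\n") == payload)     # \r dropped from the CRLF pair
```

    The second case is the corruption described in the report: the
    line breaks come back as CRLF, so an extra \r appears before
    each \n that the sender treated as a newline.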

    I'll look into this in more detail tomorrow and discuss it with my
    colleagues at work.

    Bjarni R. Einarsson                           PGP: 02764305, B7A3AB89