anomy-list

Re: The QP encoding issue - again!

From: Noel Clarkson (103025@xyz.molar.is)
Date: Tue 18 Nov 2003 - 23:42:33 GMT

    Hi Bjarni,

    Thanks for the work on this. Anything that reduces the pain will be a
    good thing (and whilst I agree that we should just get people to use
    non-broken mail clients, it looks like a big task from here!!). You
    mention that there is a way to do this using the next level up
    (procmail etc.) but that it requires more I/O. I'm not that worried
    about the extra overhead, but I have not been able to work out how to
    do this, and from comments on the list I understand that a number of
    others are in the same boat. I've managed to piece a few things
    together, but I don't see the whole picture and so can't get this to
    work without some more pointers.

    The system I'm on has Mac, Unix, and Windows clients, so any fix that
    bends to only one of these will cause grief for the others. My
    understanding is that if I at least use procmail to leave the message
    untouched, then when people complain I can tell them that the problem
    is their broken email client (at the moment some of them get the same
    message at home as at work, and it only breaks on the work computer,
    but I'm not going to stop scanning email at work). Would it be
    possible to give an example of how to use procmail (or any of the
    other next-step-up tools; I could try to work procmail out from there)
    to overcome this issue?

    I don't want to add to your workload, and I'm happy if someone else can
    respond instead (though similar requests in the past from various people
    haven't got far). I was hoping to post the solution that I came up with
    when I was working on this a while ago but, as I said, I never got it to
    work.

    cheers,

    noel

    Bjarni R. Einarsson wrote:
    > On 2003-11-17, 14:50:05 (-0600), Dustin Puryear wrote:
    >
    >>Bjarni, we had actually considered this very solution. That is, that if a
    >>binary file is QP encoded to assume that it was done by a Windows client.
    >>However, we decided to just try and stop Anomy from making the CRLF/LF
    >>conversion in the first place. Good idea?
    >
    >
    > No, because it won't work. Anomy isn't doing the CRLF/LF
    > conversion (corruption), the QP standard is. There IS some other
    > CRLF/LF stuff going on in Anomy, but it's not the cause of the
    > attachment corruption. The other stuff is Anomy trying to
    > compensate for the fact that some MTAs will give it data using a
    > CRLF newline convention, while others will give data using an LF
    > convention.
    >
    > To illustrate why this is all the QP standard's fault, consider
    > this three-character binary string:
    >
    > <CR><LF><CR>
    >
    > It actually has THREE different valid QP encodings, depending on the
    > source machine's newline convention:
    >
    > Unix: =0D<CR><LF>=0D (The CRs get encoded, LF becomes CRLF)
    > DOS: <CR><LF>=0D (The trailing CR gets encoded)
    > MAC: <CR><LF>=0A<CR><LF> (The LF gets encoded, CRs become CRLF)
    >
    > Now, if we decode each of those encoded strings using a Unix line-feed
    > convention, we get these results:
    >
    > Unix->Unix: <CR><LF><CR>
    > DOS->Unix: <LF><CR>
    > MAC->Unix: <LF><LF><LF>
    >
    > Fun, isn't it? :-)
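
    The example above can be reproduced with a small script. This is only
    a sketch of text-mode quoted-printable handling, not Anomy's code or
    the CPAN module's: the names NEWLINES, qp_encode_text and
    qp_decode_text are made up for illustration, and it ignores line-length
    limits and trailing-whitespace rules.

```python
# Illustrative text-mode quoted-printable codec (hypothetical helper names).
NEWLINES = {"unix": b"\n", "dos": b"\r\n", "mac": b"\r"}


def qp_encode_text(data: bytes, convention: str) -> bytes:
    """Encode as QP text: the sender's native newline becomes a bare CRLF."""
    native = NEWLINES[convention]
    out = bytearray()
    i = 0
    while i < len(data):
        if data.startswith(native, i):
            out += b"\r\n"            # native line break -> canonical CRLF
            i += len(native)
            continue
        byte = data[i]
        if byte == ord("=") or byte < 32 or byte > 126:
            out += b"=%02X" % byte    # unsafe bytes, including stray CR or LF
        else:
            out.append(byte)
        i += 1
    return bytes(out)


def qp_decode_text(encoded: bytes, convention: str) -> bytes:
    """Decode QP text: every bare CRLF becomes the receiver's native newline."""
    native = NEWLINES[convention]
    out = bytearray()
    i = 0
    while i < len(encoded):
        if encoded.startswith(b"=\r\n", i):
            i += 3                    # soft line break, produces no output
        elif encoded[i:i + 1] == b"=" and i + 2 < len(encoded) + 1:
            out.append(int(encoded[i + 1:i + 3], 16))
            i += 3
        elif encoded.startswith(b"\r\n", i):
            out += native
            i += 2
        else:
            out.append(encoded[i])
            i += 1
    return bytes(out)


original = b"\r\n\r"                  # <CR><LF><CR>
for sender in ("unix", "dos", "mac"):
    wire = qp_encode_text(original, sender)
    print(sender, wire, qp_decode_text(wire, "unix"))
# unix b'=0D\r\n=0D' b'\r\n\r'
# dos b'\r\n=0D' b'\n\r'
# mac b'\r\n=0A\r\n' b'\n\n\n'
```

    Decoding with the sender's own convention, for example
    qp_decode_text(qp_encode_text(original, "dos"), "dos"), gives back the
    original bytes, which is why only the cross-platform cases break.
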
    >
    > This simple example explains why sending binary data QP encoded is
    > a broken idea. As you can see, if a Windows or Mac user sends a
    > Unix guy a PDF binary which has been QP encoded, then the file will
    > probably not get decoded properly - even without Anomy being
    > involved. In general, sending binaries from one OS to another using
    > QP encoding just won't work.
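
    Before anything can be done about such messages, a filter has to spot
    them. A rough sketch using Python's standard email package follows; the
    function name qp_encoded_binary_parts and the sample message are
    invented for illustration, and treating every non-text part as "binary"
    is an assumption.

```python
from email import message_from_bytes
from email.message import Message


def qp_encoded_binary_parts(msg: Message) -> list:
    """Return names (or MIME types) of non-text parts sent as quoted-printable."""
    risky = []
    for part in msg.walk():
        if part.is_multipart():
            continue                  # containers carry no payload of their own
        cte = str(part.get("Content-Transfer-Encoding", "")).strip().lower()
        # Assumption: anything whose main type is not "text" is binary data.
        if cte == "quoted-printable" and part.get_content_maintype() != "text":
            risky.append(part.get_filename() or part.get_content_type())
    return risky


# Made-up sample message: a QP text body plus a QP-encoded PDF attachment.
RAW = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="XX"\r\n'
    b"\r\n"
    b"--XX\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"See attached.\r\n"
    b"--XX\r\n"
    b"Content-Type: application/pdf\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b'Content-Disposition: attachment; filename="report.pdf"\r\n'
    b"\r\n"
    b"%PDF-1.4=0D\r\n"
    b"--XX--\r\n"
)

print(qp_encoded_binary_parts(message_from_bytes(RAW)))
```

    A delivery script could use a check like this to decide which messages
    to pass through untouched.
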
    >
    > However, Anomy does magnify the problem and cause it to surface
    > more often. This is because internally, Anomy assumes Unix
    > linefeed conventions at the moment (this assumption is inherited
    > from the CPAN QP module), which means it causes new problems when
    > Windows users send QP encoded binaries to other Windows users. This
    > is probably by far the most common case, so breaking it causes
    > quite a bit of grief. But keep in mind that it will also cause
    > problems when Mac users send QP coded binaries to other Mac users.
    >
    > Since Windows is the most common (and quite possibly also the OS
    > with the most brain damaged mail clients), switching Anomy to Windows
    > newline conventions internally will decrease breakage quite a bit.
    >
    > I may also add logic to try and detect the operating system of the
    > mail sender from message headers, to automatically guess when to
    > use Mac or Unix newline conventions instead... I'm not sure. But
    > this all boils down to guesswork and making the best of a bad
    > situation - the only real solution is to get people to upgrade to
    > proper mail clients which don't QP encode binaries.
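
    The header-based guess described above might look roughly like the
    sketch below. It is not taken from Anomy: the HINTS table, the choice
    of the X-Mailer and User-Agent headers, and the function name
    guess_newline_convention are all assumptions, and the result is
    guesswork by design.

```python
from email import message_from_string
from email.message import Message

# Hypothetical substring hints, checked in order. "macintosh" comes first so
# that Mac editions of Windows-branded clients are counted as Mac.
HINTS = (
    ("macintosh", "mac"),
    ("outlook", "dos"),
    ("windows", "dos"),
    ("win32", "dos"),
    ("linux", "unix"),
    ("x11", "unix"),
)


def guess_newline_convention(msg: Message, default: str = "dos") -> str:
    """Guess the sender's newline convention from client-identifying headers."""
    agent = " ".join(
        str(msg.get(name, "")) for name in ("X-Mailer", "User-Agent")
    ).lower()
    for needle, convention in HINTS:
        if needle in agent:
            return convention
    # Fall back to the Windows convention, the most common case.
    return default


sample = message_from_string(
    "X-Mailer: Microsoft Outlook Express 6.00.2800.1158\n\nbody\n"
)
print(guess_newline_convention(sample))
```

    Falling back to "dos" mirrors the reasoning that Windows senders are
    the most common; a different default may suit other sites better.
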
    >
    >
    >>Anyway, I would love to see this code and the release that uses it. Good
    >>work and thanks!
    >
    >
    > I'll do my best to make it available as soon as possible. :)
    >
    > --
    > Bjarni R. Einarsson PGP: 02764305, B7A3AB89
    > 103070@xyz.molar.is -><- http://bre.klaki.net/
    >
    > Check out my open-source email sanitizer: http://mailtools.anomy.net/
    > Spammers, please send lots of mail to: 103104@xyz.molar.is
    >
    > Was I helpful? Let others know:
    > http://svcs.affero.net/rm.php?r=Juggler
    >
    >


