Hi all,
I'm sorry I haven't been able to participate in the QP thread going
on recently - I had a business trip abroad last week and had very
limited access to email during the trip - and for about a month
before it I was swamped with preparations.
I wish I had spoken up sooner though, since I probably could have
saved y'all some time and effort - because right before getting
swamped with other work I did quite a bit of hacking on this very
problem.
First of all - because of ambiguities in the QP spec, there is no
algorithm for decoding/encoding which will always result in the
output being identical to the input, short of simply storing the
entire input and using it verbatim. No matter how you tweak the
newline handling of Anomy, you're never going to get it right!
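
A quick way to see the ambiguity (a sketch in Python using the standard binascii module, not Anomy's Perl code): several different QP spellings decode to the same bytes, so a re-encoder that only sees the decoded bytes can reproduce at most one of them.

```python
import binascii

# Three different quoted-printable spellings of the same two bytes.
variants = [
    b"AB",        # literal characters
    b"=41B",      # 'A' written as a hex escape
    b"A=\r\nB",   # soft line break between the two characters
]

# Every spelling decodes to the same payload.
decoded = [binascii.a2b_qp(v) for v in variants]
all_same = all(d == b"AB" for d in decoded)

# The encoder is handed only b"AB"; whatever it emits can match
# at most one of the original spellings.
reencoded = binascii.b2a_qp(b"AB")
matches = [v == reencoded for v in variants]
```

Which spelling the sender picked is lost at decode time unless the original text is kept around.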
My strategy was basically to emulate the Windows CRLF conventions
during rewriting, since all the broken clients I know of which are
encoding binaries using QP are on Windows. (For non-binaries, which
newline convention Anomy uses internally doesn't matter.) I have
working code which does this, and it does seem to help.
This would still result in corruption of binary files QP encoded
using Mac or Unix newline conventions, but such files should be far
rarer and this behavior should cause far fewer problems. The
current Anomy release uses unix-newlines internally, because that's
what the CPAN perl module does. This not only causes binary
corruption, but it has security implications as well, which is why
I'm working on changing the behavior.
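
To illustrate why the internal newline convention matters for binaries, here is a minimal sketch in Python. The decode_qp function and the sample bytes are made up for illustration and are not Anomy's actual code. It assumes a broken client left a CR LF pair from the binary as a raw line break in the QP text instead of escaping it as =0D=0A.

```python
import re

def decode_qp(encoded: bytes, newline: bytes) -> bytes:
    """Decode quoted-printable, emitting `newline` for each hard line break.

    Hypothetical helper: trailing-whitespace rules and malformed escapes
    are deliberately ignored to keep the sketch short.
    """
    out = []
    lines = re.split(rb"\r\n|\r|\n", encoded)
    for i, line in enumerate(lines):
        soft = line.endswith(b"=")          # "=" at end of line: soft break
        if soft:
            line = line[:-1]
        # Replace each =XX escape with the byte it stands for.
        line = re.sub(rb"=([0-9A-Fa-f]{2})",
                      lambda m: bytes([int(m.group(1), 16)]), line)
        out.append(line)
        if not soft and i < len(lines) - 1:
            out.append(newline)             # hard break: convention decides
    return b"".join(out)

# Made-up binary fragment containing a CR LF pair.
original = b"MZ\x90\x00\r\n\x03\x00"
# What a broken client might send: the CR LF left as a raw line break.
encoded = b"MZ=90=00\r\n=03=00"

windows_style = decode_qp(encoded, b"\r\n")   # emulate Windows convention
unix_style = decode_qp(encoded, b"\n")        # unix newlines internally
```

With the Windows convention the bytes survive; with unix newlines the CR is dropped and the binary is one byte short.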
Hopefully I should be able to release a new version of Anomy
sometime this week for you to test.
Regarding Dave Cridland's earlier message, that Anomy is buggy and
shouldn't be decoding/recoding attachments which it isn't going to
modify... well, that would be true if Anomy knew ahead of time if it
was going to modify a file. But the fact is, that it doesn't always
know this ahead of time and since the program is expected to process
its input as a stream (not a file it can seek back and forth in)
there's not much which can be done about this.
Those of us who are virus-scanning all attachments
don't know whether files are clean or not until they've been decoded
and scanned - and in order to avoid the QP problem in such cases, it
would be necessary to store the encoded attachment on disk somewhere
as well, and re-inject that into the processing stream instead of
re-encoding.
Considering the additional I/O involved in doing that, I've decided
that it isn't worth the effort. It costs about the same in
performance, is far simpler, and is nearly as effective to simply
store the entire undecoded message and resend it completely
unmodified if Anomy doesn't make any "critical" changes.
This strategy however is best done at the next level up (by the
tools which invoke Anomy), not within the Anomy code itself.
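
As a rough sketch of what such a wrapper could look like (Python, with made-up names; the sanitize callable stands in for however the invoking tool runs Anomy and learns whether a "critical" change was made, which is an assumption here and not an existing Anomy interface):

```python
from typing import Callable, Tuple

# Hypothetical signature: takes the raw message, returns the rewritten
# message plus a flag saying whether any "critical" change was made.
Sanitizer = Callable[[bytes], Tuple[bytes, bool]]

def filter_message(raw: bytes, sanitize: Sanitizer) -> bytes:
    """Return the rewritten message only if it was critically changed.

    The wrapper keeps the entire undecoded message; when nothing
    critical changed, the stored original goes out byte for byte,
    so no QP re-encoding differences can creep in.
    """
    rewritten, critical = sanitize(raw)
    return rewritten if critical else raw

# Two stand-in sanitizers for illustration only.
def harmless(raw: bytes) -> Tuple[bytes, bool]:
    # Pretend the re-encoding altered newlines but nothing was removed.
    return raw.replace(b"\r\n", b"\n"), False

def defanging(raw: bytes) -> Tuple[bytes, bool]:
    # Pretend a dangerous attachment was renamed.
    return raw.replace(b"evil.exe", b"evil.exe.defanged"), True

message = b"Subject: hi\r\n\r\nattachment: evil.exe\r\n"
passed_through = filter_message(message, harmless)
rewritten = filter_message(message, defanging)
```

The cost is holding one extra copy of each message until the sanitizer has finished.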
So... the problem is probably never going to be solved completely
within Anomy. However, I really do expect the fixes I mentioned
above to make a big difference, so please help me test them as soon
as I have time to get them released. :-)
-- 
Bjarni R. Einarsson    PGP: 02764305, B7A3AB89
102013@xyz.molar.is -><- http://bre.klaki.net/
Check out my open-source email sanitizer: http://mailtools.anomy.net/
Spammers, please send lots of mail to: 102177@xyz.molar.is
Was I helpful? Let others know: http://svcs.affero.net/rm.php?r=Juggler