Re: Defanging HTML, pros and cons

From: mark david mcCreary (
Date: Tue 22 Aug 2000 - 12:37:57 UTC

>Currently the only serious problem with the HTML defanger is that
>it is a little too sensitive and may defang stuff that isn't
>strictly HTML - all text/* parts are scanned for HTML. This is
>actually a relatively complex issue, since it's so hard to tell
>what mail readers will interpret and what they won't - currently
>the sanitizer is pretty sensitive/strict, since that is the safest
>way to do it. But it is also the most disruptive for normal use.


I would prefer to be conservative and defang too much rather than not enough.

With mailing lists, I want to defang the HTML but not reject it (in some

Therefore, I saw the need to run your sanitizer as a separate pass for
defanging HTML, with the bad score set very high, so that the email will
not be rejected.
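
For illustration only, here is a minimal Python sketch of the "defang but never reject" idea. It is not the sanitizer's actual code or configuration: the tag list, the DEFANGED_ prefix, the one-point-per-tag scoring, and the names defang(), process() and reject_at are all assumptions made up for this example.

```python
import re

# Hypothetical list of tags treated as risky; the real sanitizer's rules differ.
RISKY_TAGS = ("script", "iframe", "object", "embed")
TAG_RE = re.compile(r"<(/?)(%s)\b" % "|".join(RISKY_TAGS), re.IGNORECASE)


def defang(html):
    """Rename risky tags so a mail reader will not interpret them.

    Returns the rewritten text and a score (one point per renamed tag).
    """
    score = 0

    def rename(match):
        nonlocal score
        score += 1
        # Keep the optional "/" so opening and closing tags stay paired.
        return "<%sDEFANGED_%s" % (match.group(1), match.group(2))

    return TAG_RE.sub(rename, html), score


def process(html, reject_at):
    """Defang the text, then reject only if the score reaches reject_at."""
    cleaned, score = defang(html)
    return cleaned, score >= reject_at


# With a very high threshold the message is rewritten but never rejected.
cleaned, rejected = process("<script>alert(1)</script>", reject_at=10**6)
```

With reject_at set far above any score a real message could reach, the pass still rewrites the risky tags but always lets the mail through, which is the behavior wanted for a mailing list.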

>At the moment I'm considering adding a "feat_html_strict" variable,
>which would allow people to choose whether to HTML-sanitize all
>text parts (the current behavior) or only parts which either have
>the right MIME type or contain recognizable HTML tags near the top
>of the file.
>A third option would be to only HTML-sanitize inline HTML stuff,
>leaving attachments alone. This would be pretty insecure, but
>still better than nothing for lists where people are actually
>swapping complex HTML files.
>Does this sound like a good compromise?

That sounds good to me.
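
To make the three proposed behaviors concrete, here is a rough Python sketch of how the choice might be expressed. Everything in it is hypothetical: the mode names, the Part class, the 512-character window for "near the top", and the tag pattern are assumptions for illustration, not the sanitizer's real options. It also assumes the third option means inline parts that look like HTML.

```python
import re
from dataclasses import dataclass

# Hypothetical mode names; the post only proposes a "feat_html_strict" variable.
ALL_TEXT = "all_text"        # current behavior: scan every text/* part
HTML_ONLY = "html_only"      # right MIME type, or HTML tags near the top
INLINE_ONLY = "inline_only"  # as HTML_ONLY, but attachments are left alone

# A few tags that suggest the part is really HTML (assumed, not exhaustive).
HTML_HINT = re.compile(
    r"<\s*(!doctype|html|head|body|script|table|a\s|img\s)", re.IGNORECASE
)


@dataclass
class Part:
    content_type: str  # e.g. "text/plain"
    disposition: str   # "inline" or "attachment"
    body: str


def looks_like_html(part, window=512):
    """True if the MIME type is text/html or HTML tags appear near the top."""
    if part.content_type.lower() == "text/html":
        return True
    return bool(HTML_HINT.search(part.body[:window]))


def should_sanitize(part, mode):
    """Decide whether a MIME part gets the HTML sanitizing pass."""
    if not part.content_type.lower().startswith("text/"):
        return False
    if mode == ALL_TEXT:
        return True
    if mode == HTML_ONLY:
        return looks_like_html(part)
    if mode == INLINE_ONLY:
        return part.disposition == "inline" and looks_like_html(part)
    raise ValueError("unknown mode: %r" % (mode,))
```

Under this sketch a plain-text part with no tags is only touched in the strictest mode, while an attached text/html file is skipped only in the inline-only mode, which matches the trade-off described in the quoted text.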



