With some slight modification to the Anomy Sanitizer code, it would be
possible to add customized extensions without patching the Sanitizer
directly. This would make my job *much* easier, as I would not have to
re-hack the scripts whenever I wanted to upgrade to the latest
version of the Sanitizer. (Or make my boss's job easier, since I will
probably be in grad school rather than here in 6 months.)
What I propose would require minor changes to ReadSanitizerConfigLine so
that an extension could have its own configuration directives in the
sanitizer.cfg file, plus a small change to Sanitizer.
First, we need to add a configuration directive for loading these custom
extensions. We could start with a set of directives similar to the
file_list directives. The first could be named "sanitizer_ext" and would
give the number of extensions in the file. It would be followed by
several "sanitizer_ext_#" directives, each set to a string formatted as
"name:module", where name is the internal name of the extension and the
prefix of any custom directives, and module is the name of the Perl
module to load. ReadSanitizerConfigLine could then accept any of the
normal directives, plus the "var_" directives, plus the "ext_"
directives for custom extensions. Each extension would then use the
directives starting with "ext_name_" (with its own name filled in) for
its own customized purposes.
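
To make the directive scheme concrete, here is a rough sketch of how such
lines might be sorted out. It is written in Python rather than Perl only
to keep it short and runnable, and it is not the real
ReadSanitizerConfigLine. The extension names (score, audit), the module
names, the "ext_score_threshold" style settings, and the assumption that
config lines look like "key = value" are all made up for illustration.

```python
import re

# Hypothetical sanitizer.cfg fragment using the proposed directive names.
SAMPLE_CONFIG = """\
# two custom extensions
sanitizer_ext = 2
sanitizer_ext_1 = score:Local::ScoreExt
sanitizer_ext_2 = audit:Local::AuditExt
ext_score_threshold = 5
ext_audit_logfile = /var/log/audit.log
feat_log_inline = 1
"""


def parse_extension_directives(text):
    """Split config text into (extensions, per-extension settings, other)."""
    count = 0
    numbered = {}   # index -> (name, module)
    pending = []    # "ext_..." directives, matched to extensions afterwards
    other = {}      # everything else is left for the normal config handling

    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a directive: {raw!r}")
        key, value = key.strip(), value.strip()

        if key == "sanitizer_ext":
            count = int(value)
        elif re.fullmatch(r"sanitizer_ext_\d+", key):
            # Split on the first ":" only, so "Local::ScoreExt" stays intact.
            name, sep, module = value.partition(":")
            if not (sep and name and module):
                raise ValueError(f"{key} must look like name:module")
            numbered[int(key.rsplit("_", 1)[1])] = (name, module)
        elif key.startswith("ext_"):
            pending.append((key, value))
        else:
            other[key] = value

    # Only the first `count` numbered entries are used, in order.
    extensions = [numbered[i] for i in range(1, count + 1)]
    settings = {name: {} for name, _ in extensions}

    for key, value in pending:
        # Try longer names first so "ext_a_b_x" prefers extension "a_b" over "a".
        for name in sorted(settings, key=len, reverse=True):
            prefix = f"ext_{name}_"
            if key.startswith(prefix):
                settings[name][key[len(prefix):]] = value
                break
        else:
            raise ValueError(f"{key} does not belong to a declared extension")

    return extensions, settings, other


extensions, settings, other = parse_extension_directives(SAMPLE_CONFIG)
```

Rejecting an "ext_" directive that matches no declared extension is a
design choice of this sketch; silently ignoring it would also be
defensible.
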
Next, upon startup, Sanitizer should load any external Perl modules
defined as extensions. I'm not sure how the semantics of passing in
information ought to work, but somehow the extensions will need access
to the logger and other Sanitizer variables.
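
One possible answer to the "how do we pass information in" question,
again only as a sketch in Python with made-up names: the core loads each
module by name and calls an agreed entry point (here a hypothetical
init function), handing it a small context object that carries the
logger and that extension's own settings. In Perl the same shape would
presumably be a require followed by a call into the module, but the
entry point name and the contents of the context are assumptions.

```python
import importlib
import sys
import types


class ExtensionContext:
    """What the core hands to an extension at startup (hypothetical)."""

    def __init__(self, log, settings):
        self.log = log            # callable taking one message string
        self.settings = settings  # this extension's own ext_<name>_* values


def load_extensions(extensions, settings, log):
    """Import each (name, module) pair and call its init(ctx) entry point."""
    loaded = {}
    for name, module_name in extensions:
        module = importlib.import_module(module_name)
        ctx = ExtensionContext(log, settings.get(name, {}))
        loaded[name] = module.init(ctx)
    return loaded


# Stand-in for an extension that would normally live in its own file.
def _demo_init(ctx):
    threshold = int(ctx.settings["threshold"])
    ctx.log(f"score extension ready, threshold={threshold}")
    return {"threshold": threshold}


demo_module = types.ModuleType("demo_score_ext")
demo_module.init = _demo_init
sys.modules["demo_score_ext"] = demo_module  # makes import_module find it

messages = []
loaded = load_extensions(
    [("score", "demo_score_ext")],
    {"score": {"threshold": "5"}},
    messages.append,
)
```

Passing one context object rather than a long argument list means more
Sanitizer variables could be exposed later without changing every
extension's entry point.
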
With these changes now in place, an extension can be written that hooks
log entries and can perform actions based upon those events. For now,
this seems to me to be about all that would be available to an
extension, but that ability alone would permit a lot of nice
customizations! Instead of building the scoring system into the
Sanitizer directly, it could be added as a module. This would require a
small amount of extra work at startup, but it would spare users the
overhead of features they aren't using.
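
As an illustration of scoring done as an extension on top of log hooks,
here is a minimal sketch, again in Python. The logger class, the
add_hook method, the event names, and the weights are all invented for
the example and are not the Sanitizer's actual log hook interface.

```python
class HookingLogger:
    """Toy logger that lets callers register callbacks per event name."""

    def __init__(self):
        self._hooks = {}
        self.entries = []

    def add_hook(self, event, callback):
        self._hooks.setdefault(event, []).append(callback)

    def log(self, event, detail):
        self.entries.append((event, detail))
        # Events nobody hooked cost only a dictionary lookup.
        for callback in self._hooks.get(event, ()):
            callback(event, detail)


class ScoreExtension:
    """Adds up a weight for each hooked event (hypothetical scoring rule)."""

    def __init__(self, logger, weights, threshold):
        self.weights = weights
        self.threshold = threshold
        self.score = 0
        for event in weights:
            logger.add_hook(event, self.on_event)

    def on_event(self, event, detail):
        self.score += self.weights[event]

    @property
    def over_threshold(self):
        return self.score >= self.threshold


logger = HookingLogger()
scorer = ScoreExtension(
    logger,
    {"defanged_html": 2, "renamed_attachment": 4},  # made-up event names
    threshold=5,
)

logger.log("defanged_html", "<script> tag")
logger.log("parsed_part", "text/plain")          # not hooked, so not scored
logger.log("renamed_attachment", "invoice.exe")
```

With the scorer left out, the logger does no scoring work at all, which
is the saving described above.
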
I suppose that the other possible action an extension might take is to
register custom parsers for scanning the email, but I don't know if that
would be a good idea since that's messing with the core of the sanitizer.
Anyway, that's just an idea of mine. If you think it's a good one you
might use, I could help with the initial coding--I'm pretty sure I can
justify it to my boss. If not, I guess my boss will probably just find a
version and stick to it.
I really like the new log hooks now that I understand how they work.
Thanks a lot for the help.
Cheers,
Sterling