anomy-list

Re: A few more hints on using perl regexps.

From: Dave Cridland (29788@xyz.molar.is)
Date: Wed 09 Jan 2002 - 12:47:39 UTC


    On Mon, 2002-01-07 at 19:43, Bjarni R. Einarsson wrote:
    > On 2002-01-07, 14:33:07 (-0500), Brett Simpson wrote:
    > > What effect does the asterisk have as opposed to the first
    > > expression? Thanks.
    > >
    > > (?i)\.(exe)$
    > >
    > > and
    > >
    > > (?i)\.(exe)*$
    >
    > An asterisk matches zero-or-more of the preceding entity (be it a character,
    > wildcard or something enclosed in parentheses).
    >
    > So the first would match only "blah.exe", but the second would match
    > "blah." and "blah.exe" (and "blah.exe.exe.exe.exe.exe").

    Or, more accurately, "blah.exeexeexeexeexeexe": the asterisk applies only
    to the group "(exe)", and the dot isn't part of that preceding entity.
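
    To see the difference concretely, here is a small sketch. It uses
    Python's re module as a stand-in for Perl (an assumption made for this
    illustration; the two behave the same on these particular patterns),
    and matched_text is a hypothetical helper name, not part of Anomy.

```python
import re

# The two patterns quoted in the thread.
ONE_EXE = re.compile(r"(?i)\.(exe)$")    # a dot, then exactly one "exe"
STAR_EXE = re.compile(r"(?i)\.(exe)*$")  # a dot, then zero or more "exe"


def matched_text(pattern, name):
    """Return the part of `name` that `pattern` matched, or None."""
    m = pattern.search(name)
    return m.group(0) if m else None


for name in ["blah.exe", "blah.", "blah.exeexeexe", "blah.exe.exe", "blah.txt"]:
    print(f"{name!r:18} one={matched_text(ONE_EXE, name)!r:8} "
          f"star={matched_text(STAR_EXE, name)!r}")
```

    Note that "blah.exe.exe" is still caught by both patterns, but only
    because the final ".exe" matches on its own; the repetition itself only
    ever consumes back-to-back "exe" with no dots in between.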

    Note that Bjarni doesn't mention the commonly used ?, which means 0 or 1
    matches.

    So (?i)\.(exe|bat|pif)(\.(gz|bz\d?))?$ matches things like:

    blah.exe

    blah.exe.gz

    blah.exe.bz2

    blah.bat.bz

    blah.pif.gz

    (The "\d" means the same as "[0-9]" - any numeric character.)
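
    A quick way to check a pattern like this before putting it into a
    config is to run it over a list of sample filenames. The sketch below
    does that with Python's re module rather than Perl (an assumption for
    illustration); is_flagged is a hypothetical name. re.ASCII is passed so
    that "\d" really does mean "[0-9]" under Python 3, as described above.

```python
import re

# Pattern from the message: a dangerous extension, optionally followed by
# one compression suffix (.gz, .bz, or .bz plus a single digit).
DANGEROUS = re.compile(r"(?i)\.(exe|bat|pif)(\.(gz|bz\d?))?$", re.ASCII)


def is_flagged(name):
    """True if the filename ends in a pattern-matching extension."""
    return DANGEROUS.search(name) is not None


samples = ["blah.exe", "blah.exe.gz", "blah.exe.bz2", "blah.bat.bz",
           "blah.pif.gz", "BLAH.EXE", "blah.txt", "blah.exe.zip"]
for name in samples:
    print(f"{name:14} {'match' if is_flagged(name) else 'no match'}")
```

    Because the whole "(\.(gz|bz\d?))" group is followed by "?", it may
    appear once or not at all, so "blah.exe.gz.gz" would not match.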

    Note that a "." matches any *single* character, just to clarify Bjarni's
    explanation, and a regexp matching any sequence of characters can be
    written as ".*". A character can, incidentally, be anything: spaces,
    tabs, and indeed any ASCII code all count as characters. AFAIK, Anomy
    Sanitizer will only handle character sets that use 8 bits per
    character, which probably covers you for any email you get.
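
    The behaviour of "." and ".*" can be sketched as follows, again with
    Python's re module standing in for Perl (an assumption) and with
    hypothetical helper names. One caveat worth knowing: by default, in
    both Perl and Python, "." does not match a newline unless the
    single-line ("s") modifier is switched on.

```python
import re


def dot_matches(text):
    """True if `text` is exactly one character that "." accepts."""
    return re.fullmatch(r".", text) is not None


def any_sequence_matches(text):
    """True if ".*" accepts the whole of `text`."""
    return re.fullmatch(r".*", text) is not None


# Letters, spaces, tabs and 8-bit characters each count as one character.
for ch in ["a", " ", "\t", "\xe9"]:
    print(repr(ch), dot_matches(ch))

# "." needs exactly one character; ".*" accepts any run, including none.
print(dot_matches(""), dot_matches("ab"))
print(any_sequence_matches(""), any_sequence_matches("any thing\tat all"))

# The default exception: a newline is not matched by ".".
print(dot_matches("\n"))
```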

    Dave.


