Is there a regular expressions standard?
April 21, 2004 5:09 PM

Regular Expressions: Is there a standard?

Many programming languages and technologies support pattern matching in text using regular expressions. Is there a published industry standard somewhere for their syntax and implementation?
posted by normy to Computers & Internet (7 answers total)
 
Not only is there one standard, there's lots of them! *grin*.

The two most common are PCRE (Perl Compatible Regular Expressions; apart from Perl 5 itself, there's also a C library that handles these) and POSIX regcomp/regexec regular expressions.
posted by fvw at 5:25 PM on April 21, 2004


Oh, and POSIX regexes come both in the "old" and "extended" flavour. Why do you ask?
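
To make the flavour difference concrete, here is a small sketch using Python's re module, which follows the Perl-style syntax. Python can't run a POSIX "old" (basic) pattern directly, so this only shows how the basic-style spelling gets read by a Perl-style engine; the variable names are just for illustration.

```python
import re

# Perl-style (and POSIX "extended"): bare parentheses group, bare braces repeat.
perl_style = re.compile(r"(ab){2}")

# POSIX "old"/basic syntax spells the same idea as \(ab\)\{2\}.
# A Perl-style engine reads those backslashes as "treat this character
# literally", so the pattern now means something quite different.
basic_spelling = re.compile(r"\(ab\)\{2\}")

print(bool(perl_style.fullmatch("abab")))         # True
print(bool(basic_spelling.fullmatch("abab")))     # False
print(bool(basic_spelling.fullmatch("(ab){2}")))  # True
```

So the same characters can be a group in one flavour and a literal parenthesis in another, which is why it matters which flavour a given tool expects.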
posted by fvw at 5:27 PM on April 21, 2004


A good number of APIs allow you to choose what flavor of regular expression you want to use. There is enough variation among different implementations that even if you choose (for example) Perl-style regular expressions in two different libraries, it's not guaranteed that you'll be able to move a bunch of regexps between libraries with no effort.
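
One small example of that kind of variation, sketched with Python's re module (which uses Perl-style syntax): the \Z anchor. In Perl and PCRE, \Z matches at the end of the string or just before a final newline, and \z is the strict "very end only" anchor. In Python, \Z is the strict one. The sample text and variable names here are only for illustration.

```python
import re

text = "foo\n"

# $ may match just before a trailing newline, in Python as in Perl.
dollar_hit = re.search(r"foo$", text)

# In Python, \Z matches only at the very end of the string, so the
# trailing newline makes this fail. The same pattern matches in Perl.
big_z_hit = re.search(r"foo\Z", text)

print(bool(dollar_hit))  # True
print(bool(big_z_hit))   # False
```

A pattern copied from a Perl script into Python would silently change behaviour here rather than raise an error, which is what makes these differences easy to miss.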

As far as implementation is concerned, I assume (because it's reasonable) that all regular expression libraries build state machines to recognize the strings a regular expression describes. If you're interested in the theory behind how it's done, check out an introductory book on compilers.
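
As a rough illustration of the state machine idea, here is a hand-built deterministic machine for the pattern ab*c. It is a sketch, not how any particular library does it: the state names and the dfa_fullmatch function are made up for this example, and backtracking engines such as Perl's work differently from a pure table-driven machine.

```python
import re

# Hypothetical transition table for the pattern  ab*c
# (state, input character) -> next state
TRANSITIONS = {
    ("start", "a"): "seen_a",
    ("seen_a", "b"): "seen_a",   # loop on b, zero or more times
    ("seen_a", "c"): "done",
}
ACCEPTING = {"done"}


def dfa_fullmatch(text):
    """Return True if the whole of text is accepted by the machine."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:        # no transition: reject immediately
            return False
    return state in ACCEPTING


# Compare against the library's answer for a few inputs.
for sample in ["ac", "abbbc", "abcb", "", "abx"]:
    expected = re.fullmatch(r"ab*c", sample) is not None
    print(repr(sample), dfa_fullmatch(sample), expected)
```

The machine reads each character exactly once and never backs up, which is the property the compiler textbooks spend their time on.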

For information on how to use regular expressions there's an O'Reilly book called "Mastering Regular Expressions". I've never read the thing but I know it exists.
posted by rdr at 6:18 PM on April 21, 2004


For information on how to use regular expressions there's an O'Reilly book called "Mastering Regular Expressions". I've never read the thing but I know it exists.

i'm working my way through it now (the 2nd edition) and it is excellent. i highly recommend it.
posted by lescour at 7:13 PM on April 21, 2004


There is enough variation among different implementations that even if you choose (for example) Perl-style regular expressions in two different libraries, it's not guaranteed that you'll be able to move a bunch of regexps between libraries with no effort.

A-men. Playing with perl compatible regexps in PHP was an eye-opener. I lost hours to subtle gotchas.

The question is: why is this so? I mean, perl is open source, right? Why is it that you can't just grab the perl engine and be done with it?

there's an O'Reilly book called "Mastering Regular Expressions"

I cut my perl teeth on this book. Before I read it, I knew how to write C code that perl would parse. When I was done, I actually knew some perl. Not to mention something about regexps, and honestly, sometimes I'm not sure how people get by without them.
posted by weston at 7:49 PM on April 21, 2004


I wouldn't call Perl's regexes a "standard". Well, okay, it's a minor de facto standard. Note, however, that the new Perl will have a redesigned regex engine and interface.

I highly recommend O'Reilly's Mastering Regular Expressions. I'm hard pressed to think of any other computing tool that is as useful. So many things can be easily and quickly done with a well-crafted regex.
posted by Ethereal Bligh at 1:19 AM on April 22, 2004


just fyi - if anyone is interested in using the ideas behind regular expressions in a more general way, try googling for papers on "parser combinators". i'm working on a parser combinator library at the moment to parse a language and it's amazing how elegant the approach is (i have a basic set of combinators that are parameterised by two functions - join and erase tokens - and from that i can generate both (context independent) lexers and (ast constructing) parsers. sweet sweet code).
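
for anyone who hasn't seen the style, here is a minimal sketch of the general idea in Python. it is not the library described above (no join/erase parameters); the names char, seq, alt and many are just common conventions chosen for this example. each parser is a function taking (text, pos) and returning (value, new_pos) on success or None on failure, and combinators build bigger parsers out of smaller ones.

```python
def char(c):
    """Parser that accepts exactly the character c."""
    def parse(text, pos):
        if pos < len(text) and text[pos] == c:
            return c, pos + 1
        return None
    return parse


def seq(*parsers):
    """Run parsers one after another; succeed only if all succeed."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def alt(*parsers):
    """Try each parser in order; return the first success."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse


def many(parser):
    """Apply parser zero or more times, collecting the values."""
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None or result[1] == pos:  # stop on failure or no progress
                return values, pos
            value, pos = result
            values.append(value)
    return parse


# The regular expression ab*c written with combinators.
ab_star_c = seq(char("a"), many(char("b")), char("c"))

print(ab_star_c("abbc", 0))  # (['a', ['b', 'b'], 'c'], 4)
print(ab_star_c("abx", 0))   # None
```

because parsers are ordinary functions, a combinator can refer to itself through a wrapper, which is how this approach gets past what plain regular expressions can describe (nested brackets and so on).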
posted by andrew cooke at 6:10 AM on April 22, 2004

