Perl regular expression question
December 21, 2005 2:04 PM
Perl regular expression question inside. Trying to parse a list of items...
Ok, I have a file listing things formatted like this:
Items foobar:
a1
a2
a3
a4
Items foobaz:
b1
b2
b3
I'm trying to use a regular expression to determine whether a given item comes after foobar or after foobaz. I'm doing something like this:
$variable = "b3";
$text_of_file =~ /Items (\S+?):.*?$variable/s;
print "$1\n";
I figured that adding a ? after the * to make it non-greedy would mean that it would print "foobaz", but unfortunately it's printing "foobar".
Can someone suggest a better way to do this? It occurred to me that I could split the list up into sections using something like:
@sections = split(/foob\S\S/, $text_of_file);
but that seemed like a lame hack, and it seems like you should be able to easily do this using a regex.
posted by pornucopia to computers & internet (8 comments total)
For your simple example, changing .* to [^I]* does the trick, but offhand I don't know how to extend this from excluding a single character to excluding a whole string. I'd probably just match the variable, then take $PREMATCH and look for the last Items header in it, though I remember reading that using $PREMATCH is a bad idea, performance-wise.
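As a rough sketch of both ideas, assuming the file looks exactly like the sample in the question: the first helper excludes a whole string by putting a negative lookahead in front of every character the lazy dot consumes, and the second finds the item first and then picks the last header in the text before it. The helper names section_by_lookahead and section_by_prefix are made up for illustration, and the second one uses substr with $-[0] in place of $PREMATCH.

```perl
use strict;
use warnings;

# Sample data copied from the question.
my $text_of_file = <<'END';
Items foobar:
a1
a2
a3
a4
Items foobaz:
b1
b2
b3
END

# Hypothetical helper: one regex with a negative lookahead.
# (?:(?!^Items ).)*? consumes characters lazily but refuses to step
# onto the start of another "Items " header line.
sub section_by_lookahead {
    my ($text, $item) = @_;
    return $1 if $text =~ /^Items (\S+):\n(?:(?!^Items ).)*?^\Q$item\E$/ms;
    return;
}

# Hypothetical helper: find the item, then take the last header before it.
# $-[0] is the offset where the match began, so substr() yields the text
# that precedes the match without touching $PREMATCH.
sub section_by_prefix {
    my ($text, $item) = @_;
    return unless $text =~ /^\Q$item\E$/m;
    my $before  = substr($text, 0, $-[0]);
    my @headers = $before =~ /^Items (\S+):$/mg;
    return $headers[-1];
}

print section_by_lookahead($text_of_file, 'b3'), "\n";
print section_by_prefix($text_of_file, 'b3'), "\n";
```

The \Q...\E around the item keeps any regex metacharacters in it from being interpreted, and the ^ and $ anchors with /m stop "b3" from matching inside a longer line such as "b33".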
While you call the split approach a lame hack, I wouldn't really dismiss it - it may well be faster than the complicated regex approach, and maintaining such code is mostly easier than peeling apart giant monster regexes.
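For what it's worth, the split version need not hard-code foob\S\S. Here is a minimal sketch, assuming every section starts with a line of the form "Items NAME:"; the helper name section_by_split is invented for this example. Putting capturing parentheses in the split pattern makes split return the header names along with the section bodies.

```perl
use strict;
use warnings;

# Sample data copied from the question.
my $text_of_file = <<'END';
Items foobar:
a1
a2
a3
a4
Items foobaz:
b1
b2
b3
END

# Hypothetical helper: split the text on header lines and scan each body.
sub section_by_split {
    my ($text, $item) = @_;

    # The captured header name is returned between the pieces, giving
    # (leading text, name, body, name, body, ...). The first element is
    # whatever precedes the first header, so it is discarded.
    my (undef, @parts) = split /^Items (\S+):\n/m, $text;

    while (my ($name, $body) = splice(@parts, 0, 2)) {
        # split drops trailing empty fields, so a final empty section
        # can leave $body undefined.
        my @items = split /\n/, (defined $body ? $body : '');
        return $name if grep { $_ eq $item } @items;
    }
    return;
}

print section_by_split($text_of_file, 'b3'), "\n";
```

Comparing items with eq rather than another regex means the item is matched as a whole line, with no escaping needed.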
If you care about performance, I recommend experimenting with Benchmark.pm to get real numbers, though tuning the regex is unlikely to rescue a program that reads its data from text files line by line.
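A comparison along those lines might look like the sketch below. It assumes the tiny sample from the question as input, which is far too small to say anything about a real file, and the two candidate subs (by_lookahead and by_split) are illustrative names, not anything from your program. cmpthese is part of the core Benchmark module; a negative count asks it to run each candidate for at least that many CPU seconds.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample data copied from the question; substitute a realistic file.
my $text_of_file = <<'END';
Items foobar:
a1
a2
a3
a4
Items foobaz:
b1
b2
b3
END

# Candidate 1 (hypothetical): single regex with a negative lookahead.
sub by_lookahead {
    my ($text, $item) = @_;
    return $1 if $text =~ /^Items (\S+):\n(?:(?!^Items ).)*?^\Q$item\E$/ms;
    return;
}

# Candidate 2 (hypothetical): split on headers, then compare lines.
sub by_split {
    my ($text, $item) = @_;
    my (undef, @parts) = split /^Items (\S+):\n/m, $text;
    while (my ($name, $body) = splice(@parts, 0, 2)) {
        my @items = split /\n/, (defined $body ? $body : '');
        return $name if grep { $_ eq $item } @items;
    }
    return;
}

# Run each candidate for at least one CPU second and print a
# comparison table of rates.
cmpthese(-1, {
    lookahead  => sub { by_lookahead($text_of_file, 'b3') },
    split_scan => sub { by_split($text_of_file, 'b3') },
});
```

The numbers depend heavily on input size and Perl version, so it is worth rerunning this against the actual file before drawing conclusions.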
posted by themel at 2:28 PM on December 21, 2005