Help with regex exp. for Yahoo Pipes
April 3, 2009 4:40 PM

I'm a librarian who uses Yahoo Pipes with Amazon's RSS feeds for new books to decide which books to order. The title of each item begins with a ranking, as in this example: "#1: A. Lincoln: A Biography". I would like to add a regular expression that strips the ranking (#1: or #10:) from the title so that I can sort the book titles. What I want to remove is the # sign, the ranking number (one or two digits), and the colon. Any help would be appreciated.
posted by jingo74 to Computers & Internet (4 answers total)
 
#\d+: would find that pattern if that helps…
posted by Ginkgo at 4:55 PM on April 3, 2009


You might try adding a regex filter where you choose the item title and replace ^#\d{1,2}: with nothing. That should match the number sign followed by one or two digits and a colon at the beginning of the title.
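
As a rough illustration outside of Pipes (this uses Python's re module, not the Pipes regex module, and every sample title except the first is made up), the anchored pattern removes only a leading ranking. It also leaves the space after the colon in place, so appending \s* to the pattern may be worth trying:

```python
import re

# Sample titles; only the first comes from the question, the rest are made up.
titles = [
    "#1: A. Lincoln: A Biography",
    "#10: Example Title",
    "No Ranking #5: Still Here",
]

RANK = re.compile(r"^#\d{1,2}:")         # the pattern as suggested
RANK_WS = re.compile(r"^#\d{1,2}:\s*")   # same, plus any whitespace after the colon

def strip_rank(title, pattern=RANK_WS):
    """Remove a leading '#N:' ranking from a title, if present."""
    return pattern.sub("", title, count=1)

for t in titles:
    # Left: result with the suggested pattern; right: with trailing whitespace removed too.
    print(repr(RANK.sub("", t)), repr(strip_rank(t)))
```

Because of the ^ anchor, a "#5:" that appears in the middle of a title is left alone.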
posted by pb at 4:57 PM on April 3, 2009


I don't know Yahoo Pipes, but I can tell you how I would do that with Perl's regular expressions, and I'd do it like this:

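use strict;
use warnings;
my $title = "#1: A. Lincoln: A Biography";
$title =~ s/^#\d+:\s*//;    # strip the leading "#<digits>:" and any whitespace after it
print "$title\n";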

In short, this says: find the pattern ^#\d+:\s* (i.e. start of the string, # sign, one or more digits, colon, zero or more whitespace characters) and replace it with nothing.

Unfortunately, I don't have a Yahoo account, and I can't find much documentation on Yahoo Pipes regular expressions.
posted by Mike1024 at 5:42 PM on April 3, 2009


If you can't do replacement but grouping works (I, too, am ignorant of Pipes), /^#\d+:\s*(.*)/ should give you everything after the colon and adjacent whitespace (the title, in your example) as $1 or the Yahoo equivalent.
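
A small sketch of the grouping approach, written in Python rather than Pipes (the function name title_after_rank is hypothetical, and how Pipes exposes captured groups is not shown here): match the whole title and keep only group 1.

```python
import re

# Leading "#<digits>:", optional whitespace, then capture the rest of the title.
PATTERN = re.compile(r"^#\d+:\s*(.*)")

def title_after_rank(title):
    """Return the text after a leading '#N:' ranking, or the title unchanged."""
    m = PATTERN.match(title)
    return m.group(1) if m else title

print(title_after_rank("#1: A. Lincoln: A Biography"))
```

The same function could serve as a sort key, e.g. sorted(titles, key=title_after_rank), if the goal is to order items by title while leaving the ranking in the displayed text.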
posted by dreadpiratesully at 8:09 PM on April 3, 2009

