Looking for good regex tutorial
February 4, 2016 2:16 PM

I would like to learn regex and am looking for a good tutorial. Brief details inside.

I will be honest, I have tried to learn it before and found it difficult. The tutorials I have looked at suffered from 2 or 3 of the following problems:

1) Easy at the start and then they jump off the deep end

2) Too hard, and/or expect you to understand Linux

3) No exercises, or maybe one or two, and sometimes these are difficult to use as there was no relevant data - e.g. one had an exercise to find emails in a large text document, but who has a large doc full of dummy emails to run it on?

I appreciate they are tricky and I am asking a lot. I need something that leads you from the beginning and gets slowly and progressively more intense with data and exercises to help you learn. Thanks for any advice.
posted by marienbad to Computers & Internet (13 answers total) 26 users marked this as a favorite
 
Here's a site with regex crossword puzzles. They're fun (if you're a geek) but it doesn't look like the "beginner/tutorial" ones actually attempt to teach you anything...they're just very simple.

https://regexcrossword.com/

Don't overthink regexps, though...the simplest regex is identical to a string compare, and the question is always "does /ABC/ match XYZ?"

(Obviously, /ABC/ doesn't match XYZ, but regex /XYZ/ does match XYZ. So there's lesson 1: if the letters are the same, they match.)
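
A minimal sketch of that first lesson, assuming Python's standard re module as the regex engine (the thread does not pick a language, and matches is a hypothetical helper name). The slashes in /ABC/ are only delimiters, so they are left out of the pattern here:

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None

print(matches("ABC", "XYZ"))  # False: the letters differ
print(matches("XYZ", "XYZ"))  # True: the letters are the same
```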
posted by spacewrench at 2:39 PM on February 4, 2016 [2 favorites]


I found the examples in this forum thread useful. I use the Bulk Rename Utility software to rename files. If you want to practice, you could install it (it is free) and then use some of the examples to preview how your file names would change if you applied certain regex patterns. You would not have to actually rename anything. The rest of that forum is good, too, but that thread stands out.
posted by soelo at 2:44 PM on February 4, 2016


I'll admit upfront that I read it so long ago that I don't remember if it did a good job of progressing from easy to hard, but I learned using the book Mastering Regular Expressions from O'Reilly and recommend it.

Building on what spacewrench said, the next important concept is that of metacharacters, which are basically a rich set of wildcards - similar to but much more powerful than the ubiquitous *. So /X\dZ/ would match X0Z, X1Z, X2Z, ..., X9Z. This is because the metacharacter \d matches any numeric digit. * means to match any number of something, so \d* would match any number of digits from 0-infinity.
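
As a rough illustration of \d, here is a short Python sketch (Python's re module is an assumption, and the candidate strings are made up for the example). It uses fullmatch so that the whole string has to fit the pattern:

```python
import re

pattern = re.compile(r"X\dZ")  # \d stands for exactly one digit here

# Made-up sample strings: three that fit, two that should not.
candidates = ["X0Z", "X5Z", "X9Z", "XAZ", "X12Z"]
matched = [s for s in candidates if pattern.fullmatch(s)]
print(matched)  # ['X0Z', 'X5Z', 'X9Z']
```

"XAZ" fails because A is not a digit, and "X12Z" fails because a bare \d covers only one digit.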
posted by duoshao at 3:33 PM on February 4, 2016 [1 favorite]


What little bit of REGEX I know I have learned from regexr.
posted by jmsta at 3:34 PM on February 4, 2016


It's important to know that there are lots of different "flavors" of regex -- once you get past the basics, different implementations support different features. The Perl implementation (also used in PHP) in particular has a bunch of features that may not work elsewhere.

If you want to practice without figuring out the command line, I suggest getting Sublime Text (cross-platform) and using the regex search there (tick the ".*" button by the search field). It'll helpfully highlight your results in real time.
posted by neckro23 at 4:02 PM on February 4, 2016


I've always found this site's organization and presentation of material very clear, and so I use it as my go-to reference. YMMV.

Also, seconding the advice to watch out for regex dialects. For instance, this is the kind of mistake I make when I'm in a hurry:
user@server:~$ grep -i "(t|r)ing" temp.txt
user@server:~$
when I actually meant to also specify the switch for the Extended Regex dialect:
user@server:~$ grep -iE "(t|r)ing" temp.txt
testing
spring
user@server:~$
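
For anyone without a shell handy, the same difference can be approximated in Python. This is only a sketch under assumptions: Python's re syntax treats ( ) and | as grouping and alternation, much like grep -E, while re.escape is used here to imitate the literal reading that plain grep gives those characters. The list stands in for temp.txt, whose full contents are not shown above.

```python
import re

# Hypothetical stand-in for temp.txt; only "testing" and "spring" come from the post.
lines = ["testing", "spring", "summer", "(t|r)ing"]

# Like grep -iE: parentheses group, | means "or".
extended = re.compile(r"(t|r)ing", re.IGNORECASE)
# Like plain grep -i: parentheses and | are taken literally.
literal = re.compile(re.escape("(t|r)ing"), re.IGNORECASE)

print([line for line in lines if extended.search(line)])  # ['testing', 'spring']
print([line for line in lines if literal.search(line)])   # ['(t|r)ing']
```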

posted by forthright at 4:47 PM on February 4, 2016


Best answer: I learned from RegexOne.
posted by kevinbelt at 7:07 PM on February 4, 2016 [1 favorite]


My favorite resource for regexes in general is txt2re.

It won't teach you anything directly, but by playing around with it, you can see how to make regexes out of text that you know and understand, and it lets you start simple and get more complex on your own terms.
posted by Dilligas at 7:33 PM on February 4, 2016


There's a bit of a gotcha in what duoshao said. * matches zero or more of whatever it is modifying, so in the case of X\d*Z, XZ would be matched. What you actually want if you want to require digits between the X and Z is \d+ which will match one or more digits.
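
A quick way to see the difference, sketched in Python (the re module and the sample strings are assumptions for illustration):

```python
import re

star = re.compile(r"X\d*Z")  # zero or more digits between X and Z
plus = re.compile(r"X\d+Z")  # one or more digits between X and Z

for text in ["XZ", "X7Z", "X2016Z"]:
    # Prints the string, then whether * matched it, then whether + matched it.
    print(text, bool(star.fullmatch(text)), bool(plus.fullmatch(text)))
```

Only the "XZ" row differs: the * version accepts it and the + version rejects it.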

One of the unfortunate things about regexes in general is that there are different dialects. Perl's regexes are a bit different from grep, for example. Thankfully, the basics are the same between them.
posted by wierdo at 8:10 PM on February 4, 2016


Perl regexes are probably the most common dialect. With GNU grep, you can say "grep -P …" to make grep use the Perl dialect. The Perl documentation for regexes is pretty good.
posted by monotreme at 10:07 PM on February 4, 2016


Regexes are a difficult beast under the best circumstances, and honestly, I try to avoid using them in programs when I can. I don't really have any suggestions about exercises or tutorials, but there are a few things that have helped me:

Visibone Regexp quick reference - This is for the JavaScript flavor of regex, which is similar enough to Perl-style to be at least somewhat applicable. The nice thing about this reference is that it shows you how the various pieces of syntax can fit together. It's not comprehensive.

Regexr (already linked above) - This regex builder visualizes the various parts of the regex and shows matches in real time. While it's not a strict replacement for exercises, if you have a specific problem you're trying to solve with regexes, it's invaluable to have a tool like this that can help you verify the regex.

The Regular Expressions Cookbook focuses on examples of things you can do with regular expressions. In general, I've found that starting with a specific problem to solve and then working to understand how a given regex solves that problem does a lot more to teach me.

In the end, though, there is no real substitute for practice.
posted by Aleyn at 1:02 AM on February 5, 2016


Best answer: I am where you are.

This article is aimed at us:
http://recompilermag.com/issues/issue-0/beginning-with-regular-expressions/

There are more than a dozen iApps offering regex testers, and I bet the same is true for other platforms (but they're truly useful only after you've decided how you wish to modify your text.)

The Mac word processor "Nisus" offers a regex learning mode: you choose items like "any lowercase character" from a menu, and it pastes the relevant patterns into the search/replace box.
posted by Jesse the K at 6:35 AM on February 5, 2016


"Regexes are a difficult beast under the best circumstances, and honestly, I try to avoid using them in programs when I can." -- Aleyn
I cannot agree with this sentiment, at all. Regexes are a wonderfully expressive and efficient tool that make quick and simple work of many, many problems that would be difficult or time-consuming to solve without them.

I use regexes every day, in many different languages. Every search-and-replace I do in my text editor means writing an s/// command. (The dialect differences are annoying, but you will eventually sort out the different implementations.)
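
For readers who have not met s/// yet, a rough Python equivalent is re.sub. This is a sketch, not what any particular editor does: the sample text and the date-reordering task are made up, and the exact s/// syntax for groups and backreferences varies by tool.

```python
import re

# Made-up sample text containing two ISO-style dates.
text = "posted on 2016-02-04 and edited on 2016-02-05"

# Capture year, month, day, then write them back in day.month.year order.
# \1, \2, \3 in the replacement refer to the three parenthesized groups.
result = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3.\2.\1", text)
print(result)  # posted on 04.02.2016 and edited on 05.02.2016
```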

I have often thought, "Damn! If only this crappy language had regexes available, this would be so simple! Now, how the hell am I going to do it without them... grr, this is going to take an hour to code. ...I wonder if there's a 3rd-party regex module I can install..."

I have never thought, "Damn, now I have to use regexes again :( This would be so much easier with an XML parser, or sscanf()."

If you're not on-board with the above, then I strongly encourage you to make a real effort toward learning about and practicing regexes. You will thank yourself. You will save yourself a lot of time. And you will come to love regexes as much as I do ;)

As for practical advice... If you think of regexes as difficult and impenetrable, perhaps it would help to start with something very similar but also much simpler:
Lua Patterns are not regexes, and have some serious shortcomings by comparison, but they do work very similarly, and should be easier to approach while you get acclimated.

posted by teatime at 3:49 AM on February 6, 2016

