I need to write a macro to reformat daily reports, which can't be changed upstream, using regex and global replace.
March 15, 2010 5:49 PM

Every day at work we print a report which runs anywhere from 7 to 25 pages. We print this every day of the year except for the yearly holidays.

Thirteen other departments or branches print their own versions of the reports; three or four will be as long or longer and the rest will most likely be shorter.

The reports are needlessly verbose in about a dozen different ways.

I can figure out how to get most of the problems corrected in Notepad++, using global replace with the normal, extended, or regex option as necessary. Unfortunately I have to do these replacements one by one each day, as I can't get Notepad++ to record any of the global replaces in a macro that actually works. Instead, it records a macro that, when you tell it to run, does exactly nothing.

The result is either a) a lot of wasted staff time or b) a lot of wasted paper.

Is there a text editor I could use to write a sequence of global replace commands that could be saved as a macro?

Ideally the result would be that John Q. Newbie doesn't have to do anything more than dump the report into the program and click a button to have it magically cleaned up.
posted by johnofjack to Computers & Internet (14 answers total)
 
Oh yeah: it would have to be a free (as in beer) text editor that works on Windows.
posted by johnofjack at 5:51 PM on March 15, 2010


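Create a script using command line tools. perl -i.bak -pe 's/regex1/replace1/ig; s/regex2/replace2/ig;' filename.ext is all you need to apply as many regexes to filename.ext in place as you want (renaming the original as .bak). On the Windows command prompt, wrap the script in double quotes rather than single quotes, since cmd.exe does not treat single quotes as quoting. Perl is free and there are multiple distributions for Windows.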
posted by Rhomboid at 6:36 PM on March 15, 2010
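
For anyone without Perl on hand, the same idea (a fixed sequence of regex replacements applied to a file, keeping a .bak copy) can be sketched in Python. This is a minimal sketch, not something anyone in the thread ran: the two entries in RULES are hypothetical placeholders for the real patterns, the function names are made up for illustration, and unlike perl -pe it processes the whole file at once rather than line by line.

```python
import re
import shutil
from pathlib import Path

# Hypothetical rules: (pattern, replacement) pairs standing in for the real ones.
RULES = [
    (r"[ \t]+$", ""),     # strip trailing whitespace on each line
    (r"\n{3,}", "\n\n"),  # collapse runs of blank lines into one blank line
]

def apply_rules(text, rules=RULES):
    """Apply each replacement in order, like chained s/.../.../ig commands."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text,
                      flags=re.IGNORECASE | re.MULTILINE)
    return text

def rewrite_in_place(path):
    """Copy the original to <name>.bak, then overwrite the file with cleaned text."""
    path = Path(path)
    shutil.copyfile(path, path.with_name(path.name + ".bak"))
    path.write_text(apply_rules(path.read_text()))
```

Calling rewrite_in_place("filename.ext") would leave the untouched original in filename.ext.bak, which makes it easy to compare before and after while the patterns are still being tuned.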


Oh this must just make the baby-larry-wall cry....

I think you'll have better luck digging down and learning a programming language. Perl (Larry Wall being the author) is probably a good choice, as much of its core approach is built around regexes.

Get Perl, rewrite your existing regexes (likely trivial), try them on your existing reports, done.

But emacs can certainly do what you ask, and it's a text editor.

Looks like the docs for Notepad++ are not great, but there must be a way to load complex expressions. Good luck.
posted by sammyo at 6:48 PM on March 15, 2010


Agreed with sammyo that a real programming language may be the way to go. It sounds like your regexes are already good, and that the report is basically in plain text (if you're using n++ now), so you're set up for easy processing. If Rhomboid's comment makes sense to you, that's probably the easiest way to do it.

On the other hand, you want users to be able to apply whatever you make easily... In that case, you might want to stick with n++. If so, try getting help on their wiki or forum.
posted by heliostatic at 7:00 PM on March 15, 2010


If you are dead set on doing this in a text editor and you are willing to get in touch with your hardcore nerd side, you can absolutely get GNU Emacs to do what you want. It comes in Windows flavors and has nice menus for open/save/save as. What you would do is write an elisp function which performs whatever replacements you wanted (using replace-regexp etc.), and then set up a keybinding to run the function on the current buffer ("buffer" meaning "open file" in Emacs-land). You can then tell your users, "Click File->Open..., open the report, hit Ctrl-Alt-R [or whatever] to reformat it, and then click File->Save."

(If you're currently stuck with Notepad++, Emacs is worth getting familiar with whether you use it for this or not. It's the most powerful basic text editor, hands down. Don't let its crusties scare you!)

If you asked me as a software developer to do this in a more sustainable way, I would suggest finding someone to write you a little Java Swing app which would let you run regex replacements from a simple config file and then let you provide the app+config file to users. Modern Windows versions of Java will let you run a Java app by double-clicking it just like an .exe file. A top-notch developer could do a tight job of this in a week or so. This is a little more bespoke than the Emacs solution but I think you may find Emacs daunting in the long term despite my earnest recommendation.
posted by mindsound at 7:53 PM on March 15, 2010
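
The "regex replacements from a simple config file" idea can be sketched without any GUI. The sketch below is in Python rather than Java Swing, and everything about it is an assumption for illustration: the one-rule-per-line, tab-separated format, the function names, and the sample patterns in EXAMPLE_CONFIG are all hypothetical.

```python
import re

def load_rules(config_text):
    """Parse 'pattern<TAB>replacement' lines; skip blank lines and '#' comments."""
    rules = []
    for line in config_text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        # Everything before the first tab is the pattern; the rest is the replacement.
        pattern, _, replacement = line.partition("\t")
        rules.append((re.compile(pattern), replacement))
    return rules

def apply_rules(text, rules):
    """Run every compiled rule over the text, in config-file order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text

# Hypothetical config: an empty replacement deletes whatever the pattern matches.
EXAMPLE_CONFIG = (
    "# delete page footers (made-up pattern)\n"
    "Page \\d+ of \\d+\t\n"
    "# squeeze runs of spaces down to one\n"
    " {2,}\t \n"
)
```

Keeping the rules in a separate file means each department could maintain its own list without touching the program itself.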


Ugh, talk about overkill. Put the perl commands in a .cmd or .bat file (with "%1" in place of filename.ext) and drop a shortcut to that in the SendTo directory under your profile dir. Then you can apply the regexes to any file in Explorer by simply right-clicking and choosing the name of the shortcut under 'Send To...'
posted by Rhomboid at 8:17 PM on March 15, 2010


Replace Text should be able to do whatever Notepad++ can do to the text, and stores the procedures so you can repeat them precisely each time with a couple of mouse clicks.
posted by MesoFilter at 11:40 PM on March 15, 2010


What, no love for Sed?
posted by DarkForest at 6:23 AM on March 16, 2010


And now I'm embarrassed to admit that it's been so long since I've used PHP that I didn't even think of it for this application. But it does allow use of regex and I remember enough PHP that I think I'll just post a form online that takes text input with checkboxes on what to remove, then outputs the cleaned-up version for printing. No install, and probably simple enough that everyone can use it, and the pickier people can customize as they like.
posted by johnofjack at 8:12 AM on March 16, 2010
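
The checkbox idea boils down to a table of named cleanups, of which only the ticked ones get applied. Here is a minimal sketch of that core logic in Python rather than PHP, with no web form attached. The cleanup names and patterns are hypothetical, since the thread never says what the real report's quirks are.

```python
import re

# Hypothetical cleanups: name -> (pattern, replacement).
CLEANUPS = {
    "blank_lines": (r"\n{3,}", "\n\n"),        # collapse runs of blank lines
    "dashes": (r"-{5,}", "-----"),             # shorten long dashed rules
    "trailing_space": (r"[ \t]+(?=\n)", ""),   # drop whitespace before a newline
}

def clean(text, selected):
    """Apply only the cleanups whose names were ticked, in the order defined above."""
    for name, (pattern, replacement) in CLEANUPS.items():
        if name in selected:
            text = re.sub(pattern, replacement, text)
    return text
```

A form handler would pass the names of the ticked boxes as selected, for example clean(report_text, {"blank_lines", "dashes"}), and print whatever comes back.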


Thanks for the suggestions, everyone.
posted by johnofjack at 8:13 AM on March 16, 2010


cygwin and sed?
posted by bastionofsanity at 11:49 AM on March 16, 2010


I've been working on this a bit when there aren't more pressing work things at hand. I finally figured out the two hardest parts of it this morning and it's up now, streamlining and reformatting 99-page documents in less than a second.

This project forced me to brush up on my PHP, and I also learned about trim() and preg_replace() which, if the execution time on those reports is any indication, is literally 15x as fast as ereg. And also I figured out backreferences! All told, a great project.

Thanks everyone.
posted by johnofjack at 1:10 PM on March 21, 2010
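
For readers who haven't met backreferences yet: a backreference reuses whatever a parenthesized group matched, either later in the pattern or in the replacement. A small Python sketch of both uses follows, plus strip() as a rough counterpart to PHP's trim(). The function name, sample patterns, and input are made up for illustration and are not taken from the actual report.

```python
import re

def tidy_line(line):
    """Trim a line, collapse a repeated word, and reorder a date."""
    line = line.strip()  # roughly what PHP's trim() does
    # Backreference inside the pattern: \1 must repeat the word captured by group 1.
    line = re.sub(r"\b(\w+)( \1\b)+", r"\1", line)
    # Backreferences in the replacement: reuse the captured year, month, and day.
    line = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\2/\3/\1", line)
    return line

print(tidy_line("  Due Due date: 2010-03-21  "))  # prints "Due date: 03/21/2010"
```

PHP's preg_replace() supports the same idea, writing the replacement references as $1 or \1.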


I ended up going with PHP, just because I already knew a bit of it, but I think sammyo had the best answer for power and flexibility and also the ease of use for others. A webform it was.
posted by johnofjack at 1:13 PM on March 21, 2010


It's done, though the code isn't as elegant as it could be. Still, when the various stops on the internet send it along quickly, it cleans up a big report in a very respectable time.

Soon it'll be moved onto the intranet (in case, FSM forbid, someone should accidentally paste in a report containing patron info--you can't have that going around the internet unencrypted).

This project was for a library catalog interface by a well-known vendor whose name slant rhymes with "Mercy Fine Yaks," and for that reason I thought the work might be of use to others (inelegant code and all). So I posted it for others to build on.

Thanks to Jessamyn for making me think of it--for posting an article some time back about libraries all over reproducing each other's work for various reasons.
posted by johnofjack at 3:01 PM on April 2, 2010

