Regular Expressions? PHP? Existing App?
August 29, 2005 1:58 PM

I have a bunch of directories of static HTML files in which I need to: -> Find A -> Find B -> Replace A with B -> Save, close, repeat with every one of the 500 or so files. Anyone got an idea?

I've written some PHP functions that will do the find and replace bit but I'm having a bitch of a time figuring out how to run the script across my whole file system.
posted by TiggleTaggleTiger to Computers & Internet (21 answers total)
 
I don't know if you have access to perl, but I always do this with perl from the command line:

perl -p -i -e 's/A/B/g' *.html

I'm sure this is obnoxiously terse, but if you can use perl, this is all you need.
posted by xueexueg at 3:18 PM on August 29, 2005


Dreamweaver can do find/replace in an entire site or directory.
posted by Jase at 3:19 PM on August 29, 2005


On Windows you could do this with TextPad or HomeSite. On a Mac, try BBEdit or TextWrangler.
posted by hyperizer at 3:33 PM on August 29, 2005


Even easier, and completely free... download the 45-day evaluation of the awesome UltraEdit text editor here. This program has a "Replace in files" function that will do exactly what you want. That's if you're on a PC -- if you're on a Mac, BBEdit has the same capabilities, I believe.
posted by killdevil at 3:33 PM on August 29, 2005


Even easier than that, and more sustainably free is jEdit.
posted by elderling at 3:42 PM on August 29, 2005


what hyperizer said...

there is an excellent find/replace tool in textwrangler. It will traverse a whole directory structure and save you a ridiculous amount of time.
posted by freq at 3:43 PM on August 29, 2005


I'd also use perl, but finding B is a bit harder than a regex search&replace. It shouldn't be too hard to make a 2 or 3 line perl script (I never like remembering perl commandline opts, but I'm sure it's possible on the commandline).
posted by devilsbrigade at 3:51 PM on August 29, 2005


Response by poster: No. It isn't a simple search and replace. It is a bit more complicated than that. I have to find the contents of an <h4> and use them to replace the contents of a <title> tag.

So, say the document looks like this:

<title>Standard Document Title</title>
Blah, blah, blah...
<h4>Unique H4 tag contents</h4>


It would then need to look like this:

<title>Unique H4 tag contents</title>
Blah, blah, blah...
<h4>Unique H4 tag contents</h4>

posted by TiggleTaggleTiger at 3:52 PM on August 29, 2005
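
For reference, the transformation described above can be sketched in a few lines. This is only an illustration in Python, which nobody in the thread proposed; the helper name `retitle` and both patterns are made up for the example, and it assumes each file has at most one <title> and that the first <h4> is the one wanted.

```python
import re

# Hypothetical helper, not from the thread: copy the first <h4> text into <title>.
H4_RE = re.compile(r"<h4[^>]*>(.*?)</h4>", re.IGNORECASE | re.DOTALL)
TITLE_RE = re.compile(r"(<title[^>]*>).*?(</title>)", re.IGNORECASE | re.DOTALL)


def retitle(html: str) -> str:
    """Return html with the <title> contents replaced by the first <h4> contents."""
    h4 = H4_RE.search(html)
    if h4 is None:
        return html  # no <h4> found: leave the document untouched
    heading = h4.group(1).strip()
    # A function replacement keeps backslashes in the heading from being
    # interpreted as group references.
    return TITLE_RE.sub(lambda m: m.group(1) + heading + m.group(2), html, count=1)


sample = (
    "<title>Standard Document Title</title>\n"
    "Blah, blah, blah...\n"
    "<h4>Unique H4 tag contents</h4>\n"
)
print(retitle(sample))
```

Regexes are fine for uniform, machine-generated pages like these; messier HTML would probably call for a real parser.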


Response by poster: LOL... I guess the live preview isn't as accurate as I thought. Anyway, you get the gist of it, right?

And I don't think I have access to Perl. Or a Unix box. I've got Apache Tomcat and Apache running on my Windows XP box so I could do it in PHP or JSP or use some free-standing app.
posted by TiggleTaggleTiger at 3:54 PM on August 29, 2005


Response by poster: What devilsbrigade said. Sadly, my knowledge of Perl amounts to slightly more than my knowledge of alchemy. Not a whole lot...
posted by TiggleTaggleTiger at 3:56 PM on August 29, 2005


install activeperl.
use ref for all of this (I don't have time to write out the specifics & make sure it works)

open directory.
foreach file in directory:

open/read file.
/[h4](.*)[/h4]/
s/[title].*[/title]/[title]$1[/title]/
write file.


next file, or recurse for directory (there may be a more efficient way to do things - I tend to go for the brute force, obvious approach)

Change square brackets to angle brackets, mefi's code/pre is broken, and escape what you need to in the regexes. Using .* may also not be the best idea, you might want to explore a bit.
posted by devilsbrigade at 4:11 PM on August 29, 2005
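
The "open directory, for each file, read, replace, write, recurse" outline above might look roughly like this. It is a sketch in Python rather than the Perl the poster had in mind; `rewrite_tree` and its parameters are hypothetical names, and the replacement itself is passed in as a function so the walker stays generic.

```python
from pathlib import Path
from typing import Callable


def rewrite_tree(root: str, transform: Callable[[str], str], pattern: str = "*.html") -> int:
    """Apply transform to every matching file under root; return how many files changed."""
    changed = 0
    for path in Path(root).rglob(pattern):  # rglob descends into subdirectories
        if not path.is_file():
            continue
        # Assumption: the tags of interest are plain ASCII. latin-1 maps every
        # byte to one character, so decode/encode round-trips everything else
        # (including line endings) untouched.
        original = path.read_bytes().decode("latin-1")
        updated = transform(original)
        if updated != original:
            path.write_bytes(updated.encode("latin-1"))
            changed += 1
    return changed
```

Calling `rewrite_tree("site", some_function)` would rewrite files in place, so trying it on a copy of the directory first seems wise.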


There's a class in the comments of the readdir function on php.net that will read through an entire directory structure.

Instead of using it to print out the file location, modify the class to pull the file into a string, perform the regular expression, and then write the file back out if anything had to be replaced.
posted by alana at 5:01 PM on August 29, 2005


I'm quite rubbish at regex, but this sounds like something I could easily do using Dreamweaver.

There's a great Introduction to Regular Expressions in Dreamweaver up at Macromedia which covers finding and replacing content within html tags.
posted by bruceyeah at 5:48 PM on August 29, 2005


Best answer: definitely perl; you can get it for Windows. If you need help writing the perl code, I'm sure there are a few of us that can help.

Also, put it in a script. Much easier to comprehend than code on the command line, I find.
posted by polyglot at 6:02 PM on August 29, 2005


Best answer: Well, since you've already got PHP and you're familiar with it:

$my_dir = "your/directory/";
$output_dir = "where/you/wantthe/changed/ones/";

// readdir() needs a directory handle from opendir(), not a path string
$handle = opendir($my_dir);
while (false !== ($document = readdir($handle)))
{
    // readdir() returns bare names, so prefix the directory before testing
    if(is_file($my_dir.$document)){
        $opened = fopen($my_dir.$document,'r');
        $read = fread($opened,filesize($my_dir.$document));

        // do your replacin' here on $read; call the output $out_string

        $out_opened = fopen($output_dir.$document,'w+');
        fwrite($out_opened,$out_string);

        fclose($opened);
        fclose($out_opened);
    }
}
closedir($handle);

There's nicer ways to do it, but that's pretty safe. Make sure the output directory exists before you run it. From the sounds of the question, you know how to do the preg_replace part, but if you need help, drop a line.
posted by miniape at 6:51 PM on August 29, 2005


Best answer: You can still do that with a simple RE and a single perl command.

perl -i.bak -pe 'BEGIN { undef $/; } s!<title>Standard Document Title</title>(.*?)<h4>(Unique H4 tag contents)</h4>!<title>$2</title>$1<h4>$2</h4>!sig' *.html

-i.bak creates backup files with extension .bak. Use just -i if not desired.

undef $/ causes the RE to work on the entire file at once, instead of line mode. In the RE flags, /s lets . match across newlines, /i ignores case, and /g replaces every match.
posted by Rhomboid at 8:53 PM on August 29, 2005
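
To see why both the whole-file read and the /s flag matter, here is a small comparison. It is written in Python as an illustration only (`re.DOTALL` is Python's counterpart to Perl's /s); the sample text and pattern are invented for the demo.

```python
import re

text = "<title>Old</title>\nBlah, blah, blah...\n<h4>New heading</h4>\n"
pattern = r"<title>.*?</title>(.*?)<h4>(.*?)</h4>"

# Line by line, as perl -p works by default: no single line holds both tags.
line_hits = [line for line in text.splitlines() if re.search(pattern, line)]

# Whole file but without DOTALL: "." stops at each newline, so still no match.
no_flag = re.search(pattern, text)

# Whole file with DOTALL: "." crosses newlines and the pattern matches.
with_flag = re.search(pattern, text, re.DOTALL)

print(line_hits, no_flag, with_flag.group(2))  # prints: [] None New heading
```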


And obviously, you would have to tailor the RE to your specifics, but since you didn't give those specifics, it's kind of hard to reply. The point is that you can still do this with a RE and without writing all the "traverse every file and change its contents" code -- perl and the shell already have that covered.
posted by Rhomboid at 8:55 PM on August 29, 2005


The thing with php is you need everything to be happening on a server type environment, by default at least. Perl is better for this sort of thing, cause you can just run it anywhere, but as long as all the files you're working with are in php-land, then using the above script and hitting it in a browser will do the trick.
posted by 31d1 at 11:17 PM on August 29, 2005


31d1, that's flat out wrong. PHP can be run on the command line just like perl.
posted by alana at 12:22 AM on August 30, 2005


Well, I think it depends on if you have it installed locally, or are interacting with a remote server and editing files locally.

I wasn't trying to say it couldn't work that way, it's more that it's more or less the way it seems to be used in general.
posted by 31d1 at 6:28 AM on August 30, 2005


Yeah but, without either Perl OR PHP installed locally, you're SOL. Bottom line is that both can do it locally, provided you've got them installed.
posted by jikel_morten at 12:49 PM on August 30, 2005

