What pitfalls should I keep in mind when adapting PHP scripts to run on the command line?
March 24, 2010 12:49 PM

I'm trying to run some PHP scripts via the CLI instead of over HTTP. How do I make them play nice?

This is essentially a follow-up to this question from a while back. I'm using scripts from FeedForAll to join together RSS feeds (RSSmesh) and display them as HTML (RSS2HTML).

Because I intend to run these scripts fairly intensively and don't want the resulting HTTP requests and bandwidth to count towards my hosting quota, I am in the process of moving to running them directly on the web host's server from an umbrella PHP "batch" script, and calling that script via cron (this is a Linux server, by the way).
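
For concreteness, the crontab entry is along these lines (the schedule, the "batch.php" filename for the umbrella script, and the log path are placeholders for illustration, not my actual values):

# run the umbrella batch script every hour, logging output and errors
0 * * * * /usr/local/bin/php /srv/customers/mycustomer#/mydomain.com/www/a/batch.php >> /srv/customers/mycustomer#/mydomain.com/www/a/batch.log 2>&1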

Here's a (working) sample request over HTTP:

http://www.mydomain.com/a/rss2htmlcore/rss2html2.php?XMLFILE=http://www.mydomain.com/a/myapp/xmlcache/feed.xml&TEMPLATE=template.html

This will produce the desired HTML output. An example of how I want this to work on the command line:

/srv/customers/mycustomer#/mydomain.com/www/a/rss2htmlcore/rss2html2-cli.php /srv/customers/mycustomer#/mydomain.com/www/a/myapp/xmlcache/feed.xml /srv/customers/mycustomer#/mydomain.com/www/a/template.html

This is with the correct shebang line added to "rss2html2-cli.php". I could just as well specify the executable ("/usr/local/bin/php") in the command; I doubt it makes a difference, because I am able to run another script (that I wrote myself) either way without problems.
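
For reference, by "shebang line" I just mean the interpreter path on the very first line of "rss2html2-cli.php", i.e. something like:

#!/usr/local/bin/php
<?php
// ...rest of the script follows as before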

Now, RSS2HTML and RSSmesh are different in that, for starters, they include secondary files -- for example, both include an XML parser script -- and I suspect that this is where I am getting a bit in over my head.

Right now I'm calling exec() from the "umbrella" batch script, like so:

exec("/srv/customers/mycustomer#/mydomain.com/www/a/rss2htmlcore/rss2html2-cli.php /srv/customers/mycustomer#/mydomain.com/www/a/myapp/xmlcache/feed.xml /srv/customers/mycustomer#/mydomain.com/www/a/template.html", $output);

But no output is being produced. What's the best way to go about this, and what "gotchas" should I keep in mind? Is exec() the right way to approach this? It works fine for the other (simple) script, but that one writes its own output file. For this one I want to capture the output and write it to a file from within the umbrella script, if possible. I've also tried output buffering, but to no avail.
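
To be clear about the goal, this is roughly the shape of what I want the umbrella script to end up doing (a sketch only: "output.html" is a made-up filename, and the exec() call itself is exactly the part that isn't producing anything yet):

// The goal: run the CLI script and collect whatever it prints into $output...
exec("/srv/customers/mycustomer#/mydomain.com/www/a/rss2htmlcore/rss2html2-cli.php /srv/customers/mycustomer#/mydomain.com/www/a/myapp/xmlcache/feed.xml /srv/customers/mycustomer#/mydomain.com/www/a/template.html 2>&1", $output, $status);
// ...then write that out to a file (filename made up for the example)
file_put_contents("/srv/customers/mycustomer#/mydomain.com/www/a/myapp/output.html", implode("\n", $output));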

Do I need to pay attention to anything specific with regard to the includes? Right now they're specified in the scripts as include_once("FeedForAll_XMLParser.inc.php"); and the specified files are indeed in the same folder.

Further info:

- This is a Linux server.
- I have no direct access to the shell, so I can't test things directly on a command line; everything is via crontab.
- If you read the previous question: I opted for a home server last time, but I want to avoid that now because I don't want to be dependent on my ISP's uptime.
- Last time it was suggested I move to a hosting provider with larger or unlimited bandwidth quotas. I'd like to avoid this for now, as I am an otherwise satisfied customer (plus the data centres are near where I am, in the Netherlands).
- I will admit that support for the FeedForAll scripts leaves a lot to be desired, but I'd like to keep using their scripts if at all possible, if only because I know them and have been using them for a while. I have looked into SimplePie, but the FFA scripts do some things that I've seen no obvious solutions for with SimplePie, like limiting the number of items per individual feed (RSSmesh) or limiting the description length (RSS2HTML).
- Yahoo! Pipes is out; they cache their data for too long for my application.

-------------

Should you want to take a look at the code, here are the scripts as txt files:

RSS2HTML
RSSmesh
The included parser

Note that I have not yet amended these to handle $argv etc. I've dabbled with this, but thought it better to show the original scripts. I have adapted this one, however:

One of my own scripts that works fine with CLI
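
For what it's worth, the rough idea I've been dabbling with for the $argv handling is something like this at the top of "rss2html2-cli.php", mapping the two command-line arguments onto the XMLFILE and TEMPLATE parameters the script normally gets from the query string (whether the script reads them via $_GET or $_REQUEST is my guess, so treat this as a sketch):

<?php
// Sketch: translate CLI arguments into the query-string parameters the script expects.
// Usage: rss2html2-cli.php <path-to-xml-feed> <path-to-template>
if (php_sapi_name() == 'cli') {
    if ($argc < 3) {
        fwrite(STDERR, "Usage: {$argv[0]} XMLFILE TEMPLATE\n");
        exit(1);
    }
    $_GET['XMLFILE'] = $_REQUEST['XMLFILE'] = $argv[1];
    $_GET['TEMPLATE'] = $_REQUEST['TEMPLATE'] = $argv[2];
}
// ...followed by the unmodified RSS2HTML code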

-------------

If anyone has any thoughts to share on this it would be very much appreciated. Thank you in advance.
posted by goodnewsfortheinsane to Computers & Internet (4 answers total)
 
Best answer: I can't quite follow how you've got everything pieced together, especially how the "umbrella" batch script gets called from the crontab, but... assuming it runs the same way you'd run it from the command line, you probably want "php /srv/customers/mycustomer#/mydomain.com/www/a/rss2htmlcore/rss2html2-cli.php..." instead of "exec /srv/customers/mycustomer#/mydomain.com/www/a/rss2htmlcore/rss2html2-cli.php...". Using "php" instead of "exec" says "hey linux, run rss2html2-cli.php as a PHP script", whereas "exec" says "hey linux, use bash to run rss2html2-cli.php".

And, yeah, you're gonna have to modify the script to use $argv, and pray that none of the included pieces assume this is an HTTP request.
posted by and hosted from Uranus at 1:06 PM on March 24, 2010


Best answer: Actually you want:
exec("php /srv/customers/mycustomer#/mydomain.com/www/a/rss2htmlcore/rss2html2-cli.php...");

The other thing that's probably tripping you up is the includes. Make all of the includes use full paths instead of relative paths. Depending on how stuff is configured, they may not be aware of any path information at all.
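
For example, something like this instead of the bare filename (assuming the parser file really does sit next to the script that includes it):

include_once(dirname(__FILE__) . '/FeedForAll_XMLParser.inc.php');

Or, if you'd rather not touch every include, a chdir(dirname(__FILE__)); at the top of each script, before the includes run, should have much the same effect.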
posted by signalnine at 2:04 PM on March 24, 2010


Best answer: I know you don't want to regularly run this from home, but I'd try debugging this from home. It's easier if you can see the generated errors.
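
Failing that, you could make the errors visible where you can actually see them: something like this at the top of the CLI script should push errors into the output your cron job (or exec() call) captures, and/or into a log file you can read over FTP (the log path here is just an example):

error_reporting(E_ALL);
ini_set('display_errors', '1');  // send errors to stdout/stderr so exec()/cron picks them up
ini_set('log_errors', '1');
ini_set('error_log', '/srv/customers/mycustomer#/mydomain.com/www/a/php-errors.log');  // example path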
posted by Pronoiac at 3:54 PM on March 24, 2010


Response by poster: Turns out it was a variety of issues, the details of which are too mundane to go into. But you all pointed me in the right direction.

Thanks so much for your time.
posted by goodnewsfortheinsane at 6:58 PM on March 25, 2010


This thread is closed to new comments.