Sound notification for Wordpress hits?
January 3, 2011 4:24 PM

Realtime sound notification of Wordpress blog hits?

Hi, I have a new Wordpress blog, hosted on my domain. To keep me motivated, I'd love to make it go ding! whenever there's a hit. Is there a plugin that does this? If not, would it be possible to jury-rig something with say Applescript or Javascript? I'm on a Mac. It doesn't have to be real realtime, it could query it say once every 10 minutes and ding the appropriate number of times. Doable? Thanks in advance!
posted by Tom-B to Computers & Internet (17 answers total) 10 users marked this as a favorite
 
I think you'll find that even on a site nobody reads, search engine indexers visit very regularly. Your buzzer will be going off non-stop.

The implementation of this involves a command along the lines of "tail -f /var/log/apache2/access.log | perl -ne '`say "A visit!"`'".
posted by jrockway at 5:25 PM on January 3, 2011


That has the capability to go really really badly the day you get what the kids used to call 'slashdotted', as jrockway says. The day you're playing a 0.5-second sound for every hit, and getting a hit every 0.4 seconds is going to be a long day.

Perhaps a better implementation would just be a sound every ten minutes if there have been any hits, not a sound per hit?
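A rough sketch of that throttling idea in bash, in a slightly different form: at most one ding per interval, however many hits arrive. It assumes hits come in one per line on stdin (for example from the tail command above). The name throttle_dings is made up for illustration, and echo stands in for whatever actually plays the sound.

```shell
#!/bin/bash
# Hypothetical helper: reads hit lines on stdin and prints at most one
# "ding" per $1 seconds, no matter how many hits arrive in between.
throttle_dings() {
    local min_gap=$1 last=-1 now
    while IFS= read -r _line; do
        now=$SECONDS                      # bash's seconds-since-start counter
        if [ "$last" -lt 0 ] || [ $((now - last)) -ge "$min_gap" ]; then
            echo "ding"                   # stand-in for the real sound command
            last=$now
        fi
    done
}

# Five hits arriving at once collapse into a single ding (600 s = ten minutes).
printf '%s\n' hit1 hit2 hit3 hit4 hit5 | throttle_dings 600
```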

Also, you want this happening all day and all night?

It's entirely possible, but it definitely needs more thought. Where exactly is your blog hosted?
posted by AmbroseChapel at 7:12 PM on January 3, 2011


You could probably build such a concoction with Sitemeter and some really crazy scripting. (Or, you know, you could spend all those hours actually working on your blog so people want to visit it.) What you do want instead is... CHARTBEAT. Oh man. It's like crack. It actually can go ding—via your phone, as it can send text messages or push notifications to you when all kinds of thresholds are met.
posted by RJ Reynolds at 7:24 PM on January 3, 2011


Best answer: If you have ssh access to the server that hosts the blog you can absolutely do this in real time by adapting jrockway's example to operate over ssh, i.e. you are tailing the file on the remote server and seeing the output locally. However, you want to use tail -F and not -f since you want to continue following access.log when it's cycled. You also want "-n 0" so that you don't get 10 beeps the first time you run it. You can filter out search engine spiders by their User-Agent strings. Google's has the string "Googlebot/n.n" in it and Bing "bingbot/n.n" in it, so combining all of that you have something like

ssh user@host 'tail -n 0 -F /path/to/access.log | grep -v -E -i "(google|bing)bot/[0-9]"' | while read; do say "A visit!"; done
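To see what the grep part of that command does without touching a server, here is a small sketch that runs the same filter over a few made-up log lines. The sample lines, the IP addresses and the function names sample_log and drop_spiders are invented for illustration; real log lines will differ in detail.

```shell
# Made-up sample lines in Apache combined log format (not real traffic).
sample_log() {
    cat <<'EOF'
192.0.2.10 - - [03/Jan/2011:20:31:00 -0700] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
198.51.100.20 - - [03/Jan/2011:20:31:05 -0700] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
203.0.113.7 - - [03/Jan/2011:20:31:09 -0700] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5)"
EOF
}

# -v inverts the match, -E enables extended regexps, -i ignores case.
drop_spiders() {
    grep -v -E -i "(google|bing)bot/[0-9]"
}

# Only the last sample line (the ordinary browser) should be printed.
sample_log | drop_spiders
```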

Set up passwordless (public key) auth and you can launch this without having to type the password every time.
posted by Rhomboid at 8:31 PM on January 3, 2011 [2 favorites]


Response by poster: > I think you'll find that even on a site nobody reads, search engine indexers visit very regularly. Your buzzer will be going off non-stop.

That'd be ok, knowing that my content is being indexed would be motivating too.

> The implementation of this involves a command along the lines of "tail -f /var/log/apache2/access.log | perl -ne '`say "A visit!"`'".

That's totally over my head, where do I write this? The Terminal? What should I read/study to understand this?

> That has the capability to go really really badly the day you get what the kids used to call 'slashdotted', as jrockway says. The day you're playing a 0.5-second sound for every hit, and getting a hit every 0.4 seconds is going to be a long day.

Yeah, maybe an if clause somewhere? Say if hits > 100 play a different sound?

>Also, you want this happening all day and all night?

Yeah, why not? I'd set the volume really low, or assign some smooth sound file, it'd be almost like listening to my blog breathe.

>It's entirely possible, but it definitely needs more thought. Where exactly is your blog hosted?

http://bluehost.com
The blog is http://toro.tv.br/blog/

>You could probably build such a concoction with Sitemeter and some really crazy scripting.

Ok, what kind of scripting, where? I'm willing to learn, but what should I study?

> If you have ssh access to the server that hosts the blog you can absolutely do this in real time by adapting jrockway's example to operate over ssh, i.e. you are tailing the file on the remote server and seeing the output locally. However, you want to use tail -F and not -f since you want to continue following access.log when it's cycled. You also want "-n 0" so that you don't get 10 beeps the first time you run it. You can filter out search engine spiders by their User-Agent strings. Google's has the string "Googlebot/n.n" in it and Bing "bingbot/n.n" in it, so combining all of that you have something like

ssh user@host 'tail -n 0 -F /path/to/access.log | grep -v -E -i "(google|bing)bot/[0-9]"' | while read; do say "A visit!"; done

Awesome, but way too technical for me, what should I learn to get this????
posted by Tom-B at 11:49 AM on January 4, 2011


Best answer: Hmm. Well, Bluehost's splash page says they support SSH (aka Secure Shell), but digging into their knowledge base, you have to send them a copy of a government ID to enable that feature. Assuming you're willing to do that, you just need to find out three items. 'user' in the above is your account username, probably the one you use to log in to cPanel. 'host' is most likely toro.tv.br. Before trying anything else you should verify that ssh works by typing just 'ssh [whatever]@toro.tv.br' at a terminal. If it's working you should see a message asking you to type "yes" to confirm that the host you're trying to connect to is who it says, and then a password prompt. If the password is accepted you should be at a prompt that says "user@host: ~$ " or something along those lines. Press ctrl-d to log off and quit.

Now you need to know the location of your access log, which is '/path/to/access.log' in the above example. The cPanel default would be '/usr/local/apache/domlogs/toro.tv.br', so that's what I'd try first. I also recommend testing this in parts. You've already verified that ssh itself works, next try

ssh user@toro.tv.br 'tail -n 0 -F /usr/local/apache/domlogs/toro.tv.br'

Let it sit for a while if it appears that nothing is happening. It should print a log line to your screen in near-realtime for every hit to the site. If that works you can move on to the noise making, if it doesn't then you'll need to figure out which part is wrong, most likely the path to the log file.

If you don't want to send them a copy of a government ID to enable ssh, then you can still easily accomplish this but it would probably involve writing a few lines of PHP and a cron job and some more gymnastics which would require some coding experience unfortunately.
posted by Rhomboid at 12:34 PM on January 4, 2011


Best answer: Also, I meant to add that this method would result in one 'event' (noise/ding/whatever) for every hit, where hit means any HTTP resource. Generally a page load of a blog contains a number of assets: the page itself, some images, some stylesheets, some JS, etc. Each of those is a 'hit' from the standpoint of the access log, so you will most likely want to do some filtering so that only a hit for the 'page itself' triggers an action. You might accomplish that by adding one or more grep clauses along the lines of

ssh user@toro.tv.br 'tail -n 0 -F /usr/local/apache/domlogs/toro.tv.br | grep -v -E -i -e "(google|bing)bot/[0-9]" -e "/wp-content/"' | while read; do say "A visit!"; done

That will exclude search engine spiders and anything containing the string "/wp-content/" which should catch all the extraneous page assets. Also, the 'say "A visit!"' part can be any command, so for example you might want

... | while read; do afplay /System/Library/Sounds/Glass.aiff; done

(I think afplay is only available in 10.5 and higher, but there are many other ways of making a sound from the command line if you google.)
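Here is a local sketch of how the two -e patterns and the swappable action fit together, run against made-up log lines rather than a live server. The function names (on_hit, sample_page_view, page_hits_only), the IP addresses and the file paths are all invented for illustration.

```shell
# Stand-in for the noise-making command. On a Mac you would replace the
# echo with: say "A visit!"   or   afplay /System/Library/Sounds/Glass.aiff
on_hit() {
    echo "ding"
}

# Made-up requests: one page view with two assets, plus one crawler hit.
sample_page_view() {
    printf '%s\n' \
      '203.0.113.7 - - [04/Jan/2011:13:05:00 -0700] "GET /blog/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Macintosh)"' \
      '203.0.113.7 - - [04/Jan/2011:13:05:01 -0700] "GET /blog/wp-content/themes/x/style.css HTTP/1.1" 200 900 "-" "Mozilla/5.0 (Macintosh)"' \
      '203.0.113.7 - - [04/Jan/2011:13:05:01 -0700] "GET /blog/wp-content/uploads/a.png HTTP/1.1" 200 4000 "-" "Mozilla/5.0 (Macintosh)"' \
      '192.0.2.9 - - [04/Jan/2011:13:05:02 -0700] "GET /blog/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"'
}

# Each -e adds another pattern; with -v a line is dropped if ANY of them match.
page_hits_only() {
    grep -v -E -i -e "(google|bing)bot/[0-9]" -e "/wp-content/"
}

# Four log lines go in, but only the page itself triggers the action.
sample_page_view | page_hits_only | while read -r _line; do on_hit; done
```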
posted by Rhomboid at 1:05 PM on January 4, 2011


Best answer: Oh, and press ctrl-c to stop it.
posted by Rhomboid at 1:12 PM on January 4, 2011


Response by poster: Rhomboid, that is awesome! I am sending ID right now.

But also scary! Why do they need ID, is SSH that powerful?

What should I know to use it responsibly? Any basic dos and don'ts?

Thanks!
posted by Tom-B at 8:43 AM on January 5, 2011


Best answer: The host wants ID because ssh access is associated with abusive behavior, such as installing and running your own daemon on a random high port (which would bypass the normal http and ftp bandwidth logging, so it could be e.g. used to share warez), or running vulnerability scanners and otherwise trying to hack the host or one of the other accounts on the host if it's a shared host. This kind of behavior in turn is associated with using stolen credit cards, so I guess they figure that an ID with a name that corresponds to the name on the account is enough of a deterrent.

However, in theory ssh access does not really let you do anything you couldn't already do by writing a PHP/perl/python/ruby script or by tweaking options in cPanel. If you wanted to run arbitrary commands you could just write a PHP script that runs those commands and returns the result (and in fact there are numerous PHP scripts that wrap this and give you the equivalent of a shell without actual shell access), or you could enter a cron job in cPanel to run the commands as a one time scheduled job that runs one minute from now. Some hosts try to plug these holes by running PHP in safe mode or by disabling functions like system() and friends, but I don't know what they think they're going to do about perl/python/ruby which all have system()-like functions that can run arbitrary commands, or for that matter a CGI script with an arbitrary command as the hashbang. A much better approach would not try to futilely limit things at this level but would instead use OS/kernel-level restrictions like forbidding accept() or regularly killing long-running processes. However, shared web host security is kind of a cesspit of bad ideas and cargo cult thinking and so these half-measures continue, because the correct ones don't work with cPanel, or they break some scripts, or they cause higher CPU load and worse performance.

By the way, I forgot that in some circumstances OS X users have tcsh as their default shell instead of a Bourne shell, and C shell has no read command so it won't like '| while read; ...'. If you are in that situation you would want to use jrockway's version of

... | perl -ne '` command-goes-here `'

Note that's singlequote backtick command backtick singlequote. It could also be written as perl -ne 'qx(command)' or using any other kind of matching brace if the command itself contains parens, or any single convenient punctuation char that's not in the command, however if the command contains a single quote you will require some quoting gymnastics. Alternatively, since you probably don't want to type/paste this every time you could put the command in a text file and mark it executable (chmod +x filename) and then execute that instead. In that case you can make the first line of the file #!/bin/sh and not worry about tcsh incompatibility.
posted by Rhomboid at 12:33 PM on January 5, 2011


Response by poster: AWESOME! It's working! Thanks!!!!

I sent ID, enabled ssh, found the access logs, tailed them and told it to play a sound whenever they're accessed. This is what I'm using:

ssh user@toro.tv.br 'tail -n 0 -F path/to/logs' | while read; do afplay /System/Library/Sounds/Pop.aiff; done

So we're getting like 10 or 12 pops every time a page is accessed. Progress!

But I added the grep part and it's making no sound whatsoever:

ssh datoroco@toro.tv.br 'tail -n 0 -F access-logs/toro-tv-br.datoro.com | grep -v -E -i -e "(google|bing)bot/[0-9]" -e "/wp-content/"' | while read; do afplay /System/Library/Sounds/Pop.aiff; done

Now I understand grep, so I guess it needs some tweaking. Also, I read up and the concept of pipes | makes sense to me. But how do I make conditionals? Say if the hits are from my IP play a different sound, if they're from Google play a different one. If the request is for our portfolio page, play a ka-ching sound :-) and so on.

So what are these commands I'm writing? Are they PHP?
posted by Tom-B at 5:10 PM on January 6, 2011


Best answer: > But I added the grep part and it's making no sound whatsoever

Most likely it's working, but just on a delay. That's due to IO buffering. By default many programs detect when their output is going to a file or pipe and not a terminal/pty and instead save up their output and only send it when enough has accumulated. This is a speed optimization because it means fewer system calls for the same amount of data, but it results in you not hearing anything until, say, 20 hits have occurred and then you hear all 20 pops at once. The reason it worked before adding the grep is that tail by default does not buffer its output while grep does. The solution is to disable that and tell grep to output lines as they come in, with the --line-buffered option: grep -v -E -i --line-buffered -e "...".
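A small sketch that makes the buffering visible locally. The slow_hits function is a made-up stand-in for a quiet access log that produces one line per second; nothing here talks to a server.

```shell
# Hypothetical slow source: one line per second, like a quiet access log.
slow_hits() {
    for n in 1 2 3; do
        echo "hit $n"
        sleep 1
    done
}

# With --line-buffered each "got" line should show up about a second apart.
# Drop the option and grep may hold its output until the source finishes.
slow_hits | grep --line-buffered "hit" | while read -r line; do
    echo "got: $line"
done
```

Running it once with the option and once without is probably the quickest way to convince yourself that the delay comes from grep and not from ssh or tail.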

> But how do I make conditionals?

There's a lot of ways to skin that cat. You're probably not going to be able to do it very easily with grep, so one option would be to use the shell's built in pattern matching with a 'case' statement. This also starts to get a lot longer than what you'd ordinarily put on a single line so you'd probably want to put the commands in a file, like this. Save that as a plain text file and then invoke it either with "bash filename" or make it executable (chmod +x filename) and then invoke it as "./filename" if it's in the current working directory or "path/to/filename" if it's elsewhere.
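As a minimal sketch of what such a case-based script might look like (not necessarily the exact script meant above): the function name pick_sound, the IP address, the /about path and the choice of sound names are all placeholders to adapt, and the demo only prints the chosen sound name instead of playing it.

```shell
#!/bin/bash
# Hypothetical sketch: choose a sound name for one access log line.
# The IP, the path and the sound names are placeholders, not real values.
pick_sound() {
    case "$1" in
        */wp-content/*)          : ;;             # page assets: stay silent
        *Googlebot/*|*bingbot/*) echo "Tink" ;;   # search engine spiders
        "203.0.113.7 "*)         echo "Basso" ;;  # your own IP (placeholder)
        *"GET /about"*)          echo "Glass" ;;  # a page you care about
        *)                       echo "Pop" ;;    # everything else
    esac
}

# Demo on made-up log lines; each line prints the sound it would trigger.
printf '%s\n' \
  '192.0.2.9 - - [06/Jan/2011:21:00:00 -0700] "GET /blog/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"' \
  '203.0.113.7 - - [06/Jan/2011:21:00:01 -0700] "GET /blog/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Macintosh)"' \
  '198.51.100.4 - - [06/Jan/2011:21:00:02 -0700] "GET /about HTTP/1.1" 200 3000 "-" "Mozilla/5.0 (Windows)"' \
  '198.51.100.4 - - [06/Jan/2011:21:00:03 -0700] "GET /blog/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows)"' |
while IFS= read -r line; do
    pick_sound "$line"
done
```

In real use the lines would come from the ssh/tail pipeline instead of printf, and the loop body would hand the result to afplay (something like /System/Library/Sounds/NAME.aiff) whenever pick_sound prints a name.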

Note that these shell pattern matching things are not the same as regular expressions that you use with grep. They're less powerful, but more simple. Some of the differences:
  • In shell pattern matching ? means any one character, * means any number of characters. In regexp, * and ? are modifiers and can't be used by themselves, and . means any one character and .* means any number of characters.
  • Each case pattern must match the whole line, so if you want to match a line containing 'foo' anywhere you need "*foo*".
  • Matching is case-sensitive.
  • Multiple patterns can be combined with | and the [] operators work for ranges, but most of the other regexp features won't work.
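Those differences can be checked with a few lines of bash. This is only a sketch: glob_match and regex_match are made-up helper names, and the log line is invented.

```shell
# Hypothetical helpers: test one line against a shell pattern or a regexp.
glob_match()  { case "$2" in $1) return 0 ;; *) return 1 ;; esac; }
regex_match() { printf '%s\n' "$2" | grep -q -E "$1"; }

line='203.0.113.7 - - "GET /blog/ HTTP/1.1" 200 "Mozilla/5.0 (compatible; Googlebot/2.1)"'

# A shell pattern has to cover the whole line, so a bare word fails...
glob_match 'Googlebot' "$line"          || echo "glob 'Googlebot': no match"
# ...while * on both sides lets it sit anywhere in the line.
glob_match '*Googlebot/[0-9]*' "$line"  && echo "glob '*Googlebot/[0-9]*': match"
# Shell patterns are case-sensitive.
glob_match '*googlebot*' "$line"        || echo "glob '*googlebot*': no match"
# A regexp matches anywhere in the line without extra wildcards.
regex_match 'Googlebot/[0-9]' "$line"   && echo "regexp 'Googlebot/[0-9]': match"
```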

When tweaking this it's also a good idea to see the text you're trying to match against. You can view the log files directly, or just run the ssh command with just the tail and without the grep or 'while read' parts to spit them out raw as they come in. If you're curious about the fields, google for the Apache combined log format, but in summary it's basically all the information about the request on one line, including IP, the type and path of the request, the response status code and length, the referer, and the User-Agent.

Note however that everything after the # in a URL is local to the browser and is not sent as part of the request. So when fetching a URL like http://example.com/#foobar, the server will log a request that says only "GET /". I noticed that your portfolio page is set up to work this way so you might have some trouble distinguishing it from a request for the root page in the log file. There could be a workaround for that -- for example if your portfolio page calls for a specific page asset that's not used anywhere else on the site you could trigger on that to have a distinctive sound in that case.

But that's just one possible skeleton of a solution. Here's an example that uses perl. It's the same basic idea but in a different language. And here you're back to using regexps if you prefer. (Note also that I used the given/when case syntax that was added in perl 5.10, so if your version of perl is 5.8 that script won't run. I think OS X 10.6 was the first version to ship with perl 5.10 but you can certainly install it on older versions, or the script could be modified to use the old style perl switch statements.)

> So what are these commands I'm writing? Are they PHP?

No, not PHP. I'd describe it as shell scripting. Shell scripting both is and isn't its own language. Essentially what you're doing is piecing together various disparate programs (ssh, tail, grep, etc), where each command has its own entire universe of options and syntax, so it's not really a language. But at the same time it is a language because the shell itself in addition to letting you piece together programs has its own syntax for making case statements, loops, variables, etc. which is what the "while ...; do ... done" and "case ... in ... esac" stuff is.
posted by Rhomboid at 9:05 PM on January 6, 2011


Response by poster: Amazing!!!! it's working ! :-)

So I added --line-buffered to grep, works flawlessly. I'm also grepping for /wp-content/ and my IP, so I got one pop per hit, but not if it's from myself.

ssh datoroco@toro.tv.br 'tail -n 0 -F access-logs/toro-tv-br.datoro.com' | grep -v -E -i --line-buffered -e "/wp-content/" -e "187.37.78.239" | while read; do afplay /System/Library/Sounds/Pop.aiff; done

I'll leave the conditionals for later, but now I know what to study, thanks! Also, I'm still getting multiple pops on pages that are not managed by wordpress, but I think I'll be able to hack something.

THANKS!!!!! Really amazing!!!!
posted by Tom-B at 4:18 PM on January 7, 2011


Response by poster: ok, so I'm playing around with grep and reading up a bit on shell scripting and conditionals, everything's working great.

BUT!

From time to time it just stops. Once it said "Read from remote host toro.tv.br: Connection reset by peer", the other time it just quit by itself and went back to the prompt.

Why does this happen and how do I prevent it???
posted by Tom-B at 8:40 AM on January 8, 2011


"Connection reset by peer" means that the server disconnected you, usually because of inactivity.

You could keep it running by just wrapping the whole thing in an infinite loop... if it stops for any reason, it just starts up again. If you're running it in a bash script, something like this:

while true; do
your command
done

Infinite loops can be dangerous, as they can run out of control... there isn't any stop condition. You have to manually kill them. But it sounds like that's what you're after.
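If the lack of a stop condition worries you, one possible variation is a loop that pauses between attempts and gives up after a fixed number of tries. This is only a sketch: keep_running is a made-up name, and the demo uses echo where the real ssh pipeline (saved in its own script file) would go.

```shell
# Hypothetical wrapper: rerun a command whenever it exits, pause between
# attempts, and give up after max_tries so it cannot spin forever.
keep_running() {
    local max_tries=$1 pause=$2 tries=0
    shift 2
    while [ "$tries" -lt "$max_tries" ]; do
        "$@"                                   # run the command until it exits
        tries=$((tries + 1))
        echo "command exited (attempt $tries of $max_tries)" >&2
        sleep "$pause"                         # avoid hammering the server
    done
}

# Demo with a command that exits immediately: three attempts, no pause.
keep_running 3 0 echo "connected"
```

Since "$@" runs a single command rather than a pipeline, the ssh ... | while read ... line would need to live in a script file that is passed in as the command.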
posted by team lowkey at 1:03 PM on January 8, 2011


Ah, that might be the host automatically terminating long-running processes. As I alluded to earlier, they tend to associate that with abusive behavior. But what you're doing here isn't abusive, and the tail and ssh processes hanging around on their server are taking up next to zero resources so I wouldn't feel too bad about it. What you can do to fix it is just have the script automatically reconnect when it's disconnected. There are tools like autossh that can handle this for you, but you could also just take your current command and wrap it in another loop, i.e.

while :; do [entire command]; done

This will also require public key auth so that you don't have to be there each time to type a password, but that's probably a good thing. If you haven't set that up, you can do it by running ssh-keygen and following the prompts; most likely you can just hit enter to accept the defaults at each prompt. At some point it will ask you if you want to use a passphrase for the key. If you have ssh-agent set up it's possible to have it remember this passphrase for you, but it's probably easier just to leave it empty. This doesn't affect the security of your site, it only has to do with what happens if someone were to steal your private key from your local computer.

After ssh-keygen has finished you should have a '.ssh' directory under your home directory, and inside that directory two files named 'id_rsa' and 'id_rsa.pub'. The one ending in .pub is the public half of your key, the other is the private half. The way this works is that the private half is meant to be kept secret at all times, but the public half can be shown or given to anyone without compromising security. This property means that you can upload your public key to any server or service that you might want to authenticate with, without worrying about someone else seeing it and being able to use it to access your systems. So copy the contents of the id_rsa.pub file (either type 'cat ~/.ssh/id_rsa.pub' and then copy it from the terminal, or open it in a text editor) and then go to your site and edit the file '.ssh/authorized_keys' and paste the contents of that key there. Create the .ssh directory under your home directory if it doesn't exist yet. There's probably an option in cPanel to automate this if you prefer.
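For anyone who would rather script the server-side part, here is a rough sketch of the same steps as a shell function. The name add_public_key is made up, the key in the demo is fake, and the demo deliberately works in a scratch directory instead of a real ~/.ssh. The chmod values are a common convention, on the assumption that the server's sshd is picky about loose permissions.

```shell
# Hypothetical helper mirroring the manual steps: create the .ssh directory
# if needed and append a public key to its authorized_keys file.
add_public_key() {
    local pubkey_file=$1 ssh_dir=$2
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"                       # directory private to the owner
    cat "$pubkey_file" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"       # key list private to the owner
}

# Dry run in a scratch directory with a fake key, so nothing real is touched.
scratch=$(mktemp -d "${TMPDIR:-/tmp}/keydemo.XXXXXX")
echo "ssh-rsa AAAAexample user@laptop" > "$scratch/id_rsa.pub"
add_public_key "$scratch/id_rsa.pub" "$scratch/dot-ssh"
cat "$scratch/dot-ssh/authorized_keys"
```

On the server the second argument would be the .ssh directory under your home directory, and the first would be a copy of your real id_rsa.pub.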

After you've got that set up, the ssh command should no longer ask you to type a password each time, so it can reconnect when disconnected without you having to be present.
posted by Rhomboid at 1:14 PM on January 8, 2011


Couple of tiny nits in Rhomboid's sh script example: when you're using a construction like
generate-some-lines | \
while read line
do
     # use $line for something
done
you don't actually need the trailing \ on the end of a line that ends with a pipe | because the shell doesn't distinguish newlines from other whitespace between a | and the token that follows it.

Also, you probably want to include the -r option in the read command unless you have a specific reason not to. Without this, read will interpret any \ character inside what's being read as an escape, and eat it. For example, this
echo 'foo\ bar baz qux' |
{
    read f1 f2 f3
    echo "f1: $f1"
    echo "f2: $f2"
    echo "f3: $f3"
}
produces
f1: foo bar
f2: baz
f3: qux
while this
echo 'foo\ bar baz qux' |
{
    read -r f1 f2 f3
    echo "f1: $f1"
    echo "f2: $f2"
    echo "f3: $f3"
}
produces
f1: foo\
f2: bar
f3: baz qux

posted by flabdablet at 7:43 PM on January 8, 2011

