Who hijacked my Google cache?
May 23, 2007 9:48 PM

How was my Google cache hacked? I have a web page that is fine, but the Google cache of the same page is a "spam" page with keywords linking to other web sites, random text and images, but also some valid links on my web site. How did that happen?

This is a web page for a domain I have hosted by TotalChoice Hosting.

I was checking out the stats for the domain and discovered some unexpected search terms ("penis enlargment") led to the web page. The web page is 10 lines of simple hand-coded HTML (a title and two links and no mention of penises). Bringing up the web page and looking at its source indicates that the page has not been compromised.

I did a Google search with some unusual keywords that my statistics indicated led to my web site and my web page was the only search result. But the search result includes text that has never been on my page. Clicking the regular link to my page brings it up as expected. Clicking the link to the Google cache brings up the spam page (cached in February).

TotalChoice Hosting says they haven't been compromised recently (well, at least the support guy I chatted with said so).

It's not a big deal. There's nothing important on the web site. But does anyone have a clue on what happened?
posted by ShooBoo to Computers & Internet (6 answers total) 1 user marked this as a favorite
Your hosting company could have configured their web server to serve the ad-filled page to the Google spider, although I'm not sure why they would do that.
posted by panic at 10:25 PM on May 23, 2007

You say the cache is from Feb, and the hosting co hasn't been compromised "lately". The obvious question I'd want answered is what was happening last winter, in the weeks before that cache was made.

I wouldn't focus too much on Google showing your site as result for words that aren't on your page. That's how PageRank is meant to work, and what makes things like googlebombing possible. So there could be an explanation as simple as another site linked to yours using those terms. If you want, use Advanced Search to search for sites linking to yours.

But if Google cache actually shows your site having mystery content, Occam's Razor does suggest that the site was compromised and defaced at least briefly back then. A good hosting co catches that stuff fast (though a better company prevents it from happening in the first place, grumble grumble...) and restores your site from a backup (a more transparent, responsible company would have also kept you informed of such things....). What does archive.org show for the URL in question? They stay 6 months behind, so you won't be able to see it's Feb condition yet but it'd be interesting to see what it shows for Dec and beyond.
posted by nakedcodemonkey at 10:36 PM on May 23, 2007

see its (not "it's") Feb condition

grr. argh.
posted by nakedcodemonkey at 10:38 PM on May 23, 2007

If you know how to do this... try spoofing your UserAgent to look like googlebot and see if you're getting the spammy page. There are exploits out there for some CMSs where, if the CMS is improperly configured, bad guys can muck with your files and add spam cloaking like this. See, e.g., this thread from the wordpress codex.
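For anyone who hasn't spoofed a UserAgent before, here is a rough Python sketch of the check. It fetches the same URL twice, once with a browser-style User-Agent and once with a Googlebot-style one, and flags the page if the two responses differ a lot. The function names, the 0.9 similarity threshold, and the browser User-Agent string are my own placeholders, not anything from the hosting company or Google. Treat a clean result as inconclusive: cloaking that keys on the crawler's IP address rather than the User-Agent header would not show up this way.

```python
import difflib
import urllib.request

# Googlebot's published User-Agent string; the browser string is only illustrative.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def fetch(url, user_agent, timeout=10):
    """Download a page while sending the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def looks_cloaked(browser_html, bot_html, threshold=0.9):
    """Return True when the two responses are less similar than the threshold.

    The threshold is an arbitrary assumption; a tiny static page should
    normally come back identical for both User-Agents.
    """
    matcher = difflib.SequenceMatcher(None, browser_html, bot_html, autojunk=False)
    return matcher.ratio() < threshold


def check_for_cloaking(url):
    """Fetch the URL as a browser and as Googlebot, then compare the results."""
    browser_html = fetch(url, BROWSER_UA)
    bot_html = fetch(url, GOOGLEBOT_UA)
    return looks_cloaked(browser_html, bot_html)
```

Calling `check_for_cloaking("http://example.com/page.html")` (with your own URL in place of that placeholder) returns True if the server hands the fake Googlebot something substantially different from what a browser gets.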

I would guess that this isn't a circumstance of a site getting defaced at exactly the right time. This seems like it'd be a pretty straight-forward problem to solve, but we'd need a little more info.
posted by toomuchpete at 10:46 PM on May 23, 2007

Sounds like something was compromised at some point (either the sites themselves or the DNS). It's worth pretending to be a search engine spider though in case there's still something dodgy going on (e.g. some kind of proxy on the server modifying the content).

Have you checked the Wayback Machine in case it caught a change?
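If you'd rather script that lookup than click through the archive, the sketch below uses the Internet Archive's Wayback availability endpoint (`https://archive.org/wayback/available`), which returns JSON describing the snapshot closest to a given timestamp. The helper names are made up for illustration, and the response handling assumes the `archived_snapshots` / `closest` layout that endpoint documents; adjust it if the service returns something else.

```python
import json
import urllib.parse
import urllib.request

WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp):
    """Build the query URL; timestamp is YYYYMMDD (optionally with hhmmss)."""
    query = urllib.parse.urlencode({"url": page_url, "timestamp": timestamp})
    return f"{WAYBACK_API}?{query}"


def closest_snapshot(response_text):
    """Pull (snapshot_url, snapshot_timestamp) out of the JSON reply, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def lookup(page_url, timestamp, timeout=10):
    """Ask the archive for the snapshot nearest to the given timestamp."""
    with urllib.request.urlopen(availability_url(page_url, timestamp), timeout=timeout) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

For example, `lookup("example.com/page.html", "20070201")` (placeholder URL) asks for whatever snapshot sits nearest to February 2007; comparing that against the live page would show whether the archive caught the spam version.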
posted by malevolent at 11:41 PM on May 23, 2007

another possibility
posted by nakedcodemonkey at 11:53 PM on May 23, 2007
