htaccess syntax woes
September 18, 2006 2:10 PM

I need to add a few lines to my htaccess file to deal with the fact that Google Image Search never updates its database.

One of the most common ways people find my blog is through Google Image Search. It has taken the keywords used to tag my images, as well as the pagerank of the site, and made me a prominent hit for many search terms. The trouble is, when I moved to WordPress, I broke all the links. They have not yet grown back.

Most of the hits are for archive pages for the old blog. For example, someone looking for photos of Oxford might be sent to:

http://www.sindark.com/archive/2006_01_01_sibilant_archive.html

Now, that page still exists. It is at:

http://www.sindark.com/blogger/archive/2006_01_01_sibilant_archive.html

Ideally, the person should be sent to the new post page in the WordPress architecture, but sending them to the new location of the Blogger archive page is better than dumping them into a 404 File Not Found screen.

How can I edit my htaccess file so that anyone looking for:

/archive/Xyear_Xmonth_Xday_sibilant_archive.html

will be sent to

/blogger/archive/Xyear_Xmonth_Xday_sibilant_archive.html

Of course, this is a temporary fix until Google finally understands the new architecture of my site.

A cleaner explanation of this is on my blog (SELF LINK!).
posted by sindark to Technology (16 answers total)
 
RewriteRule ^archive/([A-Za-z0-9-]+) blogger/archive/$1

try that on for size...

for reference, there is a wonderful mod_rewrite cheat sheet at ILoveJackDaniels.com
posted by hatsix at 2:27 PM on September 18, 2006


Response by poster: That doesn't seem to work. Trying to go to:

http://www.sindark.com/archive/2005_12_01_sibilant_archive.html

still leads to a 404 screen.
posted by sindark at 2:42 PM on September 18, 2006


Uh, does the RewriteRule go in htaccess or the apache conf file?
posted by RustyBrooks at 3:03 PM on September 18, 2006


in the .htaccess...

seems like you have it working now! (at least, it works for me!)
posted by hatsix at 3:29 PM on September 18, 2006


p.s. the reason the URLs above don't work is because of the date... you didn't post anything on Jan 01...

The url you posted in the comment:
http://www.sindark.com/archive/2005_12_01_sibilant_archive.html

works fine for me... the other URLs don't work

p.s. there should already be a rule in the htaccess for pretty URLs... you should be able to dupe this to get the rules you need.
posted by hatsix at 3:34 PM on September 18, 2006


Response by poster: hatsix,

I reverted to an old version that only fixes a single photo - the one of Tallinn at night. For all others, it is broken.
posted by sindark at 3:34 PM on September 18, 2006


yeah, scratch my comments about it working...
posted by hatsix at 3:38 PM on September 18, 2006


Response by poster: For the reference of anyone looking around, the htaccess file is now back in the state it was in when I posted this question.
posted by sindark at 3:47 PM on September 18, 2006


Try this in your .htaccess file

RewriteEngine on
RewriteRule ^archive/([A-Za-z0-9_.]+) blogger/archive/$1

if you've already got a line with RewriteEngine on, you don't need to have it again.
posted by cheaily at 4:14 PM on September 18, 2006


A couple of additions that couldn't hurt:

RewriteEngine on
RewriteBase /
RewriteRule ^archive/([A-Za-z0-9]+)\.html blogger/archive/$1\.html [NC,L]

(the [NC,L] should be on the same line as the rewrite rule).

If it doesn't work, you'll want to look at your apache error logs to see where it is going.*

Another option, if you *don't* want to use mod_rewrite, is to soft-link /archive to blogger/archive and then make sure Apache follows symlinks by putting 'Options +FollowSymLinks' in the .htaccess.

* I pulled the _. and added the .html because I wasn't sure exactly how that was supposed to escape in the [A-Za-z0-9_.] class. Anybody want to elaborate how that syntax is supposed to work?
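
For anyone following along, here is a rough Python sketch of how the three patterns suggested so far behave against the example path from the question. It assumes Python's re module treats these simple patterns the same way as the regex engine behind mod_rewrite, and that a rule in a root-level .htaccess sees the path without its leading slash. The dictionary labels and the helper name are made up for illustration.

```python
import re

# Path as a root-level .htaccess rule would see it (assumption: leading slash stripped).
path = "archive/2006_01_01_sibilant_archive.html"

# Patterns copied from the suggestions in this thread; the labels are illustrative only.
patterns = {
    "hyphen_class": (r"^archive/([A-Za-z0-9-]+)", 0),
    "underscore_dot_class": (r"^archive/([A-Za-z0-9_.]+)", 0),
    "alnum_plus_html": (r"^archive/([A-Za-z0-9]+)\.html", re.IGNORECASE),  # [NC]
}

def captured(pattern, flags, text):
    """Return what $1 would hold, or None if the rule would not fire."""
    match = re.search(pattern, text, flags)
    return match.group(1) if match else None

results = {name: captured(p, f, path) for name, (p, f) in patterns.items()}
for name, value in results.items():
    print(f"{name}: {value!r}")
```

If the same holds for mod_rewrite, the first pattern stops capturing at the first underscore and the third never matches, because neither class allows underscores. Only the class containing _ and . captures the whole filename.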
posted by fishfucker at 4:42 PM on September 18, 2006


Response by poster: "Anybody want to elaborate how that syntax is supposed to work? "

It sure isn't intuitive, is it?
posted by sindark at 5:23 PM on September 18, 2006


let me know how you make out with this...

WordPress's new system of doing rewriting internally instead of writing the rules to the .htaccess file directly has been giving me fits when trying to make custom rules for old content (similar situation to yours).
posted by rampy at 7:19 AM on September 19, 2006


Best answer: Try using a permanent redirect rather than just rewriting. How about something like this as the first rewrite rule in .htaccess?

RewriteRule ^archive/([0-9]{4}_[0-9]{2}_[0-9]{2}_[a-z0-9_-]+\.html)$ /blogger/archive/$1 [R=301,L]

(I've not checked or tested that so there may be mistakes)
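
A quick way to sanity-check the pattern without touching the server is to run it through Python's re module, which should behave the same as mod_rewrite's regex engine for a pattern this simple (an assumption, not something verified against Apache). The helper name redirect_target is made up for this sketch.

```python
import re

# The pattern and target taken from the RewriteRule above.
PATTERN = re.compile(r"^archive/([0-9]{4}_[0-9]{2}_[0-9]{2}_[a-z0-9_-]+\.html)$")
TARGET = "/blogger/archive/{}"

def redirect_target(path):
    """Return the path a matching request would be redirected to, else None.

    `path` is given without a leading slash, as a root-level .htaccess
    rule would see it (assumption).
    """
    match = PATTERN.search(path)
    return TARGET.format(match.group(1)) if match else None

old = redirect_target("archive/2006_01_01_sibilant_archive.html")
other = redirect_target("blogger/archive/2006_01_01_sibilant_archive.html")
print(old)
print(other)
```

The second call returns None, which suggests the redirected URL would not match the rule a second time and loop.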
posted by malevolent at 8:33 AM on September 19, 2006


* I pulled the _. and added the .html because I wasn't sure exactly how that was supposed to escape in the [A-Za-z0-9_.] class. Anybody want to elaborate how that syntax is supposed to work?

inside a character class block (i.e., []), the full-stop loses its status as a metacharacter, and simply matches a full-stop.
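
A small Python illustration of that, assuming Python's re module handles character classes the same way as the regex engine used by mod_rewrite in this case:

```python
import re

# Outside a character class, "." matches any single character.
dot_outside = re.fullmatch(r"a.c", "abc") is not None

# Inside a character class, "." only matches a literal full stop.
dot_inside_letter = re.fullmatch(r"a[.]c", "abc") is not None
dot_inside_stop = re.fullmatch(r"a[.]c", "a.c") is not None

# So [A-Za-z0-9_.]+ covers a whole archive filename, dots and underscores included.
whole_name = re.fullmatch(r"[A-Za-z0-9_.]+", "2006_01_01_sibilant_archive.html") is not None

print(dot_outside, dot_inside_letter, dot_inside_stop, whole_name)
# prints: True False True True
```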
posted by cheaily at 8:15 PM on September 19, 2006


oh. thx.
posted by fishfucker at 8:31 PM on September 19, 2006


Response by poster: malevolent,

Thanks a lot. I added that to my htaccess file, and it seems to be working.

I will post here again if I discover some strange and terrible error.
posted by sindark at 1:56 PM on September 24, 2006

