Complex htaccess question for Blogger to WordPress migration
May 23, 2006 11:29 AM

I recently shifted the content management for my blog (www.sindark.com) from Blogger to WordPress. I shifted the Blogger version into a sub-directory, so that people who prefer to access old posts in that format can. Now, I want to use htaccess to automatically redirect people who try to access the Blogger archives in the old location. I want them sent to the Blogger files in the new location.

The old static HTML Blogger files were kept in three directories:
www.sindark.com/2005/
www.sindark.com/2006/
www.sindark.com/archive/

The new equivalents are:
www.sindark.com/blogger/2005/
www.sindark.com/blogger/2006/
www.sindark.com/blogger/archive/

The big problem is this: WordPress also uses www.sindark.com/2005/ and www.sindark.com/2006/ as directories, though it doesn't actually put any files in them. Because of that, the code I tried below doesn't work. It does redirect people properly for the Blogger pages, but it breaks WordPress:

Redirect 301 /archive http://www.sindark.com/blogger/archive/
Redirect 301 /2005 http://www.sindark.com/blogger/2005/
Redirect 301 /2006 http://www.sindark.com/blogger/2006/

Since all the Blogger files are .html and all the WordPress files are .php, there may be some way to redirect requests for HTML pages in those directories to the new Blogger area, while leaving .php requests alone.
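
One possible way to do that without touching the WordPress URLs is mod_alias's RedirectMatch, which takes a regular expression rather than a path prefix. The following is a minimal, untested sketch. It assumes the Blogger pages all end in .html, that no WordPress URL under those paths does, and that the host allows mod_alias directives in .htaccess. How it interacts with WordPress's own rewrite rules would need checking on the actual server.

```apache
# Redirect only .html requests under the three old Blogger directories.
# WordPress permalinks such as /2006/05/21/bug-reports-thread/ do not end
# in .html, so they do not match this pattern.
RedirectMatch 301 ^/(2005|2006|archive)/(.*\.html)$ http://www.sindark.com/blogger/$1/$2
```

Here $1 is the directory name and $2 is the rest of the path, so /2005/foo.html would be sent to /blogger/2005/foo.html.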

Right now, there are two copies of all Blogger content: one at the new location and a residual copy in the old location. Having it there doesn't bother WordPress, but I would really prefer to have all the Blogger stuff off in its own section: hence the need for this redirection.

If anyone knows how to code that, or can suggest a better approach, I would be most appreciative. The major reason I want to do this is that the old Blogger pages have all been listed on Google and get frequent hits through there. I don't want people to start getting 404 errors instead.

This issue is also being discussed at: http://www.sindark.com/2006/05/21/bug-reports-thread/
posted by sindark to Computers & Internet (4 answers total)
 
If mod_rewrite is installed, you can use it to do as specific a redirection as you want. I'm not sure I'm following exactly what it is you're after, but as an example, your .htaccess in the 2005 directory would look something like this:
RewriteEngine on
RewriteRule ^(.*)\.html$ http://www.sindark.com/blogger/2005/$1.html [R]
This is untested and probably error-ridden, but it should get you on the right track. It is supposed to match any URL that ends with .html and rewrite it to /blogger/2005/whatever_was_matched.html. The [R] flag should then force an external redirect, so the browser requests the page at the new URL.
posted by team lowkey at 12:58 PM on May 23, 2006


Response by poster: I tried adding the following lines to my htaccess file:

RewriteRule /2005/^(*)\.html$ http://www.sindark.com/blogger/2005/$1\.html [R]
RewriteRule /2006/^(*)\.html$ http://www.sindark.com/blogger/2006/$1\.html [R]
RewriteRule /archive/^(*)\.html$ http://www.sindark.com/blogger/archive/$1\.html [R]


And I get the following error whenever I try to access the front page of the site:

Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.

Please contact the server administrator, support@supportwebsite.com and inform them of the time the error occurred, and anything you might have done that may have caused the error.

More information about this error may be available in the server error log.

Apache/1.3.33 Server at www.sindark.com Port 80


Any idea what's wrong?
posted by sindark at 10:05 AM on May 29, 2006


The thing I notice immediately is that the caret should be at the beginning of the rule. In RewriteRule speak, ^ means start of line, and is just used to make sure you're not matching anything deeper in the directory structure, like your other 2005 directory. Also, make sure that you have "RewriteEngine on" before the rule in the .htaccess file. And you might as well throw a FollowSymLinks in there, in case it isn't enabled in the Apache config. If you have access to the log file, it should tell you specifically what is failing.
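
As a rough sketch of that caret fix applied to the three rules in the site's top-level .htaccess (untested, and assuming these lines sit above the WordPress rewrite block so they run first), it might look like the following. The (.*) capture group and the [R=301,L] flags are additions for illustration: the pattern needs a dot before the star to be a valid regular expression, and 301 matches the permanent redirect asked for in the question.

```apache
RewriteEngine On

# In .htaccess the pattern is matched against the path without its
# leading slash, so each rule starts with ^ followed by the directory name.
RewriteRule ^2005/(.*)\.html$ http://www.sindark.com/blogger/2005/$1.html [R=301,L]
RewriteRule ^2006/(.*)\.html$ http://www.sindark.com/blogger/2006/$1.html [R=301,L]
RewriteRule ^archive/(.*)\.html$ http://www.sindark.com/blogger/archive/$1.html [R=301,L]
```

Because every pattern ends in \.html$, requests for WordPress permalinks under /2005/ and /2006/ should fall through to the WordPress rules untouched.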

I would also suggest you put a separate .htaccess file in each directory, with a single rule for that directory. That way you're not processing every URL request and possibly rewriting/breaking things on the rest of the site, only in those specific locations. You could add some RewriteConds to make sure it only matches what you want it to, but it's simpler to just physically separate the rules into the appropriate directories. And definitely easier to debug. Try putting this in /2005/.htaccess :
Options +FollowSymLinks
RewriteEngine on
RewriteRule ^(.*)\.html$ http://www.sindark.com/blogger/2005/$1.html
..and see if you can get just that one directory working first.
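
For comparison, the RewriteCond approach mentioned above might look roughly like this in the site's top-level .htaccess. This is an untested sketch. It assumes %{DOCUMENT_ROOT} points at the directory that contains /blogger/ (not always true on shared hosting) and that the rules are placed before any WordPress rewrite rules.

```apache
RewriteEngine On

# Only consider requests under the three old Blogger directories.
RewriteCond %{REQUEST_URI} ^/(2005|2006|archive)/
# Only redirect when the same file actually exists under /blogger/.
RewriteCond %{DOCUMENT_ROOT}/blogger%{REQUEST_URI} -f
# Send .html requests to the copy under /blogger/ with a permanent redirect.
RewriteRule \.html$ http://www.sindark.com/blogger%{REQUEST_URI} [R=301,L]
```

The file-exists check means a request is only redirected when there is a Blogger page waiting at the new location, so anything else is left for the rest of the site to handle.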
posted by team lowkey at 12:13 PM on May 30, 2006


Response by poster: @team lowkey,

After trying for quite a while to figure this out, I've just deleted the files in the original locations and written an error 404 page that should help people find what they were looking for.

Thanks for the tips, everyone. That htaccess syntax is very confusing.
posted by sindark at 10:33 AM on June 4, 2006

