Using RedirectMatch with URLs containing question marks.
December 31, 2005 11:26 AM

RedirectMatch, mod_rewrite, and question marks, oh my! Please help me redirect to urls with question marks in them.

I've got a web site hosted on an Apache server. I'm not a Unix guy, and I need some syntax help.

I would like to redirect a list of urls to a list of other urls. I want to have a bunch of RedirectMatch statements in my .htaccess file. Unfortunately, the new urls have question marks in them.

What I want is something that would work like this (for about 200 different urls):
RedirectMatch permanent ^/archives/000127.html$ http://www.example.com/?p=129
Googling has taught me that RedirectMatch chokes on the question marks because it interprets them as regex characters (or something like that). This thread makes me think that I can do something with RewriteCond, RewriteRule, or the like, but I don't quite understand how to implement the solution mentioned there:
RewriteCond %{QUERY_STRING} ^topic=(.*)$
RewriteRule ^index\.php$ /topic/%1? [R=301,L]
Can one of you Apache experts provide a snippet of what my .htaccess file would have to include? I don't know if the RewriteRule and RewriteCond lines need to appear only once, before the list of RedirectMatch lines, or if they have to appear before each one. If you could give me an example that would redirect two different urls to two new ones, I could then implement it.

(I apparently cannot use Redirect, I need to use RedirectMatch)
posted by i love cheese to Computers & Internet (9 answers total)
 
Put a \ before any character in a regular expression you want to be treated as a literal character, including \\ for a literal \.
posted by boaz at 11:41 AM on December 31, 2005


I would do this as follows, not sure if it is the best way to do it.

RedirectMatch 301 ^/archives/(.*)\.html$ /?p=$1
posted by riffola at 11:43 AM on December 31, 2005


Also, put something in parentheses in order to use $1 as a backreference to it later, ala
RedirectMatch permanent ^/archives/0*(\d+).html$ http://www.example.com/?p=$1
On preview: or what riffola said.
posted by boaz at 11:44 AM on December 31, 2005


Response by poster: On reviewing my question, I realize I didn't make one point clear. The new url doesn't necessarily have the same number as the old URL (eg, archives/000127.html points to ?p=129).

I have a list of the matching numbers and will be pasting each line into the .htaccess file. So I can't use a variable (which is what I assume the $1 is). What would the line look like with two constants (000127.html and 129) instead?

Thanks!
posted by i love cheese at 12:07 PM on December 31, 2005


Response by poster: Also, it appears that using a backslash doesn't make it a literal question mark.

That is, redirecting to
example.com/?p=127
ends up at
example.com/%3fp=127

and redirecting to
example.com/\?p=127
ends up at
example.com/%5c%3fp=127.

I think this is why others have thought about using rewrites.

Argh!
posted by i love cheese at 12:26 PM on December 31, 2005
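
For the two-constant case asked about above, one option is to skip RedirectMatch and write one RewriteRule per URL, since a ? in a RewriteRule substitution starts the query string instead of being escaped. A minimal sketch, assuming mod_rewrite is enabled, the .htaccess file sits in the document root, and the server allows rewrite overrides there; the second pair (000128 to 131) is a made-up placeholder, not from the poster's list:

```apache
# Needed once, before the first RewriteRule.
RewriteEngine On

# One rule per old URL. In .htaccess the pattern is matched
# against the path WITHOUT its leading slash.
RewriteRule ^archives/000127\.html$ http://www.example.com/?p=129 [R=301,L]

# Hypothetical second pair, for illustration only.
RewriteRule ^archives/000128\.html$ http://www.example.com/?p=131 [R=301,L]
```

No RewriteCond is required here, because nothing in the old URLs' query strings needs to be tested. With about 200 such lines, every request is checked against each pattern in turn until one matches.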


RewriteMap archive-id txt:/path/to/file/id-map.txt
RewriteRule ^/archives/(\d+)\.html$ /?p=${archive-id:$1} [R=301,L]

posted by sbutler at 2:24 PM on December 31, 2005
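
The txt: map type used above reads a plain text file with one key and one value per line, separated by whitespace. A sketch of what id-map.txt might contain is shown as comments below; the 000128 entry is a hypothetical placeholder, and [0-9]+ is swapped in for \d+ on the assumption that some older regex engines do not support \d:

```apache
# Assumed contents of /path/to/file/id-map.txt
# (old archive number, whitespace, new post id):
#
#   000127  129
#   000128  131
#
# Keys are compared as strings, so leading zeros must be kept
# exactly as they appear in the old URLs.

RewriteMap archive-id txt:/path/to/file/id-map.txt
RewriteRule ^/archives/([0-9]+)\.html$ /?p=${archive-id:$1} [R=301,L]
```

The leading slash in the pattern fits the main server configuration, where the rule sees the full URL path.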


Response by poster: sbutler's solution looks like it will work. For some reason I get a 500 error when I add it to my .htaccess file (with it pointing to a valid id-map file).

I don't get an error from the line "RewriteEngine On," so I assume my server supports it.

I'm about ready to throw in the towel. Any more advice would be greatly appreciated, otherwise I'll just replace all the old files with new ones containing META REFRESH tags.

Thanks
posted by i love cheese at 7:10 PM on December 31, 2005


What does your error_log say? What happens if you add these lines:

RewriteLog "/path/to/file/rewrite.log"
RewriteLogLevel 8

(after "RewriteEngine on" and before the first Rewrite directive)
posted by sbutler at 1:30 AM on January 1, 2006


Best answer: Ohh... I didn't see that it was in an .htaccess file.

All the rewrite rules work differently inside .htaccess or a <Directory> section. Basically, it's because the path translation phase has already occurred. The rewrite docs are filled with caveats for this situation (from the introduction):

Unbelievably mod_rewrite provides URL manipulations in per-directory context, i.e., within .htaccess files, although these are reached a very long time after the URLs have been translated to filenames. It has to be this way because .htaccess files live in the filesystem, so processing has already reached this stage. In other words: According to the API phases at this time it is too late for any URL manipulations. To overcome this chicken and egg problem mod_rewrite uses a trick: When you manipulate a URL/filename in per-directory context mod_rewrite first rewrites the filename back to its corresponding URL (which is usually impossible, but see the RewriteBase directive below for the trick to achieve this) and then initiates a new internal sub-request with the new URL. This restarts processing of the API phases.

Again mod_rewrite tries hard to make this complicated step totally transparent to the user, but you should remember here: While URL manipulations in per-server context are really fast and efficient, per-directory rewrites are slow and inefficient due to this chicken and egg problem. But on the other hand this is the only way mod_rewrite can provide (locally restricted) URL manipulations to the average user.


Basically, I think you need something like this:
RewriteEngine on
RewriteBase /askme
RewriteRule ^archives/(\d+)\.html$ ?p=${askme-id:$1|0} [R=301,L]
The RewriteMap directive must be in the server config. It can't exist inside a Directory directive or an .htaccess file. I'd strongly suggest you make RewriteMap work, otherwise you'd be doing a couple hundred pattern matches per request.

posted by sbutler at 1:50 AM on January 1, 2006
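
To make the split between the two files concrete, here is a rough sketch of where each directive would go. It assumes the site lives under /askme, that you (or your host) can edit the main server configuration, and that the map file path is the same placeholder used earlier in the thread:

```apache
# --- Main server config (httpd.conf or a <VirtualHost> block) ---
# RewriteMap is not allowed in .htaccess, so it is declared here.
RewriteMap askme-id txt:/path/to/file/id-map.txt

# --- /askme/.htaccess ---
RewriteEngine on
RewriteBase /askme
# The pattern has no leading slash in per-directory context.
# "|0" is the fallback value used when the key is not in the map.
RewriteRule ^archives/(\d+)\.html$ ?p=${askme-id:$1|0} [R=301,L]
```

If the host does not allow changes to the server configuration, the map cannot be declared, and a list of individual RewriteRule lines in .htaccess is the likely fallback.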

