Friday, February 27, 2009

Subdomain Redirection Using htaccess And mod_rewrite On Apache For Linux Or Unix

Hey there,

Following up on yesterday's post on 301 redirects, which we realize is a tired subject, today we're going to take a look at simple ways you can do regular/subdomain redirection using htaccess and "mod_rewrite" for Apache on Linux or Unix. And, while yesterday's subject matter (and perhaps even today's) may be somewhat generic, we're trying to even the balance here. Now that we're in the 500s on post count, we feel it would be a shame if somebody came here to learn something really cool and then couldn't find some other really basic information. As written previously, if this blog's subject matter were more limited this would be much easier to do. On the bright side, we'll probably never run out of things to write about :)

You can reference the basics of mod_rewrite syntax and rules in the mod_rewrite documentation on apache.org. While that page (or the pages it links to within the same site) obviously covers all the aspects of using Apache's "mod_rewrite," the style may not be suited to everyone. It's great information, but it assumes a level of comfort with the subject matter that you may or may not already have.

In our few examples today (which won't cover much but should be explained with a good amount of detail), we're going to use a file named ".htaccess" (the dot at the beginning of the name is important!) placed in our web server's root folder (or base folder, etc) which contains the "mod_rewrite" rules. Our first example implements a 301 redirect. Strictly speaking, the Redirect directive that does the work here belongs to "mod_alias" rather than "mod_rewrite", but it gives us a baseline to compare the rewrite rules against. To implement it, just place the following content in your .htaccess file (assuming, again, that we want visitors to www.myxyz.com to be redirected to www.myotherxyz.com):

Options +FollowSymLinks
RewriteEngine On
RewriteBase /
Redirect 301 / http://www.myotherxyz.com/


Looks familiar, yeah? We actually covered that yesterday, as it's one of the most basic redirects you can do. We just added the "standard recommended" starting lines for your .htaccess file. Those top three lines only matter once you start writing actual rewrite rules (a plain Redirect works without them), which is why we didn't include them in yesterday's example.
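
For comparison, here's a rough sketch of how the same site-wide 301 might be written as an actual "mod_rewrite" rule. It assumes the file sits in the web root of www.myxyz.com and that you want the requested path carried over to the new host. Treat it as one way to do it rather than the only way (the individual pieces of the rule are picked apart further down):

```apache
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
# (.*) captures whatever path was requested; $1 re-uses it on the new host
RewriteRule ^(.*)$ http://www.myotherxyz.com/$1 [R=301,L]
```

With this in place, a request for /about/contact.htm (a made-up path) should be sent on to http://www.myotherxyz.com/about/contact.htm.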

Here's another example that shows you how to use "mod_rewrite"'s regular expression functionality to redirect from an .htm page on one web server to the matching .html page on another web server:

Options +FollowSymLinks
RewriteEngine On
RewriteBase /
RewriteRule ^(.+)\.htm$ http://www.myotherxyz.com/$1.html [R,NC]


This example doesn't do all that much more than the previous, but it "is" a lot more flexible. We'll walk through the RewriteRule, to see what it's doing (and, remember, all of the nitty-gritty can be found on Apache's mod_rewrite page). The line itself, if you're familiar with basic regular-expression syntax, isn't too hard to decipher. The characters that are special in the line are the following:

^ <-- The caret indicates that the match is anchoring to the start of the request.
. <-- The dot character matches any single character: letters, digits, spaces, tabs, goofy characters, etc. It matches "anything"
\ <-- The backslash character "escapes" special match characters. In this case, we're using it to represent a "real" dot, rather than the regular-expression dot described above.
+ <-- The plus symbol indicates that the character preceding it must exist one or more times if the match is to be considered a success
$ <-- The dollar sign is the opposite of the caret. It indicates that the match is anchored at the end of the request
() <-- Anything within parentheses (on the left-hand-side, or LHS, of the expression) is considered to be a "captured" part of the match. It can be referenced on the right hand side (or RHS) of the rewrite rule using the dollar sign. This sign, as shown above, means something completely different on the RHS of the equation.
$1, $2, $3, etc <-- the dollar-sign-plus-number on the RHS of the expression represents the content of an LHS match (within the parentheses, as written above). The first parentheses match on the LHS is represented by $1, the second by $2, etc.
[] <-- Within the match rule itself, the brackets indicate a range, or "class" (a term that doesn't seem correct, but is used in the mod_rewrite terminology) of characters. For instance [abcd] would match the letter a, b, c or d. At the end of the mod_rewrite rule (as its final component) the characters within it hold special significance and "do not" represent a range. You can use more than one and just separate them with commas. In the example above they say that:
R <-- We're doing a "R"edirect. With no number given, Apache sends a 302 (temporary) redirect. You can add an equals sign and the proper number if you want to be specific. So, a 301 redirect would look like [R=301] and make our expression's end look like [R=301,NC]
NC <-- Even though N and C both have special meanings of their own (in the final part of the rewrite rule), when used together as a single entity they mean that the matching rule should be "case insensitive". NC equals "No Case", if that helps ;)

And that's that one line. Explained with only 10 or 12 more ;) Basically, it matches any web server request for a page ending in .htm (a narrower match than our first example, which caught everything) and redirects it, like so:

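ORIGINAL REQUEST: http://www.myxyz.com/iNdex.htm (Sent to the web server as "/iNdex.htm")
MATCHING RULES APPLIED: ^(iNdex)\.htm$ ==> http://www.myotherxyz.com/iNdex.html

Two things to note here. First, NC only affects the matching, so the captured text keeps whatever case the visitor typed. Second, inside an .htaccess file Apache strips the leading slash before the rule sees the request, so the () parentheses capture "iNdex" rather than "/iNdex" and no double slash ends up on the RHS. If you put the same rule in the main server configuration instead, the leading slash is part of what gets matched, and you would change "^(.+)\.htm$" into "^/(.+)\.htm$" to keep it out of the () parentheses match and its extrapolated value ($1) on the RHS.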
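
If you'd like one rule to catch both .htm and .html requests, a small variation should do it. This is only a sketch: the question mark (not covered in the list above) makes the character before it optional, and we've added R=301 and L on the assumption that the move is permanent and that no further rules need to run:

```apache
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
# "l?" means the trailing "l" may or may not be there, so .htm and .html both match
RewriteRule ^(.+)\.html?$ http://www.myotherxyz.com/$1.html [R=301,NC,L]
```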

The standard final argument indicators are listed on Apache's mod_rewrite page and below:

[R] Redirect. You can add an =301 or =302, etc, to change the type.
[F] Forces the url to be forbidden. 403 header
[G] Forces the url to be gone. 410 header
[L] Last rule. (You should use this on all your rules that don't link together)
[N] Next round. Rerun the rules again from the start
[C] Chains a rewrite rule together with the next rule.
[T] use T=MIME-type to force the file to be a mime type
[NS] No subrequest. Skips the rule if the request is an internal sub request
[NC] Makes the rule case insensitive
[QSA] Query String Append. Use to add to an existing query string
[NE] Turns off the normal escapes that are the default in the rewrite rule
[PT] Pass through to the handler (together with mod alias)
[S] Skip the next rule. S=2 skips the next 2 rules, etc
[E] E=var:value sets an environment variable that can be called by other rules
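
To make a few of those flags a little more concrete, here's a small sketch. The paths (private/, old-page.html, item/ and item.php) are made up for illustration, and a dash in place of the target means "don't change the URL":

```apache
RewriteEngine On
RewriteBase /
# Anything under private/ gets a 403 Forbidden
RewriteRule ^private/ - [F]
# A page that has been retired gets a 410 Gone
RewriteRule ^old-page\.html$ - [G]
# A request for item/42 with color=red in its query string is served by item.php
# with id=42; QSA keeps the visitor's color=red alongside it
RewriteRule ^item/([0-9]+)$ item.php?id=$1 [QSA,L]
```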


In our final example, if you want to redirect all sub-domains of a domain to another site (e.g. redirect home.myxyz.com to www.myotherxyz.com), you could do it like this:

Options +FollowSymLinks
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^([^.]+)\.myxyz\.com$ [NC]
RewriteRule ^ http://www.myotherxyz.com/ [R,L]


The %{HTTP_HOST} variable holds the value of the Host header that the browser sends with every request. A RewriteRule pattern is only ever matched against the requested path, so the host name has to be tested with a RewriteCond line placed directly above the rule it guards. The L in the final argument means that, if this match is made, it will be considered the last match and no more rules in your .htaccess file will be processed. In our small example this doesn't really make a difference, but it can help with flow control if you have multiple rules and want to quit matching if you match this condition.
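
If you'd rather leave the www host alone and keep track of which sub-domain the visitor came from, RewriteCond lines can be stacked in front of a rule. This is a sketch under a made-up assumption, namely that each old sub-domain gets its own folder on the new site. %1 refers to the parentheses match from the last RewriteCond, much like $1 does for the RewriteRule:

```apache
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
# Skip requests that are already for www.myxyz.com
RewriteCond %{HTTP_HOST} !^www\. [NC]
# Capture the sub-domain name (e.g. "home") as %1
RewriteCond %{HTTP_HOST} ^([^.]+)\.myxyz\.com$ [NC]
# Send home.myxyz.com/page.html to www.myotherxyz.com/home/page.html
RewriteRule ^(.*)$ http://www.myotherxyz.com/%1/$1 [R=301,L]
```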

Hopefully today's quick run-down has been helpful. Let us know - too much information, not enough, not-enough-of-some-but-too-much-of-another? We're always interested in hearing your thoughts :)

Cheers,

Mike






