Lighttpd

Rewriting and Redirecting Requests

URLs are part of the user interface of every website, be it a full-blown application or just a bunch of static pages. Users sometimes use URLs to navigate, say, by cutting a suffix to "move up a directory". So, we want to present them with a clean, structured tree. Unfortunately, reality is usually not that nice. We may have some web frameworks stitched together that require their own path names. We may want to hide from the user that she is calling a script. Whatever the reason, mod_rewrite and mod_redirect are here to help us.

The difference between rewrite and redirect is that a rewrite happens directly in the server, while redirecting a request is done by sending a header to the user telling her where the page really is. This difference is important when deciding whether to rewrite or to redirect. If we have a kind of shortcut or a second domain name and want to direct the user to the "correct" URL, we redirect. Otherwise, we rewrite, for example to present a coherent URL tree to the user when in reality the tree is created by a parameterized CGI script or distributed across multiple directories.
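The contrast can be sketched in configuration form. This is only an illustration: the script /catalog.cgi and the second domain shop.example.net are hypothetical names that do not come from the text, and the two options used are the ones this section goes on to introduce.

```lighttpd
# Rewrite: the browser keeps showing /catalog/books, while the server
# internally handles the request as /catalog.cgi?section=books.
# (/catalog.cgi is a hypothetical script.)
url.rewrite-once = ( "^/catalog/([^/?]+)$" => "/catalog.cgi?section=$1" )

# Redirect: the server answers with a Location header, and the browser
# sends a second request to the target URL.
# (shop.example.net is a hypothetical second domain name.)
$HTTP["host"] == "shop.example.net" {
    url.redirect = ( "^/(.*)$" => "http://www.example.com/shop/$1" )
}
```

In the first case the user never learns that a script is involved; in the second, the address bar changes to the "correct" URL.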

Both modules share the same conventions. So we'll discuss them in one go. Before using mod_rewrite and mod_redirect, we need to tell Lighttpd to load them:

server.modules = ("mod_rewrite", "mod_redirect", ...)

Remember that the order in which modules are loaded matters: mod_rewrite and mod_redirect change the request URL, so they should come first. Otherwise, another module might finish handling the request before these two have had a chance to rewrite or redirect anything.

mod_rewrite gives us the options url.rewrite-once and url.rewrite-repeat, while mod_redirect provides the url.redirect option. It would make no sense to have url.redirect-repeat, as this would require keeping track of who was redirected to where. Also, if a redirect points the user back to the URL that triggered it, an infinite loop would occur. All browsers I have tested guard against such infinite loops and present an error to the user. Naturally, we have to be careful with url.rewrite-repeat, as the rewrite happens inside the server and will stop only after a hundred iterations.
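The difference between the two rewrite options might look like this in practice. This is a sketch with made-up paths (/old, /new, /a, /b, /c), and the rewrite targets are written as server-local paths:

```lighttpd
# url.rewrite-once: as soon as one rule matches, rewriting stops.
url.rewrite-once = ( "^/old/(.*)$" => "/new/$1" )

# url.rewrite-repeat: after a match, the rules are applied again to the
# rewritten URL, so rules can chain:
#   /a/page  ->  /b/page  ->  /c/page
url.rewrite-repeat = (
    "^/a/(.*)$" => "/b/$1",
    "^/b/(.*)$" => "/c/$1"
)

# Careful: a rule whose target matches its own pattern never settles
# and only ends when the server gives up.
# url.rewrite-repeat = ( "^/(.*)$" => "/x/$1" )
```

When in doubt, url.rewrite-once is the safer choice, since it cannot loop.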

Each of these options takes a list of regex => target pairs. For url.redirect, the target should be a fully qualified URL, for example: "http://www.example.com/mysite/", not "/mysite/", because it is sent to the browser in a header. For url.rewrite-once and url.rewrite-repeat, the target is a path (with an optional query string) on the same server, since a rewrite never leaves Lighttpd.

Remember captures from regular expressions? This is where they shine. In the target, the expressions $1, $2, ..., $9 are expanded to the respective captures of the regex. As in:

url.rewrite-once = ("^/mysite/([^/]*)/(.*)" => "/mysite.php?x=$1&y=$2")

If the user browses http://www.example.com/mysite/one/two, the request is handled internally as /mysite.php?x=one&y=two, while the browser still shows the original URL.

Even better, we can put our rewrite and redirect lists inside a host selector and use its captures through %1, %2, ..., %9. For example, if we want to redirect every URL whose host name does not start with www to the same URL with "www." prefixed, we just add to our configuration:

$HTTP["host"] =~ "^www\." {
    # we can do something for the correct URL here.
} else $HTTP["host"] =~ "^(.*)$" {
    # we use the else here so we can capture the whole hostname.
    url.redirect = ( "^/(.*)$" => "http://www.%1/$1" )
}

Now, any call to http://example.com will be redirected to http://www.example.com.
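The same host-capture technique works in the opposite direction. As a sketch, assuming we preferred the bare domain over the www form (a choice the text does not make), the capture would hold everything after the "www." prefix:

```lighttpd
# Hypothetical variant: strip a leading "www." instead of adding it.
# %1 is the host name without the prefix, $1 is the path without its
# leading slash.
$HTTP["host"] =~ "^www\.(.*)$" {
    url.redirect = ( "^/(.*)$" => "http://%1/$1" )
}
```

Only one of the two directions should be active at a time; with both in place, each request would bounce between the two host names until the browser gives up.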

This is the beauty of Lighttpd—a set of simple, but extensible components cleverly integrated, so we do not have to learn more complicated syntax.

Of course, if you want to have a quote, dollar sign, percent sign, or backslash verbatim in your URL, you have to prefix it with a backslash to escape it. Percent signs are quite usual for URL-encoded characters (that is a percent sign followed by a hex code). For example:

url.rewrite-once = (
    "^(.*)$" => "/dispatch.cgi?test=\"\%5F\"&page=$1"
)

You could use this method to extend the flexibility of your virtual hosting method to any possible subdomain. The trick is to rewrite the host name to be part of the document path:

$HTTP["host"] =~ "^(.*)\.ourdomain\.net$" {
    url.rewrite-once = ( "^/(.*)$" => "/%1/$1" )
}

For each subdomain you want, add a directory named exactly like the subdomain. In our example, a subdirectory named "somewhere" in the document root would be mapped to http://somewhere.ourdomain.net. This method has a small drawback: the error returned for undefined subdomains is a file-not-found instead of a server-not-found. A better method will be discussed in the next chapter.

Now, we can serve static pages even with virtual hosting. Our configuration file is likely to become a little bloated. Luckily, Lighttpd gives us some features to manage the complexity: including other files, defining variables to give names to values that are used often, and including the output of an executable file (usually a shell script) in the configuration.
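As a first taste, here is a minimal sketch of these three features together. The variable name, the directory, the file name rewrite.conf, and the script path are all assumptions made up for illustration:

```lighttpd
# A user-defined variable (the name and the path are hypothetical).
var.basedir = "/var/www/ourdomain"

# Variables can be combined with strings using "+".
server.document-root = var.basedir + "/htdocs"

# Pull in another configuration file, for example one that holds
# all our rewrite and redirect rules (hypothetical file name).
include "rewrite.conf"

# Run a program and read its output as configuration text
# (hypothetical script that prints configuration lines).
include_shell "/usr/local/bin/print-vhosts.sh"
```

Moving the rule lists into their own file keeps the main configuration short, and a variable such as var.basedir means a path has to be changed in only one place.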