Let's start with an example from an imaginary configuration file...
<VirtualHost *:80>
CustomLog /var/logs/httpd/vaccess.log vcommon
UseCanonicalName Off
VirtualDocumentRoot /var/www/html/%-2/
</VirtualHost>
You'll first notice the "UseCanonicalName Off" core directive. This is mandatory for our purposes, as it tells Apache to use the host name requested by the client rather than the value set in a ServerName directive (or one it works out for itself if that directive is absent). You'll also notice that all our sites' web documents must be in a sub-directory of /var/www/html.
More important is the strange %-2 in the path. This is a vhost interpolation that lets us extract a part of the host name and use it when building the document path. The parts of the host name are delimited by the '.' characters it contains. Thus, www.fekore.com has 3 parts. '%-2' means "extract the second-to-last part of the host name". Again using our example, suppose we have a request for "http://www.fekore.com/hello.htm"... Since www.fekore.com resolves to an IP address our server listens on, Apache would use our VirtualHost declaration and would replace '%-2' with 'fekore', as the latter is the second-to-last part of the host name. The document to return would thus be found at: /var/www/html/fekore/hello.htm.
While the above example would serve you well for dot-com and dot-net type domains, it would not fare well for country domains. Requests for http://abc.co.in and http://xyz.co.uk would both translate to a path of /var/www/html/co/, which is obviously not what we intended. There's actually a whole slew of interpolation specifiers that we can use, so we're not stuck. Here are some of them, using http://www.test.fekore.com as an example.
%0 : use the whole name [www.test.fekore.com]
%1 : use the first part [www]
%2 : use the second part [test]
%3 : use the third part [fekore]
%-1 : use the last part [com]
%-3 : use the third to last part [test]
%2+ : use the second and all subsequent parts [test.fekore.com]
%3+ : use the third and all subsequent parts [fekore.com]
...etc...
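To make the table above concrete, here is a small Python sketch that mimics how those specifiers pick parts out of a host name. It is not Apache's code: the function name vhost_part is made up for illustration, and the underscore fallback for a part that doesn't exist reflects my reading of the mod_vhost_alias documentation, so treat that detail as an assumption.

```python
def vhost_part(host: str, spec: str) -> str:
    """Emulate a spec such as "0", "2", "-2" or "3+" (the text after the '%')."""
    host = host.lower()              # host names are compared case-insensitively
    parts = host.split(".")
    plus = spec.endswith("+")        # trailing '+' means "and the rest"
    n = int(spec.rstrip("+"))
    if n == 0:
        return host                  # %0 is the whole name
    if abs(n) > len(parts):
        return "_"                   # assumed fallback when the part is missing
    if n > 0:
        # Count from the left; with '+', keep every part that follows too.
        return ".".join(parts[n - 1:]) if plus else parts[n - 1]
    # Count from the right; with '+', keep every part that precedes too.
    idx = len(parts) + n
    return ".".join(parts[:idx + 1]) if plus else parts[idx]


if __name__ == "__main__":
    for spec in ("0", "1", "2", "3", "-1", "-3", "2+", "3+"):
        print("%" + spec, "->", vhost_part("www.test.fekore.com", spec))
    # The country-domain problem: %-2 picks 'co', %-3 picks the site name.
    print(vhost_part("abc.co.in", "-2"), vhost_part("abc.co.in", "-3"))
```

Running it against abc.co.in shows why %-2 collapses every .co.in site into the same directory, while %-3 picks out the name we actually want.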
You can also go nuts by extracting a part from a host name, then extracting a part from that part. In the latter case, the part would be a character or sequence of characters. We do this by using the format '%N.M', where %N is our first extract and 'M' is the second. The '.' is mandatory, but you omit the '%' in the second. For example...
VirtualDocumentRoot /var/www/html/%-2.2/
...if we put the URL http://www.test.fekore.com through this directive, we get /var/www/html/e/. That's because %-2 extracted 'fekore', and the '2' gave us its second letter, 'e'. I never needed this capability, but maybe you'll find some use for it. If what truly interests you is extreme URL rewriting, there's another module that lets you slice and dice URLs any which way: mod_rewrite (loaded as "rewrite_module").
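If the two-step '%N.M' extraction is hard to picture, the following Python sketch walks through it for single characters only. The helper name vhost_subpart is hypothetical, and the underscore returned for an out-of-range N or M is an assumption based on my reading of the module documentation rather than something shown above.

```python
def vhost_subpart(host: str, n: int, m: int) -> str:
    """Emulate %N.M: take part N of the host name, then character M of that part."""
    host = host.lower()
    parts = host.split(".")
    if n == 0:
        part = host                       # N = 0 selects the whole name
    elif abs(n) > len(parts):
        return "_"                        # assumed fallback: no such part
    else:
        part = parts[n - 1] if n > 0 else parts[n]
    if m == 0:
        return part                       # M = 0 keeps the whole part
    if abs(m) > len(part):
        return "_"                        # assumed fallback: part is too short
    return part[m - 1] if m > 0 else part[m]


if __name__ == "__main__":
    # %-2.2 on www.test.fekore.com: part 'fekore', second letter 'e'
    print(vhost_subpart("www.test.fekore.com", -2, 2))
```

One place this kind of extraction might help is spreading a large number of sites across sub-directories keyed on a letter of the site name, though I have not needed it myself.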
In order to smoothly rewrite paths for our fekore.co.in type host names that our first example fumbled with, we need to assign them a different IP in a separate VirtualHost declaration. Our vhost directives in that one would look like this...
VirtualDocumentRoot /var/www/html/%-3/
Thus, we'd extract 'fekore' rather than 'co'. I believe in keeping things simple, so I avoid using anything but the %- interpolations, as the rightmost part of any host name is reliable whereas the leftmost isn't.
One drawback to using mod_vhost_alias is that you can't have individual log files for each site sharing the IP. Instead, all sites log to the same files. We compensate for that by creating a special logging definition that includes a field recording the host name of the request. Thus, for example, the instruction...
LogFormat "%V %h %l %u %t \"%r\" %s %b" vcommon
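...appears somewhere in our configuration file. Note that the quotes around %r must be escaped with backslashes inside the format string. The "%V" tells Apache to log the server name according to the UseCanonicalName setting; with UseCanonicalName Off, that is the host name requested by the client [See: mod_log_config].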
With the LogFormat we've just defined, your log might look something like this...
www.rtzxfgh.com 172.128.55.43 - - [03/May/2001:10:21:48 -0400] "GET /bogus.htm HTTP/1.1" 200 0
www.ouwqagxh.com 172.128.55.44 - - [03/May/2001:10:21:49 -0400] "GET / HTTP/1.1" 200 0
To generate individual traffic reports for each site sharing the log file, you need a statistics tool that can parse such a log and separate the entries by the host name in the first field.
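As a rough idea of what such a tool does with the shared log, here is a minimal Python sketch that counts requests per site. It assumes the vcommon layout, where the host name is the first whitespace-separated field, and that Apache writes each entry on a single line. The function name hits_per_host is made up, and the sample entries are the ones from the log excerpt.

```python
from collections import Counter


def hits_per_host(lines):
    """Count log entries per virtual host, keyed on the first field (%V)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:                        # skip blank lines
            counts[fields[0].lower()] += 1
    return counts


sample = [
    'www.rtzxfgh.com 172.128.55.43 - - [03/May/2001:10:21:48 -0400] "GET /bogus.htm HTTP/1.1" 200 0',
    'www.ouwqagxh.com 172.128.55.44 - - [03/May/2001:10:21:49 -0400] "GET / HTTP/1.1" 200 0',
]

if __name__ == "__main__":
    for host, hits in sorted(hits_per_host(sample).items()):
        print(host, hits)
```

The same first-field split could just as easily write each entry to a per-site file, which would let an ordinary log analyzer run on each site separately.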
Some useful links you can follow to get further knowledge on the subjects discussed today:
mod_vhost_alias
UseCanonicalName
mod_log_config