Apache 2 Advanced Configuration on Unix-Like Systems

In a previous tutorial, we took a look at some of the most basic, but important, Apache configuration directives - what they are for, and how to edit them to fit our needs. For a very basic website (perhaps one with just a few static HTML pages), those simple directives might be all you need to know. Chances are, however, you need a more complex website; today we will look at some advanced directives and configuration settings.

Apache's behavior is controlled through settings and directives applied in plain-text configuration files that usually end with the extension ".conf".

The most important .conf file is httpd.conf, although that depends on your particular installation and Linux distribution (for example, it might be called apache2.conf instead). These configuration files generally apply to the server as a whole, even if the server is hosting multiple websites.

Per-directory configuration files are called ".htaccess" files by default, and are located within the web server's public document directory. These allow certain directives to be applied to certain directories and their sub-directories, rather than the entire server and all its hosted websites. Though not advised, you can change the name of the .htaccess file to something else using the AccessFileName directive, which can be set in httpd.conf:
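
A minimal sketch of that setting follows. The replacement name .config is only an illustrative assumption; any file name would work the same way.

```apache
# Hypothetical alternative name for per-directory files (the default is .htaccess)
AccessFileName .config
```

Keep in mind that any rule written to hide files named .htaccess from web clients will not cover a renamed file, which is one reason the rename is not advised.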

Please note that htaccess is not an extension; it is the file's name. On UNIX-like operating systems, the dot (.) preceding a file name signifies that the file is hidden.

Considering their location in the file system, not all directives can be applied in .htaccess files, as some might simply not be valid. Every line in a configuration file must begin with one of the following:

  • a #, indicating a comment
  • a valid directive
  • a space
  • an empty line

If a line does not begin with one of the above options, Apache will issue an error message instead of starting its HTTP service, so it is important to ensure that the server configuration is valid. If a directive spans multiple lines, end each line except the last with a backslash (\), which must be the final character on that line.
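
The rules above can be sketched in a short fragment. The e-mail address and file names are placeholders chosen for illustration, not values from any real installation.

```apache
# A comment sits on a line of its own.
ServerAdmin webmaster@example.com

# One directive continued across two lines. The backslash is the
# last character on the first line, with nothing after it.
DirectoryIndex index.html index.htm \
    index.php
```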


Maintaining Flexibility

While making your own changes to the default configuration files, it is best to break those changes out into external files (with meaningful names, of course) and include them in the main configuration file using the Include directive. If separating your settings from the server's default settings is not possible, at least make a habit of commenting out old settings before introducing new ones. This makes it easier to roll back to an earlier version if the need arises, and it also lets you upgrade to a newer version of the httpd.conf file with minimal hassle: copy the new version in place of the old, and re-insert your includes at the end of the new file (or at least re-apply your old changes after commenting out the default ones in the new file). For example:
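
The listing below is a sketch of the three forms of Include. All paths are hypothetical; relative paths are resolved against the ServerRoot.

```apache
# A single file, by name
Include conf/extra/my-site.conf

# Every file in a directory and its sub-directories
Include conf/custom.d/

# Only the files that match a wildcard pattern
Include conf/sites-enabled/*.conf
```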

As you can see from the example above, you can include a specific file by name, a directory (and all files and sub-directories therein), or multiple files by using wildcards.


Advanced Setup

In my previous tutorial on Apache, you learned some basic directives that control Apache's behavior. In this tutorial, we'll look at a few advanced directives, starting with <Directory>.

The <Directory> Directive

The <Directory> directive allows you to specify settings and directives to apply to directories and sub-directories. This lets you do things such as limiting access to certain directories and files, or turning certain options on or off for particular directories.

The <Directory> tags take a path and enclose a block of options to be applied to that directory and its sub-directories. Here is an example:
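
The block below is a reconstruction based on the description that follows, so treat it as a sketch. Order, Deny, and Allow are the Apache 2.2-style access directives used throughout this tutorial; on Apache 2.4 they are provided by mod_access_compat, and the native equivalent of the access rule here is Require all denied.

```apache
<Directory />
    # No optional server features are enabled for this path
    Options None
    # .htaccess files may not override anything
    AllowOverride None
    # Evaluate Deny rules first, then Allow rules
    Order Deny,Allow
    Deny from all
</Directory>
```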

The opening <Directory> tag in this example specifies a path of /, which means the root directory and all its sub-directories and files. The settings defined inside the <Directory> tags apply to the / path (essentially everything in the root).

The Options directive declares which server features are valid for the specified directory. In this example, no options are valid for the / path. But you could specify any number of options, like allowing symbolic links, allowing the execution of CGI scripts, allowing server-side includes, and many more.

The AllowOverride directive tells the server which directives an .htaccess file is permitted to override. When the server finds an .htaccess file, it needs to know which directives declared in that file can override earlier configuration directives. If AllowOverride is set to None, as shown in the example above, then no directives are overridden and the .htaccess files are completely ignored.

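If the AllowOverride directive is set to All, then any directive re-declared in the .htaccess file will override earlier configuration directives. AllowOverride can also take specific directive types (such as AuthConfig, FileInfo, Indexes, Limit, and Options), in which case only directives of those types can be overridden and directives of other types are not permitted in .htaccess files.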
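
As a sketch of the selective form, the fragment below allows only two directive types. The path is the example document root used later in this tutorial, and the choice of types is an assumption made for illustration.

```apache
<Directory /usr/local/apache2/htdocs>
    # .htaccess files may change authentication and directory-index
    # settings, and nothing else
    AllowOverride AuthConfig Indexes
</Directory>
```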

Deny and Allow control access to the directory specified within the opening tag, and are prioritized through the Order directive.

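In the above example, Order Deny,Allow means that Deny rules are evaluated before Allow rules, so an Allow rule can override a Deny rule, and a request that matches neither is allowed. Combined with Deny from all, the effect is that all connecting hosts or IP addresses are denied access to the root directory, except those declared as good hosts. Order Allow,Deny, on the other hand, evaluates Allow rules first and lets Deny rules override them, which suits allowing all hosts and IPs access except those declared as bad or black-listed. Note that the two keywords are separated by a comma only; a space between them is a syntax error.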

The Deny from all setting declares that access needs to be denied from all hosts. Since it is not followed by a whitelist, no hosts or IPs have access to the root directory (and this is how it should be for security reasons). For demonstration purposes ONLY, the example below shows how to deny access from all hosts except www.goodhost1.com and www.goodhost2.com:
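
A sketch of that whitelist, using the two host names from the text:

```apache
<Directory />
    Options None
    AllowOverride None
    Order Deny,Allow
    Deny from all
    # Both good hosts on a single Allow line
    Allow from www.goodhost1.com www.goodhost2.com
</Directory>
```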

Note that the Order directive defines the precedence of the rules. So, first, we deny access from all hosts, and then we allow access only to www.goodhost1.com and www.goodhost2.com. Alternatively, you can specify the two hosts on separate lines like this:
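
The same rule, sketched with one Allow line per host:

```apache
<Directory />
    Options None
    AllowOverride None
    Order Deny,Allow
    Deny from all
    Allow from www.goodhost1.com
    Allow from www.goodhost2.com
</Directory>
```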

If you want to allow access to all sub-domains of the host goodhost.com (eg: sub1.goodhost.com, sub2.goodhost.com, and sub3.goodhost.com), you can specify a partial domain name to grant access to instead of listing all sub-domains to be allowed. The following example shows you how:
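
A minimal sketch of the partial-domain form. Apache matches whole name components only, so the rule covers sub1.goodhost.com but would not cover a name such as mygoodhost.com.

```apache
<Directory />
    Order Deny,Allow
    Deny from all
    # Matches goodhost.com and any host name ending in .goodhost.com
    Allow from goodhost.com
</Directory>
```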

Additionally, if you want to deny access from all except a particular IP address on the local network, the following example illustrates how you could do that:
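
In the sketch below, 192.168.1.100 is a hypothetical local address chosen purely for illustration; substitute the address of the machine you want to let in.

```apache
<Directory />
    Order Deny,Allow
    Deny from all
    # Hypothetical address on the local network
    Allow from 192.168.1.100
</Directory>
```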

If you wanted to allow access to all hosts except for a few bad hosts, you could do something like this:
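
A sketch of such a blacklist follows. The three host names are invented for illustration, and the path is the example document root used later in this tutorial.

```apache
<Directory /usr/local/apache2/htdocs>
    # Evaluate Allow rules first, then let Deny rules override them
    Order Allow,Deny
    Allow from all
    # Two full host names and one partial domain name (all hypothetical)
    Deny from www.badbot1.com www.badbot2.com spamhost.com
</Directory>
```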

The above example opens the public directory to all connecting clients except for two bad bots and one spam host. Note the use of either the whole domain name or the partial domain name in the host blacklist.

Similar to <Directory> is <DirectoryMatch>. It also encloses a group of directives that apply to a directory and its sub-directories, but instead of a literal path it takes a regular expression as its argument, so one block can cover every directory whose path matches.
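
As a sketch, the block below would cover directories named sub-dir1, sub-dir2, and so on under the example document root. Both the path and the naming pattern are assumptions.

```apache
# Matches /usr/local/apache2/htdocs/sub-dir followed by one or more digits
<DirectoryMatch "^/usr/local/apache2/htdocs/sub-dir[0-9]+">
    Options None
</DirectoryMatch>
```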

Indexes and DirectoryIndex Directives

When you visit a website, you often just type in the domain name without specifying a page (eg: www.example.com as opposed to www.example.com/index.html). This works because the server responds with a default page.

This functionality is governed by the DirectoryIndex directive on the web server; it tells the server the default page to respond with if no file is specified in the URL.

DirectoryIndex can take multiple values (ie: more than one file name), so that when the server encounters a request that doesn't specify a particular file, it tries the values in the order listed until it finds a file of that name in the requested directory.
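
A one-line sketch with three candidate file names (the names themselves are just common choices, not requirements):

```apache
# Tried left to right; the first file that exists is served
DirectoryIndex index.html index.htm index.php
```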

If a client asks for a directory that doesn't contain any of the default files listed by the DirectoryIndex directive, the server can respond with a directory listing (a list of all files and sub-directories contained within that folder) when the Indexes option is enabled. This could potentially be a security risk, as the files and file system structure of the requested directory are exposed. You can avoid this behavior by turning off the Indexes option at the DocumentRoot level (which is the root directory of the server's htdocs or public files). If you wish to display folder listings for any sub-directory under the DocumentRoot, you can turn on the Indexes option for that particular folder. The following listing demonstrates this:
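
A sketch using the two paths named in the explanation that follows:

```apache
# No directory listings anywhere under the public HTML folder
<Directory /usr/local/apache2/htdocs>
    Options -Indexes
</Directory>

# Listings switched back on for one sub-directory only
<Directory /usr/local/apache2/htdocs/sub-dir1>
    Options +Indexes
</Directory>
```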

The above example turns off directory listings under the main public HTML folder (in this case /usr/local/apache2/htdocs/) and all sub-directories, and then turns it back on only for the /usr/local/apache2/htdocs/sub-dir1/ sub-directory.

The <Files> Directive

While the <Directory> directive specifies the permissions or restrictions to be applied to a specific directory, the <Files> directive controls the restrictions and permissions for one or multiple files (wildcards can be used within a file name to match multiple files). Take a look at this listing:
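
The following sketch uses a wildcard so that it covers .htaccess along with other files whose names begin with .ht; the exact pattern is an assumption.

```apache
# Matches .htaccess, .htpasswd, and any other file name starting with .ht
<Files ".ht*">
    Order Allow,Deny
    Deny from all
</Files>
```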

This example uses the <Files> directive to prevent .htaccess files from being viewed by web clients. If you try to retrieve the .htaccess file by requesting something like http://www.example.com/.htaccess, the server responds with a 403 Forbidden error.

Similarly, the <FilesMatch> directive limits the scope of the enclosed directives by file name, just as the <Files> directive does. However, it accepts a regular expression as an argument.
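
As a sketch, the regular expression below blocks a few leftover file types. The list of extensions is purely illustrative.

```apache
# Matches file names ending in .bak, .old, or .log
<FilesMatch "\.(bak|old|log)$">
    Order Allow,Deny
    Deny from all
</FilesMatch>
```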

The <Location> Directive

The <Location> directive works similarly to the <Directory> directive, except that it takes a URL path as an argument, as opposed to a path to a local physical directory in the file system. This means that <Location> can be used to control content that lives outside the file system, such as pages generated by the server itself.

It is highly advised that you don't use the <Location> directive to restrict access to local file system locations, because several different URLs may map to the same file, which allows the restriction to be bypassed.
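
A typical use is a URL that has no file behind it. The sketch below assumes mod_status is loaded and restricts its status page to requests from the server itself; the URL path is a conventional choice, not a requirement.

```apache
<Location /server-status>
    # Content generated by mod_status, not read from disk
    SetHandler server-status
    Order Deny,Allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```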

The <LocationMatch> directive also limits the scope of the enclosed directives by URL. Like the other "Match" directives, it accepts a regular expression as an argument.
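
A brief sketch with hypothetical URL paths:

```apache
# Matches URL paths beginning with /extra/data or /special/data
<LocationMatch "^/(extra|special)/data">
    Order Deny,Allow
    Deny from all
</LocationMatch>
```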

The <Limit> and <LimitExcept> Directives

The <Limit> directive restricts the access controls it encloses to the HTTP methods it names (eg: GET, POST, etc), leaving all other methods unaffected. In the example below, the <Limit> directive denies the use of the POST, PUT, and DELETE methods to all client requests, except for those originating from 50.57.77.153.
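
A sketch of that rule. <Limit> is only valid inside a directory context or an .htaccess file, so it is wrapped here in a <Directory> block whose path is an assumption.

```apache
<Directory /usr/local/apache2/htdocs>
    <Limit POST PUT DELETE>
        Order Deny,Allow
        Deny from all
        Allow from 50.57.77.153
    </Limit>
</Directory>
```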

The <LimitExcept> directive provides the opposite functionality. It still controls what HTTP methods are allowed, but it does so in an exclusionary manner. In the example below, <LimitExcept> denies access to all client requests using any HTTP methods other than GET and POST.
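
A sketch of the inverse form, again wrapped in a <Directory> block with an assumed path:

```apache
<Directory /usr/local/apache2/htdocs>
    # Applies to every method that is NOT listed here
    <LimitExcept GET POST>
        Order Deny,Allow
        Deny from all
    </LimitExcept>
</Directory>
```

Because <LimitExcept> also covers methods you did not think to list, it is generally the safer of the two when the goal is to lock a directory down.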


Configuration Structure and Directive Precedence

Apache's power comes from the ability to extend the server's capabilities through modules written by other programmers. As such, directives may be defined either in the Apache core code or in installed modules. In order to prioritize a directive's effect and scope, Apache divides its configuration structure into three levels:

  • Server-level configuration
  • Container directives
  • Per-directory configuration

Server-level configuration includes the default directives set for the server as a whole. These directives can then be overridden by per-directory configuration files (.htaccess files), or within container directives (such as the <Directory> and <Files> tags).

Per-directory files are commonly located within (or can be added to) the public directory file structure, the contents of which are available to sub-administrators and developers. These users therefore have the ability to change the server's configuration by adding all kinds of directives to various parts of the server. For this reason, the server administrator has the ability to control which directives can be applied within those files, and which directives can or cannot override the default server configuration.


Conclusion

This article was intended for server administrators to provide a reference for more advanced Apache settings and configuration options. As we learned, you can apply these directives either on the server level or a per-directory level. Depending on how and where you set these directives, your server will combine these settings into a unified final configuration.
