Web servers and document roots.
In this post I’ll cover what web servers are, what they do, a bit about how they work, and what document roots are.
We’re not going to go into anything too technical here. The purpose of this post is to gain a conceptual understanding of how your site is served by a web server and of the relationship between web servers and document roots.
What a web server is.
A web server is a server (usually publicly available on the internet) that hosts websites and serves them in response to requests from clients (usually web browsers).
DNS is what causes the request for a website to reach the correct server. You can read more about what DNS is and how it works here.
Web servers will typically “listen” for incoming requests for sites on port 80 (unencrypted HTTP) and 443 (encrypted HTTPS).
When a web server receives an incoming request for a site, it serves the site back to the client making the request. The client receives the site content (usually a mixture of HTML, CSS and JavaScript), which it then renders into the website seen in the browser (client). I’ll go into a bit more depth covering how this works in a moment.
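To make the request and response idea a bit more concrete, here’s a minimal sketch using only Python’s standard library. It isn’t how Apache or Nginx work internally. It’s just a tiny server that listens on a port and hands back files from one directory, plus a tiny “client” that asks for a page. The function names start_server and fetch are made up for this example, and it listens on a random free local port rather than 80 or 443.

```python
import http.client
import threading
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path
from tempfile import TemporaryDirectory


def start_server(directory: str, port: int = 0) -> ThreadingHTTPServer:
    """Serve files from `directory`. Port 0 asks the OS for any free port."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Listen for requests in the background so the caller can carry on.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(host: str, port: int, path: str):
    """Play the part of the browser: request a path, return (status, body)."""
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request("GET", path)
        response = connection.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        connection.close()


if __name__ == "__main__":
    with TemporaryDirectory() as site_dir:
        # A one-file "site" for the server to hand back.
        Path(site_dir, "index.html").write_text("<h1>Hello from the server</h1>")
        server = start_server(site_dir)
        status, body = fetch("127.0.0.1", server.server_address[1], "/")
        print(status, body)
        server.shutdown()
        server.server_close()
```

The directory passed to start_server plays the same role as the document root covered later in this post.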
Click here if you’d rather skip to the “how it works” rather than reading about different types of web servers.
There are a few different types of web server, but they all serve the same purpose and do roughly the same thing (as outlined above). To mention a few:
- Apache is one of the most popular and widely used web servers. It’s open-source and runs on multiple platforms, including Linux, Windows, and macOS. Apache supports various features, modules, and configurations, making it highly flexible and customisable.
- Nginx (say this like “engine x”) is a lightweight and high-performance web server. It’s known for its efficient handling of concurrent connections and its ability to serve static content quickly. Nginx is often used as a reverse proxy or load balancer in front of other web servers to improve performance.
- Internet Information Services (IIS) is a web server developed by Microsoft and is primarily used on Windows servers. It integrates well with other Microsoft technologies and supports various technologies such as ASP.NET and PHP.
- LiteSpeed Web Server is a high-performance web server designed to be a drop-in replacement for Apache. It’s pitched as offering better performance, scalability, and security features than Apache. LiteSpeed is compatible with Apache configurations, which makes migration easier.
There are other options that are a bit more provider-specific, such as Microsoft Azure Web Apps, which provides a managed environment for hosting web applications without the need to manage the underlying server infrastructure. There’s also Google Cloud Platform (GCP) App Engine, which is a Google-based equivalent. There are also more custom approaches like Node.js, a runtime that allows you to run JavaScript on the server side, making it popular for building scalable and real-time web applications.
How a web server establishes which site to serve.
It’s often the case that there will be multiple sites held on the same server. So how does the web server know which site to serve?
The incoming request tells the web server which site is being requested (browsers send the domain name in the request’s Host header). The web server then looks up the site in one of its configuration files, where the directory (or folder) location of each site is defined. The site files are contained in these defined directories.
The directory containing the site is called a document root or a web root. For example’s sake, Apache has a main configuration file, often called httpd.conf (the exact file name and layout vary between operating systems), that contains a list of sites and their document roots. A VERY basic example of this would look like:
<VirtualHost *:80>
    ServerName example.com
    DocumentRoot /var/www/html
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
    <Directory /var/www/html>
        Options Indexes FollowSymLinks
        AllowOverride None
        Require all granted
    </Directory>
</VirtualHost>
The part that tells the web server which directory to look in for the site “example.com” is:
ServerName example.com
DocumentRoot /var/www/html
So if the web server was a human and was asked for the example.com site, it would go “I’ve just received a request for example.com, I serve what’s in /var/www/html for that site, here you go, have the site I found in /var/www/html.”
A domain is effectively “mapped” to a document root, and the site files exist in the document root. This is effectively the relationship between web servers and document roots. The web server works out the document root of the site from the above, then serves the site held in the document root upon request.
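If it helps to see that “mapping” as code, here’s a rough Python sketch of the lookup. It’s a simplification and not Apache’s actual logic: the DOCUMENT_ROOTS dictionary and the resolve function are invented for illustration, example.org and its directory are hypothetical, and a real web server would typically fall back to a default site for an unknown domain rather than returning nothing.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical domain-to-document-root mapping, standing in for the
# ServerName / DocumentRoot pairs in the web server's configuration.
DOCUMENT_ROOTS = {
    "example.com": "/var/www/html",
    "example.org": "/var/www/example-org",  # made-up second site
}


def resolve(host: str, url_path: str) -> Optional[str]:
    """Return the file a request maps to, or None if we can't serve it."""
    hostname = host.split(":")[0].lower()  # drop any ":80" style port suffix
    root = DOCUMENT_ROOTS.get(hostname)
    if root is None:
        return None  # unknown site (simplified: no default site here)
    # Treat a request for "/" as a request for the default page.
    relative = url_path.lstrip("/") or "index.html"
    candidate = PurePosixPath(root) / relative
    # Refuse paths that try to climb out of the document root.
    if ".." in candidate.parts:
        return None
    return str(candidate)


print(resolve("example.com", "/"))              # /var/www/html/index.html
print(resolve("example.com", "/css/site.css"))  # /var/www/html/css/site.css
print(resolve("unknown.test", "/"))             # None
```

The point is simply that the domain in the request picks the document root, and the rest of the URL picks the file inside it.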
There are other things defined for example.com in the above as well, for example:
ErrorLog ${APACHE_LOG_DIR}/error.log = This is where the web server is told to log errors.
Options Indexes FollowSymLinks = Allows directory listing (listing the files in a directory if there’s no default site file), and also allows the following of symbolic links.
The key point here is that the web server is being told where the site is, what’s allowed, and where to log.
What this means to you.
As someone operating a website, what this means to you is that you have to put your site files in the document root of your domain for them to be served.
If you have a single-site cPanel hosting account, the document root is by default the public_html directory, which you can see in the cPanel File Manager.
If you’re using a single-site Plesk hosting account, the document root is httpdocs by default. So document roots can differ according to the type of hosting you have.
If you have a multisite type hosting account (a hosting account that can accommodate multiple sites with different domains – or addresses) then when adding the domain to your hosting, you will have had to define the domain’s document root. When you do this, you’re actually updating the web server’s configuration file that tells it which site exists in which folder location.
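To picture what “updating the web server’s configuration file” could look like, here’s a small Python sketch that builds a minimal Apache-style entry for a new domain. The function name render_virtual_host and the path /home/user/goodsite are made up for illustration, and control panels such as cPanel and Plesk have their own tooling for this, so treat it as a picture of the idea rather than how they actually do it.

```python
def render_virtual_host(domain: str, document_root: str, port: int = 80) -> str:
    """Build a bare-bones VirtualHost entry mapping a domain to a directory."""
    lines = [
        f"<VirtualHost *:{port}>",
        f"    ServerName {domain}",
        f"    DocumentRoot {document_root}",
        "</VirtualHost>",
    ]
    return "\n".join(lines)


# Hypothetical domain and path, purely for illustration.
print(render_virtual_host("goodsite.com", "/home/user/goodsite"))
```

Whatever the real mechanism, the end result is the same kind of pairing: this domain, that directory.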
At this point you might be wondering “how can I tell which site exists in which directory?”.
In cPanel, there’s a “domains” icon:
And you can click on this to see a list of domains and their respective document roots:
So from the example above, we can see I have two sites held in my hosting account:
mysite.com which has a document root of public_html (so therefore the files for https://mysite.com are served from the public_html directory).
And:
goodsite.com which has a document root of goodsite (so therefore the files for https://goodsite.com are served from the goodsite directory).
The reason I’m explaining this is so that you know, or can work out, which files relate to which site and vice versa, because at some point, you’re most likely going to need to do something to a site at file level, so you’ll need to know in which directory the change needs to be made.
The domains page in cPanel shows you the relationship between web servers and document roots specific to the domains you’re operating in your cPanel: This domain loads from this document root.
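If you don’t have a control panel but can read an Apache-style configuration file, the same “this domain loads from this document root” list can be pulled out with a few lines of Python. This is a simplified sketch, not a full parser: the function names are invented, the sample text (including example.org and its directory) is hypothetical, and it only handles one ServerName per site and paths without spaces.

```python
import re

# Matches everything between <VirtualHost ...> and </VirtualHost>.
VHOST_BLOCK = re.compile(
    r"<VirtualHost\b[^>]*>(.*?)</VirtualHost>", re.DOTALL | re.IGNORECASE
)


def directive(block: str, name: str):
    """Return the first value given for a directive in a block, or None."""
    match = re.search(
        rf"^\s*{name}\s+(\S+)", block, re.MULTILINE | re.IGNORECASE
    )
    return match.group(1).strip('"') if match else None


def list_document_roots(config_text: str) -> dict:
    """Map each ServerName found in the text to its DocumentRoot."""
    sites = {}
    for block in VHOST_BLOCK.findall(config_text):
        name = directive(block, "ServerName")
        root = directive(block, "DocumentRoot")
        if name and root:
            sites[name] = root
    return sites


# Hypothetical configuration text with two sites.
SAMPLE_CONFIG = """
<VirtualHost *:80>
    ServerName example.com
    DocumentRoot /var/www/html
</VirtualHost>
<VirtualHost *:80>
    ServerName example.org
    DocumentRoot "/var/www/example-org"
</VirtualHost>
"""

for domain, root in list_document_roots(SAMPLE_CONFIG).items():
    print(f"{domain} loads from {root}")
```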
A word on “Reduce initial server response time”.
There are various page speed analysis tools, such as Google’s https://pagespeed.web.dev/ and https://gtmetrix.com/, that people use to check their site’s performance and load speeds.
If your site is slow there will often be a “reduce initial server response time” message.
Based on the message alone, it does sound a bit like the server needs to be sped up, or something to this effect.
This isn’t always the case.
If you have a PHP-based site (WordPress is PHP-based, as are most common CMSs), the PHP that the site consists of has to be executed for HTML output to be generated. Quite often this PHP will interact with a database as part of this process (page content is usually held in the database, for example).
So this transaction:
- Client requests site.
- Web server responds with site.
Is actually this transaction (when using a PHP based CMS):
1. Client requests site.
2. Web server looks up document root of site.
3. PHP held in document root executes.
4. PHP interacts with database.
5. PHP and database combine to generate HTML output.
6. Web server responds to client with HTML generated in step 5.
Steps 3, 4 and 5 are all based on what you’ve put in your hosting account as far as your site files are concerned.
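Here’s a rough sketch of steps 3 to 5 written in Python rather than PHP, with an in-memory SQLite database standing in for the site’s real database. Everything in it is hypothetical (the pages table, the create_database and render_page functions, the sample content); the point is just that code has to run and talk to a database before there’s any HTML to send back.

```python
import sqlite3
from html import escape


def create_database() -> sqlite3.Connection:
    """Set up a tiny stand-in for a CMS database with one page in it."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE pages (slug TEXT PRIMARY KEY, title TEXT, body TEXT)")
    db.execute(
        "INSERT INTO pages VALUES (?, ?, ?)",
        ("home", "Welcome", "Hello from the database."),
    )
    db.commit()
    return db


def render_page(db: sqlite3.Connection, slug: str):
    """Site code runs, queries the database, and builds the HTML output."""
    row = db.execute(
        "SELECT title, body FROM pages WHERE slug = ?", (slug,)
    ).fetchone()
    if row is None:
        return 404, "<h1>Not found</h1>"
    title, body = row
    html = (
        f"<html><head><title>{escape(title)}</title></head>"
        f"<body><h1>{escape(title)}</h1><p>{escape(body)}</p></body></html>"
    )
    return 200, html


db = create_database()
status, html = render_page(db, "home")
print(status, html)
```

The more work render_page (or its real PHP equivalent) has to do, the longer the client waits before the web server has anything to respond with.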
So whilst “Reduce initial server response time” does sound like a web server thing, it’s actually influenced, at least in part, by what you’ve deployed in terms of site files and database content.
From a practical point of view, any of the following (to name but a few) can cause a slow site, and cause the “Reduce initial server response time” message to be displayed in analysis tools:
- Using an excessive number of plugins.
- Using a theme with a high resource overhead.
- Calling lots of scripts (having lots of things that move on pages, like animations, or carousels).
- Having a huge database with site code running wildcard (SELECT *) queries against it.
- Using poorly coded components (plugins and themes, for example).
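To illustrate the database point from the list above, here’s a small Python and SQLite sketch comparing a wildcard query that drags every row back to the site code with a targeted query that asks only for what’s needed. The posts table and all function names are hypothetical, and real CMS queries are more involved, so take it as an illustration of the principle rather than a benchmark.

```python
import sqlite3


def build_posts(count: int) -> sqlite3.Connection:
    """Create an in-memory table of made-up posts."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, slug TEXT UNIQUE, body TEXT)"
    )
    db.executemany(
        "INSERT INTO posts (slug, body) VALUES (?, ?)",
        ((f"post-{n}", f"Body of post {n}") for n in range(count)),
    )
    db.commit()
    return db


def find_post_wildcard(db: sqlite3.Connection, slug: str):
    """Fetch every row and column, then filter in application code."""
    rows = db.execute("SELECT * FROM posts").fetchall()
    matches = [row for row in rows if row[1] == slug]
    return matches[0][2], len(rows)  # (body, rows pulled from the database)


def find_post_targeted(db: sqlite3.Connection, slug: str):
    """Let the database do the filtering and return only what is needed."""
    rows = db.execute("SELECT body FROM posts WHERE slug = ?", (slug,)).fetchall()
    return rows[0][0], len(rows)  # (body, rows pulled from the database)


db = build_posts(1000)
print(find_post_wildcard(db, "post-42"))  # ('Body of post 42', 1000)
print(find_post_targeted(db, "post-42"))  # ('Body of post 42', 1)
```

Both functions find the same post, but the first one moves a thousand rows to get it. On a big database, that difference shows up as time spent before the page can be generated.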
Your site code, your site’s database AND the server are effectively a team that works together to serve your site, and if any one of the “team” isn’t “playing ball” sensibly, you can see the “Reduce initial server response time” message in analysis tools.
I’ll cover more about this in future guides covering optimisation, as that in itself can be an art, and takes quite a lot of working out.
In conclusion.
Here’s a summary of what we’ve learnt today:
- Web servers serve websites.
- There are a variety of web servers.
- The web server’s configuration file defines where sites are held.
- Sites are held in document roots.
- Your site code executes when your site is requested.
- The web server and your site’s code and database content work together to handle the request, generate page output and then serve your site.
- Domains are mapped to document roots, and that mapping is the relationship between web servers and document roots.