CDN: when and how to use one?

Written by Ramón Saquete

Using a CDN (Content Delivery Network) is highly recommended when we want to improve the performance of a website, especially if it receives visitors from various locations. However, depending on how the website has been programmed and the type of information it displays, we can take more or less advantage of the features this type of service provides.

What is a CDN?

Technically, a CDN is a network of very powerful reverse proxies connected directly to the Internet’s core routers (we explained what reverse proxies are in an earlier article). In other words, they are very fast servers that cache the content users request. Because they sit on or very close to the Internet’s primary (Tier 1) networks, they can respond much faster than most servers, whose requests usually have to hop through a larger number of less powerful routers. This position also lets them use Anycast, a technique in which all of the CDN’s proxy servers share a single IP address, so that each request is routed to the node that is closest, in network terms, to the user.

Here’s an overall diagram of how a CDN works:

[Diagram: how a CDN works]

This example shows 7 clients requesting the same file from the website at the same time, but only 3 of those requests reach the origin server. This is because the first request to arrive at each node causes the origin server’s response to be stored in that node’s cache. Subsequent requests from other clients no longer have to travel to the origin, since the node answers directly from its cached copy.

Geographic Location 2 sees no improvement, because only one client makes the request there. Likewise, there is no improvement when clients request different files from the same node for the first time.
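
The behaviour in the diagram can be reproduced with a small simulation. This is only a sketch: the `EdgeNode` class, the location names and the 3/1/3 split of the 7 clients are assumptions made for illustration, since the text only states the totals.

```python
from collections import Counter


class EdgeNode:
    """A hypothetical CDN node that caches origin responses by URL."""

    def __init__(self, name, origin_hits):
        self.name = name
        self.cache = {}
        # Shared counter of requests that actually reach the origin server.
        self.origin_hits = origin_hits

    def get(self, url):
        if url in self.cache:
            return self.cache[url], "HIT"
        # Cache miss: fetch from the origin and keep a copy for later requests.
        self.origin_hits[url] += 1
        body = f"content of {url}"
        self.cache[url] = body
        return body, "MISS"


def simulate(clients_per_location, url="/index.html"):
    """Send one request per client to the node of its location."""
    origin_hits = Counter()
    results = {}
    for location, n_clients in clients_per_location.items():
        node = EdgeNode(location, origin_hits)
        results[location] = [node.get(url)[1] for _ in range(n_clients)]
    return results, sum(origin_hits.values())


# Assumed split of the 7 clients across the 3 locations.
results, origin_requests = simulate(
    {"location-1": 3, "location-2": 1, "location-3": 3}
)
print(origin_requests)        # 3
print(results["location-1"])  # ['MISS', 'HIT', 'HIT']
print(results["location-2"])  # ['MISS']
```

Only the first request at each node reaches the origin, and the single client in Location 2 never benefits from the cache.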

What’s the use of a CDN?

CDNs are useful for:

  • Returning cached responses faster, which improves the website’s speed.
  • Bringing the service closer to the user’s location: files are returned faster because they come from a more powerful server or because the HTML has already been generated. Latency is also lower, since the shorter distance means fewer hops between routers.
  • Relieving an overloaded server by keeping the number of requests it receives more stable.
  • Mitigating attacks: a CDN is one of the best defences against DDoS (Distributed Denial of Service) attacks, which consist of sending a massive number of requests to our website from many locations in an attempt to bring the server down.

When our users visit us from a single location, we don’t need a CDN with many nodes: a single reverse proxy running Varnish or similar, or a single CDN node, provides all of the benefits above, including bringing the service closer to the user if the origin server is not located in the same country.

When is a CDN cache not being used to its full potential?

Ideally, a CDN lets us cache all of our website’s files, which we can divide into static files (images, CSS, JavaScript and fonts) and dynamic files (HTML). The latter matter most for performance, so they should be cached by the CDN whenever possible (even if we already have an HTML cache on the origin server). However, there are situations in which we cannot cache all of our HTML on the CDN nodes. We explain them below:

  • High update frequency: if our website displays dynamic content that is updated every second, we won’t be able to use a CDN to cache HTML files, because such frequent updates will cause constant cache misses. If, on the other hand, content is updated every few hours or days, we can use a CDN. The CDN may also allow us to set rules on the update frequency for each type of URL.
  • Many locations with few users: if we have locations with few users and the HTML cache is regenerated frequently, using that cache can be counterproductive. It may happen that no user ever requests a page that another user in the same location has already requested, as in Location 2 in the diagram above. The result is a cache miss always, or almost always: the request travels from the CDN to the origin server, back to the CDN and finally to the user, which takes longer than if there were no CDN and the request went directly to the origin server.
  • Content that varies by user-agent: another scenario in which we may not be able to cache HTML is when the website serves different content depending on the browser’s user-agent (dynamic serving, as opposed to a purely responsive design, which sends the same HTML to every device). This forces the CDN to cache a different copy of each page for each user-agent string, increasing cache misses and service costs.
  • Online shop or private area: when a website has to display different content depending on which user is signed in, we cannot cache that content on the CDN, so every request that reaches the CDN with a login or shopping basket cookie must be passed on to the origin server. If the CDN returned one user’s cached content to a different user, user A would see user B’s data. Other pages can be cached normally. If there is a shopping basket, it is advisable to load it through AJAX; otherwise, we won’t be able to cache any of our website’s HTML. An advanced plan may be required for the CDN to handle this kind of rule.
  • POST requests that store information in the database: when a user submits a form, the data must reach the origin server to be stored in the website’s database. The CDN must provide the means for this to work, so that the lead is not lost at the CDN node.
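
The rules in this list can be sketched as a decision function of the kind a CDN rule engine or reverse proxy would apply. This is a minimal illustration, not the configuration of any particular CDN: the cookie names, the excluded paths and the crude "Mobi" device check are all assumptions.

```python
# Hypothetical names: adapt them to the cookies and paths of the real website.
SESSION_COOKIES = {"sessionid", "cart"}
EXCLUDED_PREFIXES = ("/admin", "/checkout")


def should_cache_html(method, path, cookies):
    """Return True if an edge node may answer this request from its HTML cache."""
    if method not in ("GET", "HEAD"):
        return False  # POST and similar requests must reach the origin server
    if path.startswith(EXCLUDED_PREFIXES):
        return False  # private or management areas
    if SESSION_COOKIES & set(cookies):
        return False  # logged-in user or shopping basket: content is personal
    return True


def device_class(user_agent):
    """Collapse the user-agent into a few classes (a very crude heuristic)."""
    return "mobile" if "mobi" in user_agent.lower() else "desktop"


def cache_key(path, user_agent):
    """Key on the device class, not on the full user-agent string."""
    return f"{device_class(user_agent)}:{path}"


print(should_cache_html("GET", "/blog/cdn", {}))                    # True
print(should_cache_html("GET", "/blog/cdn", {"sessionid": "abc"}))  # False
print(should_cache_html("POST", "/contact", {}))                    # False
print(cache_key("/", "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"))  # desktop:/
```

Keying the cache on a small set of device classes instead of the raw user-agent string keeps the number of cached copies per page low when the HTML really has to differ between devices.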

In what cases can we make the most of a CDN cache?

If we contract a CDN, we will always be able to cache all static resources. However, depending on the case, we may not be able to cache all of the HTML, as seen in the previous section. We can only do so when we have a lot of visits, a low update frequency, and content that does not depend on the user-agent or the user login. If these conditions are met and our target audience comes from different countries, a CDN is the ideal solution. If any of these conditions is not met, we may still benefit from a CDN, but we will have to consider each scenario separately.
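
One way to reason about whether HTML caching pays off is to compare the average response time through the CDN with the time of a direct request. The model below is a simplified sketch, and every number in it is an invented example value, not a measurement: a hit costs only the trip to the node, while a miss adds the node-to-origin round trip.

```python
def expected_latency_ms(hit_ratio, edge_ms, origin_via_cdn_ms):
    """Average response time through the CDN for a given cache hit ratio."""
    # Every request pays edge_ms; only the misses also pay the trip to the origin.
    return edge_ms + (1 - hit_ratio) * origin_via_cdn_ms


def cdn_helps(hit_ratio, edge_ms, origin_via_cdn_ms, direct_ms):
    """True if the average request is faster through the CDN than directly."""
    return expected_latency_ms(hit_ratio, edge_ms, origin_via_cdn_ms) < direct_ms


# Assumed values: 20 ms to the node, 200 ms node-to-origin, 180 ms direct.
print(round(expected_latency_ms(0.9, 20, 200), 1))  # 40.0
print(cdn_helps(0.9, 20, 200, 180))                 # True
print(cdn_helps(0.0, 20, 200, 180))                 # False
```

With few visits or very frequent updates the hit ratio approaches zero, and the extra hop through the node makes the CDN slower than a direct request.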

Moreover, to make the most of a CDN, we shouldn’t use a different subdomain for each type of resource. This technique, called domain sharding, has become obsolete with the arrival of HTTP/2, which multiplexes many requests over a single connection, so it is best to load all resources from the same domain as the website itself.

How to choose a CDN?

To choose a CDN, we must ask ourselves the following questions. Many of them should be answered by the development or system administration team, and the answers can affect the price of the plan we need:

  • Does the CDN have nodes in the countries where my target audience is? We can see the nodes of the four main CDN services in this map. If the CDN has no nodes where our users are, we could even be moving the service away from them, if the origin server is located in the same country as they are.
  • Do I get enough traffic to take advantage of the CDN’s cache? If we don’t have much traffic, caching the HTML on our website’s own server will be enough.
  • Will the update frequency of my content allow me to cache HTML? If we are not going to use this type of optimisation, we can probably choose a cheaper plan.
  • Do I need to discriminate traffic by user-agent? We must check whether the CDN allows it, or whether the code can be modified so that content no longer depends on the user-agent. Discriminating by user-agent may increase the cost considerably.
  • Do I need the CDN to detect specific cookies? We need to know whether the plan we are going to contract will let us send requests from logged-in users to the origin.
  • Do I need to exclude specific pages? Administration areas and form submissions should be excluded from the HTML cache.
  • Does my website use file types that the CDN does not allow? The CDN might not allow certain MIME types, in which case we would not be able to serve some files the website needs.
  • What additional features does it have? For instance: HTTP/2, HTTP/2 Server Push, image compression, static resource compression with Brotli q11, code optimisations, an API for purging the cache from the website’s admin panel, configuration of the final cache headers, letting users browse from the cache if the origin is down, security filters, etc.
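
Several of these points (update frequency, excluded content, cache header configuration, serving from the cache when the origin is down) are commonly expressed through the Cache-Control header that the origin sends. The function below is a sketch under assumptions: the 5-minute HTML lifetime, the one-year lifetime for static files (which presumes versioned file names) and the one-day `stale-if-error` window are example values, and support for `stale-if-error` varies between CDNs.

```python
def cache_control_for(content_type, html_ttl_s=300):
    """Suggest a Cache-Control value for a response, by content type."""
    static_types = (
        "image/",
        "font/",
        "text/css",
        "text/javascript",
        "application/javascript",
    )
    if content_type.startswith(static_types):
        # Static files: cache for a long time in browsers and on the CDN.
        return "public, max-age=31536000, immutable"
    if content_type.startswith("text/html"):
        # HTML: browsers revalidate, the CDN (shared cache) keeps it for
        # html_ttl_s seconds and may serve a stale copy if the origin fails.
        return f"public, max-age=0, s-maxage={html_ttl_s}, stale-if-error=86400"
    # Anything else is not cached in this sketch.
    return "no-store"


print(cache_control_for("image/png"))
# public, max-age=31536000, immutable
print(cache_control_for("text/html; charset=utf-8"))
# public, max-age=0, s-maxage=300, stale-if-error=86400
```

The `s-maxage` directive only applies to shared caches such as CDN nodes, which is what allows the HTML to be cached at the edge while browsers keep checking for a fresh copy.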

Conclusion

A CDN is very useful for improving performance in established projects with a significant amount of traffic, especially when we offer services or products in several countries. However, as well as business needs, we must take technical requirements into account to choose the most appropriate option. We should always consult our web developers and put any questions we have to the CDN’s support team.

Author: Ramón Saquete
Web developer at Human Level Communications online marketing agency. He's an expert in WPO, PHP development and MySQL databases.
