How Web Caching Works In A CDN ( Proxy ) Chain Using The Cache-Control HTTP Header
When a web browser requests a web page from the original web server, the request may pass through several intermediary proxy or CDN servers, as in the picture below. Each node in the chain (the client browser, the CDN proxy servers, and the original web server) can cache the requested web page resources according to the cache policy configured on it. The next time you or another client requests the same resources, the web browser, an intermediary proxy server, or the original web server can return the cached copy, which reduces web page load time.
From the picture above you can see that I have set up two CDNs between the original web server and the client web browser, and each node in the chain can have its own cache.
- Original Web Server Cache is software that caches web resources on the web server. It is a shared cache, so it can serve many different client users. In my example the original web server runs WordPress, so I installed the WP Super Cache plugin, which saves static HTML versions of dynamic pages in the server cache to improve server performance.
- CloudFlare or Ezoic CDN Cache is also a shared cache. Cached web resources can be served to different clients as long as they have not expired. If a cached resource expires, or a client tells the CDN server to validate it, the CDN server reloads the resource from its upper data source (the Ezoic CDN loads from the CloudFlare CDN, and the CloudFlare CDN loads from the original web server). CloudFlare does not cache HTML pages by default, so when the Ezoic CDN needs to reload an HTML page, the content is effectively loaded from the original web server.
- Client Web Browser Cache is a private cache; the cached data is used by that client web browser only. All the cached data is saved on your local computer's hard disk. When a cached web resource expires, the browser gets the updated resource from the Ezoic CDN server.
1. Http Cache-Control Header.
We control cache policy with the Cache-Control HTTP header. This header can be set in both requests and responses. Its value can include the following directives.
- public : any cache, including shared proxy and CDN caches, may store a copy of the response.
- private : only a private cache such as the client browser may store the response; shared caches must not store it.
- max-age : the number of seconds for which the response is considered fresh. Once max-age seconds have passed, the cached copy is stale. Each cache layer in the chain only guarantees the freshness of its own cached copy.
- no-cache : a cache may store the response, but it must validate the cached copy with the upper data source before returning it to the client. If the copy is still valid it is returned; if not, the cache fetches the updated resource first.
- no-store : no cache may store a copy of the requested web resource.
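To make the directives above concrete, here is a minimal Python sketch of how a cache might read a Cache-Control value and decide whether it is allowed to store the response. The function names parse_cache_control and can_store are made up for this illustration; real caches follow many more rules than these two checks.

```python
def parse_cache_control(header_value: str) -> dict:
    """Split a Cache-Control header value into a {directive: value} dict."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        name = name.strip().lower()
        # Directives without "=" (public, no-cache, ...) are stored as True.
        directives[name] = value.strip().strip('"') if sep else True
    return directives


def can_store(directives: dict, shared_cache: bool) -> bool:
    """Return True if a cache of the given kind may store the response.

    Simplified: only the no-store and private directives are considered.
    """
    if "no-store" in directives:
        return False
    if shared_cache and "private" in directives:
        return False
    return True
```

For example, can_store(parse_cache_control("private, max-age=60"), shared_cache=True) is False for a CDN, while the same call with shared_cache=False is True for the browser.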
For more cache-control header values, please refer
1.2 Set Cache-Control Header In Http Response.
You should set the response Cache-Control header on the server side (in your source code or in .htaccess). When the browser receives this header, it knows whether to take the web resource from its local cache (the resource is still fresh) or to request it again from the upper data source (the resource has expired).
For example, the header cache-control: public, max-age=86400 tells the client browser that this web resource can be returned from the disk cache for 86400 seconds (24 hours). You can see this in the browser's developer tools under Headers -> General -> Status Code. Once 86400 seconds have passed since the response was stored, the web browser requests the web resource from the upper data source (the Ezoic CDN server) again.
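The 86400-second example above comes down to one comparison between the age of the cached copy and max-age. Below is a small sketch of that check; is_fresh and its parameters are names chosen for this example, and it ignores details such as the Age header that real caches also take into account.

```python
import time
from typing import Optional


def is_fresh(stored_at: float, max_age: int, now: Optional[float] = None) -> bool:
    """Return True while the cached copy is younger than max_age seconds.

    stored_at and now are Unix timestamps in seconds.
    """
    if now is None:
        now = time.time()
    age = now - stored_at
    return age < max_age
```

With max_age=86400, a copy stored one hour ago is still fresh, and a copy stored exactly 86400 seconds ago is stale and has to be fetched or validated again.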
If the Cache-Control value is something like cache-control: max-age=0, must-revalidate, no-cache, no-store, the client browser cannot reuse a cached copy without checking it first. Strictly speaking, no-store forbids storing the response at all, so with that directive every request goes back to the upper data source; the validation flow described here is what no-cache and max-age=0, must-revalidate produce. The browser validates the web resource first. If the resource is not modified (the upper data source returns status code 304), the browser uses its cached version. If the resource has been modified since it was last retrieved (the upper data source returns status code 200 with the new content), the browser updates its cache and uses the newest version.
1.3 How To Check Whether A Web Resource Has Been Modified Or Not.
To check whether a web resource has changed, we use the ETag HTTP response header. Its value identifies one specific version of the web resource and is typically a hash of the resource content. The value is generated and stored on the server side.
When the client validates its cached copy, it sends the ETag value it holds in the If-None-Match request header. If that value is the same as the server's current ETag, the web resource content has not changed, the server answers with status code 304, and the cached web resource can be used. If the two values differ, the web resource has changed, so the cache needs to be updated with the new content.
Verifying freshness with the ETag header reduces website traffic and improves performance, because an unchanged resource does not have to be sent again.
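The ETag check can be sketched in a few lines of Python. This is only an illustration under my own assumptions: make_etag and respond are hypothetical helper names, the ETag is taken to be a truncated SHA-256 hash of the body, and only a single exact-match If-None-Match value is handled.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from a hash of the content (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, dict, bytes]:
    """Return (status, headers, body) for a request that may carry If-None-Match."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The client's copy is still valid: send no body, only the status.
        return 304, {"ETag": etag}, b""
    # First request, or the content changed: send the full resource.
    return 200, {"ETag": etag}, body
```

The first call, respond(b"hello", None), returns status 200 together with an ETag. Sending that ETag back for the same content gives 304 with an empty body, which is where the traffic saving comes from.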
3. Cache-Control Header In The CDN Proxy Server Chain.
As discussed at the beginning of this article, each node in the chain above can have its own cache policy. A node's cache policy settings can change the Cache-Control header value or even add extra response headers (for example, CloudFlare adds the CF-Cache-Status header and Ezoic adds the x-ezoic-cdn header).
So you can think of each upper node as the original data source of the node below it. Each node maintains the freshness of its own cached web resources. When a cached web resource expires, the node requests it from the upper node.
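The idea that each node treats the node above it as its data source can be modelled with a toy simulation. The classes Origin and CacheNode and the lifetimes used below are invented for this sketch. It is also simplified on purpose: every node restarts its own freshness timer when it stores a copy, as if its policy overrode the upstream header, and it does not model validation or the Age header.

```python
class Origin:
    """Stands in for the original web server; counts how often it is hit."""

    def __init__(self, body: str):
        self.body = body
        self.hits = 0

    def get(self, now: float) -> str:
        self.hits += 1
        return self.body


class CacheNode:
    """One cache in the chain (browser, Ezoic, CloudFlare) with its own max-age."""

    def __init__(self, name: str, upstream, max_age: int):
        self.name = name
        self.upstream = upstream
        self.max_age = max_age
        self.body = None
        self.stored_at = None

    def get(self, now: float) -> str:
        # Serve the local copy while it is fresh.
        if self.stored_at is not None and now - self.stored_at < self.max_age:
            return self.body
        # Otherwise ask the upper node and remember when we stored the answer.
        self.body = self.upstream.get(now)
        self.stored_at = now
        return self.body
```

A chain such as browser = CacheNode("browser", CacheNode("ezoic", CacheNode("cloudflare", Origin("v1"), 60), 30), 10) only reaches the origin when every cache below it has a stale copy, so a change on the origin can take a while to reach the browser.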
Below is the Ezoic caching configuration web page. From its settings you can see that you can override many of the response header values sent by the upper node.