Caching is one of the most talked-about topics these days. Almost every developer has, at some point, used a cache to make things faster. The browser cache is one caching mechanism that helps make your website faster and more scalable. It works on the basis of the cache headers that we send with our content. The good news is that every browser has a built-in implementation of an HTTP cache. We still have to make sure that we send the correct HTTP headers to tell the browser when and for how long a response should be cached.
Why Cache Headers?
Why should we use cache headers? On most websites, the bottleneck is usually data transfer over the network. A web cache sits between the web server and the client (the web browser, in our case). When content is cached, a copy of each response (HTML pages, JS, CSS, images, etc.) is saved in local storage, and when the same URL is requested again, the response is served from that cached copy instead of being fetched again. The main reasons for using a cache are:
- To reduce latency, as data is served from the cache rather than being fetched over the network.
- To reduce network traffic, as content is reused for later requests, so no new request has to be sent to the server.
- To scale servers to serve more users: when users are served from the cache, the server has more capacity for other requests.
How Does a Web Cache Work?
All cache headers work on the basis of rules and protocols defined by HTTP/1.0 and HTTP/1.1, although some behaviour is configured by the administrators of cache servers. Still, there is a common set of rules that almost every browser and server follows.
- A response will not be cached if it asks not to be cached.
- A resource is considered fresh if an expiry time or other age-controlling directive has been set and it is still within its freshness period.
- If a cached resource is stale, the client sends a request to the server to validate whether the cached copy can still be used, and the server replies with the appropriate response.
- If a response has no validator (an ETag or Last-Modified header) and no explicit freshness information, it is usually considered uncacheable.
So we can conclude that freshness and validation are both very important when deciding whether to use a cached response or fetch it from the server.
We have been talking about cache headers since the start of this post, so now let's look at the headers that are actually required and how they work.
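The rules above can be sketched as a small decision function. This is only an illustration: `CachedResponse`, its fields, and the returned action names are hypothetical, and real caches apply more rules than these (for example, heuristic freshness).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedResponse:
    """Hypothetical record a cache might keep for one URL."""
    no_store: bool                      # the response asked not to be cached
    age_seconds: int                    # time since the response was stored
    freshness_lifetime: Optional[int]   # seconds it stays fresh, None if unknown
    validator: Optional[str]            # ETag or Last-Modified value, if any


def decide(entry: CachedResponse) -> str:
    """Return the action a cache would take under the simplified rules."""
    if entry.no_store:
        return "fetch"              # asked not to be cached
    if entry.freshness_lifetime is None and entry.validator is None:
        return "fetch"              # no freshness info and no validator
    if (entry.freshness_lifetime is not None
            and entry.age_seconds < entry.freshness_lifetime):
        return "serve-from-cache"   # still within the freshness period
    if entry.validator is not None:
        return "revalidate"         # stale: ask the server if the copy is usable
    return "fetch"                  # stale and nothing to validate with
```

A stale entry that carries a validator ends up in the "revalidate" branch, which is where freshness and validation meet.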
The Expires HTTP header tells caches how long the response is fresh. After that time, caches will always check back with the origin server to see whether the document has changed. The time in an HTTP date is Greenwich Mean Time (GMT), not local time, so it is very important to make sure our web server's clock is in sync. The best way to do so is to use NTP (Network Time Protocol). Example: Expires: <HTTP-date in GMT>
i.e., the browser will not contact the server before the given time.
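As a rough sketch of how such a GMT date could be produced and checked, the following uses Python's standard `email.utils` helpers. The function names `expires_header` and `is_expired` are made up for this example, and `now` is assumed to be a timezone-aware UTC datetime.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def expires_header(now: datetime, lifetime: timedelta) -> str:
    """Format `now + lifetime` as an HTTP date in GMT.

    `now` must be timezone-aware and in UTC, otherwise
    format_datetime(..., usegmt=True) raises ValueError.
    """
    return format_datetime(now + lifetime, usegmt=True)


def is_expired(expires_value: str, now: datetime) -> bool:
    """True once `now` has reached the time in the Expires value."""
    return now >= parsedate_to_datetime(expires_value)


if __name__ == "__main__":
    current = datetime.now(timezone.utc)
    value = expires_header(current, timedelta(hours=1))  # one hour is an arbitrary choice
    print("Expires:", value)
    print("expired already?", is_expired(value, current))
```

Working in UTC throughout avoids the local-time mistake mentioned above.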
The Cache-Control response header can be categorised as a “super-header for caching”: it gives developers more control over their content and overcomes the limitations of the “Expires” header. The following values can be used with this header.
Public means that the response may be stored by any cache, including shared caches, even if the request was authenticated. Example: Cache-Control: public
Private means that the response may only be cached for a single user, not by intermediate proxies; i.e., a CDN can't cache the response, but the user's browser can. Example: Cache-Control: private
No-Cache means that the response must be validated with the origin server before it can be used, even if it is cached in the browser. If proper validation is implemented, the browser will contact the server to validate the cached copy, but will skip the download if the content has not changed. Example: Cache-Control: no-cache
No-Store means that the content must not be cached under any condition and should always be fetched from the server. Example: Cache-Control: no-store
Max-Age defines the maximum number of seconds for which the response is considered fresh; within that time the client can reuse the cached copy without sending a request to the server. Example: Cache-Control: max-age=<seconds>
S-Maxage defines the number of seconds for which shared caches (e.g., CDNs) can cache the content. It is the same as “max-age”, but applies only to shared caches. Example: Cache-Control: s-maxage=<seconds>
Must-Revalidate means that once the cached response becomes stale, a cache must not reuse it without successfully revalidating it with the origin server. It basically tells the client to strictly obey the freshness rules sent with the response.
No-Transform means that no proxy or shared cache is allowed to modify the data. Some proxies convert or re-encode data (for example, recompressing images) to improve performance; this directive tells them not to alter the data being transferred. Example: Cache-Control: no-transform
Proxy-Revalidate is the same as “must-revalidate”, except that it applies only to proxies. Example: Cache-Control: proxy-revalidate. All of these directives can be used in combination, separated by commas, e.g. Cache-Control: public, max-age=<seconds>
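To make the combined form more concrete, here is a minimal sketch that splits a Cache-Control value into its directives. `parse_cache_control` is a hypothetical helper and the sample header value is illustrative. The parser is deliberately naive: it splits on commas, so it would mishandle quoted arguments that contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue  # ignore empty items such as a trailing comma
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive, so normalise to lower case.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


if __name__ == "__main__":
    # Illustrative header value, not taken from any real site.
    print(parse_cache_control("public, max-age=3600, must-revalidate"))
```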
When both Cache-Control and Expires are present, Cache-Control takes precedence.
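A sketch of that precedence, under simplifying assumptions: `freshness_lifetime` is a hypothetical helper, the headers arrive as a plain dict with exactly these key spellings, and the lifetime from Expires is measured against the current time rather than the response's Date header.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def freshness_lifetime(headers: Mapping[str, str], now: datetime) -> Optional[int]:
    """Seconds the response may be treated as fresh, or None if unspecified."""
    # 1. Cache-Control: max-age wins when it is present and well formed.
    for part in headers.get("Cache-Control", "").split(","):
        name, sep, arg = part.strip().partition("=")
        if name.strip().lower() == "max-age" and sep and arg.strip().isdigit():
            return int(arg)
    # 2. Otherwise fall back to Expires.
    expires = headers.get("Expires")
    if expires:
        try:
            delta = parsedate_to_datetime(expires) - now
        except (TypeError, ValueError):
            return 0  # unparsable date: treat the response as already expired
        return max(0, int(delta.total_seconds()))
    return None


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(freshness_lifetime({"Cache-Control": "max-age=60"}, now))
```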
ETag (entity tag) is a unique identifier for the version of the resource being requested from the server; it is usually a hash of the resource's content or of its last-updated time. It is used as a validator to verify whether the content has been modified. The ETag is controlled by the server, so the browser is not concerned with how it is generated. Once the cached resource has expired, the browser sends the old ETag back to the server (in an If-None-Match request header) to check whether the content has changed. If the ETag is the same, the server answers 304 Not Modified and the resource is served from the cache; if the ETag is different, the client downloads the new resource from the server. Example: ETag: "<opaque-token>"
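The validation round trip can be sketched on the server side as follows. This is an assumption-laden illustration, not how any particular server does it: `make_etag` and `respond` are hypothetical names, the ETag is a truncated SHA-256 of the body (one common choice), and weak validators, `*`, and lists of ETags are ignored.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from a hash of the content."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a request with an optional ETag."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None and if_none_match.strip() == etag:
        return 304, headers, b""   # unchanged: client reuses its cached copy
    return 200, headers, body      # first request or content changed


if __name__ == "__main__":
    status, headers, _ = respond(b"body { margin: 0 }", None)
    print(status, headers["ETag"])
    print(respond(b"body { margin: 0 }", headers["ETag"])[0])
```

The second call sends back the ETag from the first response, so the server can skip the payload.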
The Content-Type header tells the client what type of content is being transferred from the server, which helps the client parse the content accordingly. Example: Content-Type: text/html
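One way a server might pick this value is from the file extension. The sketch below uses Python's standard `mimetypes` module; `content_type_for` is a hypothetical helper, and the fallback type is an assumption. The exact mapping can vary between systems for some extensions.

```python
import mimetypes


def content_type_for(filename: str, default: str = "application/octet-stream") -> str:
    """Guess a Content-Type value from a file name, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default


if __name__ == "__main__":
    for name in ("index.html", "style.css", "logo.png"):
        print(name, "->", content_type_for(name))
```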
Now that we know about the cache headers and understand how freshness and validation play a key role in making content cacheable on the client, here are some tips that will help you build a cache-aware site.
- Determine the best caching policy for each resource. Ideally the cache lifetime should be as long as possible, but it depends on the type of content (images, CSS, JS) and on how frequently the content is modified.
- Make sure you provide a validation token (such as an ETag) with each resource, as it can save a lot of content from being downloaded unnecessarily.
- Don't delete and re-add static files unnecessarily, as this changes the files' modified time and can force clients to re-fetch them.
- Always try to compress content before transferring it; CSS and JS should be minified and gzip-compressed for faster transfer.
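As an illustration of the last tip, the sketch below gzip-compresses a response body with Python's standard `gzip` module. `gzip_body` is a hypothetical helper and the sample CSS is made up. In practice the web server or CDN usually does this for you, and only when the client sends a matching Accept-Encoding header.

```python
import gzip
from typing import Dict, Tuple


def gzip_body(body: bytes) -> Tuple[bytes, Dict[str, str]]:
    """Compress a response body and return it with the matching headers."""
    compressed = gzip.compress(body)
    headers = {
        "Content-Encoding": "gzip",
        "Content-Length": str(len(compressed)),
    }
    return compressed, headers


if __name__ == "__main__":
    css = b"body { margin: 0; padding: 0; }\n" * 200  # made-up, repetitive CSS
    compressed, headers = gzip_body(css)
    print(len(css), "bytes before,", headers["Content-Length"], "bytes after")
```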