Validating JSON Web Tokens in Varnish
JSON Web Tokens, or JWTs as we tend to call them, are an open standard (RFC 7519) for representing claims as an encoded JSON object. Basically: JSON Web Tokens allow you to store session data client-side in a way that makes tampering detectable.
How do JSON Web Tokens work?
A JSON Web Token has a structured format and consists of 3 parts:
- A header that announces the token type and the signing algorithm
- The payload itself
- A verification signature that is created with a secret key, using the algorithm named in the header
Each part is base64url encoded and the parts are separated by a dot. Here’s an example JWT:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9.TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ
When you base64 decode the first part, you’ll get a JSON object containing the header information:
{ "alg": "HS256", "typ": "JWT" }
It clearly states that this token is a JWT that uses HS256. In other words: an HMAC signature based on SHA256 hashing.
The decoded payload that is stored in the second part of the token, results in the following JSON object:
{ "sub": "1234567890", "name": "John Doe", "admin": true }
The payload can contain any claim you want, but keep in mind that the more data you store, the larger your token is going to become. That’s why the registered claim names are only 3 characters long. The example above has the sub claim. It’s a registered claim that stores the subject of the payload. RFC 7519 section 4.1 has a complete overview of registered claims and their meaning.
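To make the structure tangible, here is a minimal Python sketch (Python rather than VCL, purely for illustration) that splits the example token on the dots and decodes the first two parts. The helper name b64url_decode is hypothetical; it only restores the padding that JWTs strip before handing the string to the standard library.

```python
import base64
import json


def b64url_decode(part: str) -> bytes:
    """Decode a base64url string, restoring the padding that JWTs strip."""
    padding = "=" * (-len(part) % 4)
    return base64.urlsafe_b64decode(part + padding)


# The example token from this article, split over three lines for readability.
token = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
    ".eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9"
    ".TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ"
)

header_b64, payload_b64, signature_b64 = token.split(".")
header = json.loads(b64url_decode(header_b64))
payload = json.loads(b64url_decode(payload_b64))

print(header)   # the JSON object with "alg" and "typ"
print(payload)  # the JSON object with "sub", "name" and "admin"
```

Note that no secret is needed for this step: the header and the payload are only encoded, not encrypted.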
The third and final part of the token contains the cryptographic signature that is used to validate the authenticity of the token. The signature is computed as follows:
HMACSHA256( base64UrlEncode(header) + "." + base64UrlEncode(payload), secret )
- Take the base64url encoded header
- Add a dot
- Take the base64url encoded payload
- Create an HMAC signature of that string, using a secret key and the SHA256 hashing algorithm, and base64url encode the result
More information can be found on the JWT.io website.
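The signing steps can be sketched with nothing but the Python standard library. This is an illustration of the formula above, not the code of any particular JWT library; the function names and the example-secret key are made up for the example.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(raw: bytes) -> str:
    """Base64url encode and strip the trailing '=' padding, as JWTs do."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(header: dict, payload: dict, secret: bytes) -> str:
    """Build header.payload.signature using HMAC with SHA256."""
    # Compact separators keep the encoded JSON free of extra whitespace.
    header_b64 = b64url_encode(json.dumps(header, separators=(",", ":")).encode())
    payload_b64 = b64url_encode(json.dumps(payload, separators=(",", ":")).encode())
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header_b64}.{payload_b64}.{b64url_encode(signature)}"


# Hypothetical secret, for illustration only.
token = sign_hs256(
    {"alg": "HS256", "typ": "JWT"},
    {"sub": "1234567890"},
    b"example-secret",
)
print(token)
```

Changing a single character of the payload or using a different secret produces a different third part, which is exactly what the validation relies on.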
JWT advantages
JSON Web Tokens have a couple of advantages:
- Self-contained: the token carries its own claims, so validating it does not require a call to the backend
- Language independent: JSON is a universal standard, nearly every programming or scripting language can process it
- Portable: because a JWT is just an encoded string, you can store it in a request header
- Secure: because of the HMAC signature, users cannot abuse your application by forging the token without knowing the secret key
The JWT signature is secure because of HMAC, but there’s also a security assumption that traffic is encrypted through TLS. The payload is only encoded, not encrypted, so without TLS any sensitive data in it is readable to anyone who is sniffing your traffic.
The portability is very convenient:
- If you consider a JSON Web Token to be part of the authentication layer, you can store the token in the Authorization request header as a bearer token
- If a JSON Web Token is just a mechanism to store state, you can store the token in a cookie
Either way, you want the JWT to be sticky, meaning it should automatically be sent on every request.
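Both transport options can be handled by a single lookup. The sketch below is a Python illustration under a few assumptions: the request headers arrive as a plain dictionary with exactly these header names, and the cookie is called token. The function name extract_token is hypothetical.

```python
from http.cookies import SimpleCookie
from typing import Mapping, Optional


def extract_token(headers: Mapping[str, str], cookie_name: str = "token") -> Optional[str]:
    """Return the JWT from a bearer Authorization header or from a cookie."""
    # Option 1: the token is part of the authentication layer.
    scheme, _, credentials = headers.get("Authorization", "").partition(" ")
    if scheme.lower() == "bearer" and credentials.strip():
        return credentials.strip()

    # Option 2: the token is state that travels in a cookie.
    cookies = SimpleCookie()
    cookies.load(headers.get("Cookie", ""))
    if cookie_name in cookies:
        return cookies[cookie_name].value

    return None


print(extract_token({"Authorization": "Bearer aaa.bbb.ccc"}))
print(extract_token({"Cookie": "token=aaa.bbb.ccc; theme=dark"}))
```

The bearer header is checked first here, but that order is a design choice for the example, not a rule.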
Where does Varnish come into play?
Varnish is great for caching, but caching stateful data is tricky. Traditionally session data is stored on the server and is managed by the backend application. This means that caching goes out the window.
But what if I told you, you could validate JSON Web Tokens in Varnish? This could potentially eliminate costly backend calls and cache pages that were otherwise not cacheable. You can even make decisions based on specific JWT values in Varnish.
There are a couple of good use cases for this:
- Automatically redirect to a login page when the token is malformed or expired
- Perform cache variations based on certain claims
- Prohibit access to certain parts of the application when a claim is missing
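Two of these checks only need the decoded payload. The Python sketch below illustrates them under stated assumptions: expiry is read from the registered exp claim of RFC 7519 (seconds since the epoch), a token without exp is treated as never expiring, and the claim names passed to missing_claims are examples. Both function names are hypothetical.

```python
import time
from typing import List, Optional, Sequence


def is_expired(payload: dict, now: Optional[float] = None) -> bool:
    """Return True when the payload carries an exp claim that has passed."""
    exp = payload.get("exp")
    if exp is None:
        return False  # assumption: no exp claim means no expiry
    current = time.time() if now is None else now
    return current >= exp


def missing_claims(payload: dict, required: Sequence[str]) -> List[str]:
    """Return the required claim names that the payload does not contain."""
    return [name for name in required if name not in payload]


payload = {"sub": "1234567890", "exp": 1_700_000_000}
print(is_expired(payload, now=1_700_000_001))
print(missing_claims(payload, ["sub", "login"]))
```

A non-empty result from either check is the point where a redirect to the login page, or a refusal, would be triggered.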
Here’s some VCL
Base64 and HMAC support aren’t natively part of Varnish, so you will need to install vmod_digest. Once that’s done, you can import the VMOD in your VCL code and benefit from its power.
The following snippet of code creates a custom VCL procedure that can be called in your regular VCL.
Here’s what happens:
- The JSON Web Token is stored in a cookie. The cookie is called “token”.
- Regular expression matches allow us to separate the 3 parts of a JWT
- Temporary request headers are used to store the data parts, but they’re removed at the end of the procedure
- The token type and the algorithm are extracted from the header by using the digest.base64url_decode function
- The values from the decoded header are also matched via regular expressions
- If the header does not declare the JWT token type and the HS256 algorithm, an error is returned
- The payload is matched via regular expressions and also decoded into a custom request header using the digest.base64url_decode function
- The secret key is known in the JWT procedure and is used to verify the signature
- The signature from the token is compared with the expected signature, which is computed with the digest.hmac_sha256 function and encoded with the digest.base64url_nopad_hex function
- An error is returned when the signatures do not match
- In this case we extract the login and the username fields from the payload and store them in custom request headers
Although the temporary request headers are removed at the end of the procedure, it still makes sense to keep some of the extracted values. The example above contains X-Login and X-Username. We can use these values in vcl_recv to make some caching decisions.
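For readers who want to reason about the flow outside of Varnish, the same steps can be mirrored in Python. This is a sketch of the logic only, not a translation of the VCL: the secret, the function names, and the assumption that login is a JSON boolean and username a string are all made up for the example.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"example-secret"  # hypothetical key, for illustration only


def b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def validate(token: str, secret: bytes = SECRET) -> Optional[dict]:
    """Return the derived X-* headers, or None when the token is rejected."""
    # Separate the 3 parts of the token.
    parts = token.split(".")
    if len(parts) != 3:
        return None
    header_b64, payload_b64, signature_b64 = parts

    # Decode the header and the payload.
    try:
        header = json.loads(b64url_decode(header_b64))
        payload = json.loads(b64url_decode(payload_b64))
    except ValueError:  # covers bad base64, bad UTF-8 and bad JSON
        return None
    if not isinstance(header, dict) or not isinstance(payload, dict):
        return None

    # Only accept JWTs that are signed with HS256.
    if header.get("typ") != "JWT" or header.get("alg") != "HS256":
        return None

    # Recompute the signature and compare it in constant time.
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = b64url_encode(hmac.new(secret, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(expected.encode(), signature_b64.encode()):
        return None

    # Expose the fields that later caching decisions rely on.
    return {
        "X-Login": "true" if payload.get("login") is True else "false",
        "X-Username": str(payload.get("username", "")),
    }
```

Returning None plays the role of the error in the procedure; the returned dictionary plays the role of the custom request headers that survive it.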
Using JWT data in your main VCL logic
Once the JSON Web Token validation part is stored separately, you can just import the procedure and call it. Here’s an example:
The idea is that we have a private page, hosted on the “/private” URL. Users who don’t have the login flag in their token will not be granted access to this page and will immediately be redirected to the login page. The login flag is stored in a custom X-Login header.
Without JWT support in Varnish, you would not be able to cache this page: every request would have to go to the backend to check the login state.
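The decision itself is small. The Python sketch below models it as a function that returns an action, loosely mimicking what vcl_recv decides; it is an illustration, not VCL. The /login URL, the "true" value of X-Login, and the action names are assumptions for the example.

```python
from typing import Mapping, Tuple


def route(method: str, path: str, headers: Mapping[str, str]) -> Tuple[str, str]:
    """Return (action, detail) for a request whose JWT was already validated."""
    # Only GET and HEAD requests are served from the cache.
    if method not in ("GET", "HEAD"):
        return ("pass", "")

    # The private page requires the login flag that was taken from the token.
    if path == "/private" and headers.get("X-Login") != "true":
        return ("redirect", "/login")  # hypothetical login URL

    # Everything else is looked up in the cache.
    return ("lookup", "")


print(route("GET", "/private", {"X-Login": "true"}))
print(route("GET", "/private", {"X-Login": "false"}))
```

The important point is that the redirect is decided at the edge: the backend is never contacted for visitors who are not logged in.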
Keep in mind that this piece of VCL ignores all other cookies and caches all GET and HEAD requests. If your application depends on other cookies, the caching will be a bit too aggressive. You can either return Cache-Control: no-cache, no-store headers in your application, or you can add extra VCL code to deal with it.
What should your backend do?
Your backend will still be responsible for the token generation. There are plenty of good libraries out there for different programming languages.
If you want to use JSON Web Tokens in your APIs, you can let your backend set a bearer token. For regular browser-based traffic, setting a cookie will do the job for you.
The backend should also return the right Cache-Control headers. If you want a page to be cached for an hour, your application should return the following Cache-Control header:
Cache-Control: public, s-maxage=3600
If a page cannot be cached, the following header should be returned:
Cache-Control: no-cache, no-store
A less obvious header to set is the Vary header. This header creates cache variations. This is important, because in our example the output of the “/private” page depends on the X-Login request header that contains our JWT login flag.
Without cache variations, only one version would be cached. This would either be the unauthorized version or the authorized version of that page. Either way, this is not acceptable. The following response header solves that:
Vary: X-Login
Varnish will interpret this Vary header and create 2 cached versions depending on the value of the request header. Because we set X-Login in Varnish, it will be available for cache variations.
To avoid too many cache variations, you should only return this Vary header for pages that need the variation. In our example there’s the “/private” page. But if your header, navigation, or other parts of your website refer to the login information, they should also contain a Vary response header.
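A toy model helps to see why the Vary header matters. The Python class below is a deliberately simplified sketch, not how Varnish stores objects internally: it keys each cached body on the URL plus the values of the request headers named in Vary. The class and method names are hypothetical.

```python
from typing import Dict, Mapping, Optional, Sequence, Tuple


class VaryCache:
    """A tiny cache that creates one variation per Vary header value."""

    def __init__(self) -> None:
        self._vary: Dict[str, Tuple[str, ...]] = {}
        self._objects: Dict[tuple, str] = {}

    def _key(self, url: str, request_headers: Mapping[str, str]) -> tuple:
        # URLs without a known Vary header get a single variation.
        names = self._vary.get(url, ())
        return (url,) + tuple(request_headers.get(name, "") for name in names)

    def store(self, url: str, request_headers: Mapping[str, str],
              vary: Sequence[str], body: str) -> None:
        self._vary[url] = tuple(vary)
        self._objects[self._key(url, request_headers)] = body

    def lookup(self, url: str, request_headers: Mapping[str, str]) -> Optional[str]:
        return self._objects.get(self._key(url, request_headers))


cache = VaryCache()
cache.store("/private", {"X-Login": "true"}, ["X-Login"], "members area")
cache.store("/private", {"X-Login": "false"}, ["X-Login"], "access denied")
print(cache.lookup("/private", {"X-Login": "true"}))
print(cache.lookup("/private", {"X-Login": "false"}))
```

Every extra header name in Vary multiplies the number of possible variations, which is why the header should only be returned where the output really depends on it.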
A full-blown example
I created a demo application that features cacheable code. It’s a showcase for the proper use of HTTP headers. In version 2 I also added JSON Web Tokens for authorization and session storage. I use client-side session data to prove that you can still cache pages that require session data or that require a login.
The demo application is available on GitHub. The master branch does not include JWT, but the version 2 branch does.
Acknowledgement
I was Googling for more information on JWT and Varnish. There was only one useful resource I could find: a slide deck by Andrew Betts. These slides were my main source of inspiration. Thanks Andrew!
Read my Varnish book
I wrote a book about Varnish in which I not only explain how Varnish works, but also how developers can improve the hit rate of their application by adopting HTTP best practices.
The book is called Getting Started with Varnish Cache and is published by O’Reilly.