Introduction to JSON Web Tokens
In my previous article, I showed how to create a login page using AWS Cognito. At the end of that article, we landed on our desired web page, but with an access token appended to the URL. This article follows on from that stage, looking at the structure of the URL, and the JSON Web Token (JWT) contained within it.
Introduction to JWT
I’m going to start by using an online JWT debugger to examine the JWT we got at the end of the previous article. In case you want to follow along, I am including the full JWT so you can copy and paste it into the debugger. Apologies for the inclusion of such a dense block of text.
Here is the full URL, from the previous article:
https://www.google.co.uk/#id_token=eyJraWQiOiJjT0M3dDJEcWhHUTZuVzBDNlBMVUptRmpiS
nhLZGZ2VFlEYU50clhLVnZ3PSIsImFsZyI6IlJTMjU2In0.eyJhdF9oYXNoIjoiQzFVYkUwOGdPOGszR
HRSZXBZeldMUSIsInN1YiI6ImYyODNiYWVkLWU2ZTgtNDcyMy1hYzBmLTY5NDQzZjhjZjA4YyIsImF1Z
CI6IjRlcDNlYzNlYXQ4anEwcWViMWJmMmZ0dDd2IiwiZXZlbnRfaWQiOiJjMGI5MTM1OS02Njg4LTExZ
TgtYmUzOS1lZDAxMjEzN2Y0ODciLCJ0b2tlbl91c2UiOiJpZCIsImF1dGhfdGltZSI6MTUyNzk1OTgxN
ywiaXNzIjoiaHR0cHM6XC9cL2NvZ25pdG8taWRwLnVzLWVhc3QtMS5hbWF6b25hd3MuY29tXC91cy1lY
XN0LTFfTXNIU2NOaWpCIiwiY29nbml0bzp1c2VybmFtZSI6ImlhbiIsImV4cCI6MTUyNzk2MzQxNywia
WF0IjoxNTI3OTU5ODE3LCJlbWFpbCI6ImZvb0BleGFtcGxlLmNvbSJ9.WuIhpyM-AHiW8tonic6T7WMY
EqxpTswk_8pR9aFguNMSIK9o_SxVSCuSLwrHweXEFJ1XquKxmowIisbANH6ut8Ept9afY06oW2x44pYq
-wAo56a5O_tSO9qKtW2oZLJtWAD_VDKhTeal8NVXvJMgN5AdrjiBCeomOL2JaxFDYRw7_2-n8XHIiq3i
2QHUI91Dub4r0onZ6yJ3YGchHwNadIv3kBtoL0I6riAQHZjrIFA4oND_B_Z3WydynRBIX40WT9aTSeBT
1gXSAn6PoNYjuWquBJb4lMKaeZ6IYE5DRQXxPd6G47_SPdrwLJ4JePGw_C7ZDAio6XhbxbNVkJrdWw&a
ccess_token=eyJraWQiOiJvYXNNTVZ1NXIxWWJOTUcrc0kwXC9MZ1NUVEcyODNXWU8wdlNRamw2Z01W
cz0iLCJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJmMjgzYmFlZC1lNmU4LTQ3MjMtYWMwZi02OTQ0M2Y4Y2Y
wOGMiLCJldmVudF9pZCI6ImMwYjkxMzU5LTY2ODgtMTFlOC1iZTM5LWVkMDEyMTM3ZjQ4NyIsInRva2V
uX3VzZSI6ImFjY2VzcyIsInNjb3BlIjoib3BlbmlkIiwiYXV0aF90aW1lIjoxNTI3OTU5ODE3LCJpc3M
iOiJodHRwczpcL1wvY29nbml0by1pZHAudXMtZWFzdC0xLmFtYXpvbmF3cy5jb21cL3VzLWVhc3QtMV9
Nc0hTY05pakIiLCJleHAiOjE1Mjc5NjM0MTcsImlhdCI6MTUyNzk1OTgxNywidmVyc2lvbiI6MiwianR
pIjoiN2UwYWNhNjItMGQ3Yi00ODNhLWFjMjMtMmQ1OTdjYjEzNDlkIiwiY2xpZW50X2lkIjoiNGVwM2V
jM2VhdDhqcTBxZWIxYmYyZnR0N3YiLCJ1c2VybmFtZSI6ImlhbiJ9.AgPs7oWD6u132bX1wt1t9njbR0
PJyrrRrJn8Rthy6l5ORK6xThwW-WeisUp_U-WthhBoyHac9P6UaQFr10QR7-9oFaSPXjwfA3s8v5x7Z_
Yc1ntVaOwvctTXUObpCa4uTwYukphhDeEpSuC9QaB7nEiENyKQoOeSkD8OeDITt2B-laWnulnhwGpw4s
1RagIg_E6Y673rdaPGhixb9Z9mq-i4ZZ9BZa1XC6VT6giraFbRLTuVSVvIJPa3FWWlq_LuC59xkDx4xF
OeW2aD6W5UMSD1xAskdlrtz4vt5ECtxrQEriIsJ9SGipa4QxcTrWoH5BbfVS-Qd2mSUuGlF_Ou3w&exp
ires_in=3600&token_type=Bearer
Now we are going to take a token out of this full URL. This URL actually contains two JSON Web Tokens (or JWTs): an id token (starting from id_token=), and an access token (starting from access_token=). I am interested in the access token, which, if I remove it from the full URL, looks like this:
eyJraWQiOiJvYXNNTVZ1NXIxWWJOTUcrc0kwXC9MZ1NUVEcyODNXWU8wdlNRamw2Z01Wcz0iLCJhbGci
OiJSUzI1NiJ9.eyJzdWIiOiJmMjgzYmFlZC1lNmU4LTQ3MjMtYWMwZi02OTQ0M2Y4Y2YwOGMiLCJldmV
udF9pZCI6ImMwYjkxMzU5LTY2ODgtMTFlOC1iZTM5LWVkMDEyMTM3ZjQ4NyIsInRva2VuX3VzZSI6ImF
jY2VzcyIsInNjb3BlIjoib3BlbmlkIiwiYXV0aF90aW1lIjoxNTI3OTU5ODE3LCJpc3MiOiJodHRwczp
cL1wvY29nbml0by1pZHAudXMtZWFzdC0xLmFtYXpvbmF3cy5jb21cL3VzLWVhc3QtMV9Nc0hTY05pakI
iLCJleHAiOjE1Mjc5NjM0MTcsImlhdCI6MTUyNzk1OTgxNywidmVyc2lvbiI6MiwianRpIjoiN2UwYWN
hNjItMGQ3Yi00ODNhLWFjMjMtMmQ1OTdjYjEzNDlkIiwiY2xpZW50X2lkIjoiNGVwM2VjM2VhdDhqcTB
xZWIxYmYyZnR0N3YiLCJ1c2VybmFtZSI6ImlhbiJ9.AgPs7oWD6u132bX1wt1t9njbR0PJyrrRrJn8Rt
hy6l5ORK6xThwW-WeisUp_U-WthhBoyHac9P6UaQFr10QR7-9oFaSPXjwfA3s8v5x7Z_Yc1ntVaOwvct
TXUObpCa4uTwYukphhDeEpSuC9QaB7nEiENyKQoOeSkD8OeDITt2B-laWnulnhwGpw4s1RagIg_E6Y67
3rdaPGhixb9Z9mq-i4ZZ9BZa1XC6VT6giraFbRLTuVSVvIJPa3FWWlq_LuC59xkDx4xFOeW2aD6W5UMS
D1xAskdlrtz4vt5ECtxrQEriIsJ9SGipa4QxcTrWoH5BbfVS-Qd2mSUuGlF_Ou3w
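If you would rather not cut the token out by hand, the extraction can be sketched in a few lines of Python. This is only an illustration: the function name tokens_from_redirect is my own, and the URL uses short placeholder tokens (aaa.bbb.ccc and xxx.yyy.zzz) in place of the real ones above. It relies on the fact that everything after the # is formatted like a query string.

```python
from urllib.parse import urlsplit, parse_qs


def tokens_from_redirect(url: str) -> dict:
    """Return the parameters carried in the URL fragment (the part after '#')."""
    fragment = urlsplit(url).fragment
    # The fragment uses the same name=value&name=value layout as a query string.
    params = parse_qs(fragment)
    # parse_qs returns a list per name; each name appears once here.
    return {name: values[0] for name, values in params.items()}


# Placeholder tokens, standing in for the long values shown above.
url = (
    "https://www.google.co.uk/#id_token=aaa.bbb.ccc"
    "&access_token=xxx.yyy.zzz&expires_in=3600&token_type=Bearer"
)
tokens = tokens_from_redirect(url)
print(tokens["access_token"])
```

The same dictionary also holds the id token, the expiry (in seconds) and the token type.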
A JWT is made up of three parts: the header, the payload, and the signature. Each of these is a base64url-encoded string (a URL-safe variant of base64), and the three parts are separated by dots. You can see this structure in the above example.
The header contains information about the JWT, such as the signing algorithm used and which key to use. The payload is the information we are actually interested in (who the user is, what they can do, etc). The signature is the part which allows us to check that the message hasn’t been tampered with in transit.
To see this in action, we can use a JWT tool, such as the jwt.io debugger. If we paste the above JWT into the debugger, we get a page like this:
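To make the three-part structure concrete, here is a minimal Python sketch that splits a token on the dots and decodes the first two parts. The helper names are my own. The header segment is copied from the access token above; the payload and signature segments are short placeholders (e30 is the encoding of an empty JSON object), so this shows the mechanics only. Note that it does not verify anything.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url string, restoring the padding that JWTs strip off."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def decode_unverified(token: str):
    """Split a JWT into its three parts and decode the header and payload.

    The signature is returned as raw bytes and is NOT checked here.
    """
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    return header, payload, b64url_decode(signature_b64)


# Header segment of the access token above.
header_b64 = (
    "eyJraWQiOiJvYXNNTVZ1NXIxWWJOTUcrc0kwXC9MZ1NUVEcyODNXWU8wdlNRamw2Z01Wcz0i"
    "LCJhbGciOiJSUzI1NiJ9"
)
# Placeholder payload ("e30" is {}) and placeholder signature.
token = header_b64 + ".e30.c2ln"

header, payload, signature = decode_unverified(token)
print(header["alg"])  # RS256
print(header["kid"])
```

Because the parts are only encoded, not encrypted, anyone holding the token can read the header and payload this way.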
On the left hand side, you can see the JWT we pasted in, which has been colour-coded to indicate the three parts of it. On the right hand side, you can see each of the three elements decoded into their own boxes. At the bottom, you can see a big red piece of text saying Invalid Signature. We will come back to this later in the article.
Let’s take a look at the header:
Here you can see the signing algorithm (RS256, which is an RSA signature over a SHA-256 hash) and also the kid, which is the key identifier. The kid allows us to choose a specific key, when there are several available. We will see this later, once we get into writing some code.
Now onto the payload itself:
If you refer back to the previous article, you should be able to understand where many of these fields come from. The iss is the URL for the identity provider (i.e. the base URL we used for the login). The client id is the id for our application, which was created when we created the application in the user pool. The username is the user I set up in the user pool, then logged in with. The token use tells us this is the access token. Then there are various timestamps, and some other information. We will use these elements later, as part of our validation process.
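As a preview of the kind of checks these fields allow, here is a rough Python sketch. The function name claim_problems is my own invention, and the claim values are what I get when decoding the payload of the access token above, so treat them as an assumption if you are following along with a different token. These checks only make sense once the signature has been verified.

```python
import time

# Values taken from the example token in this article (assumed, see above).
EXPECTED_ISSUER = "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_MsHScNijB"
EXPECTED_CLIENT_ID = "4ep3ec3eat8jq0qeb1bf2ftt7v"


def claim_problems(claims: dict, now: float = None) -> list:
    """Return a list of problems found in the claims (empty means none found)."""
    now = time.time() if now is None else now
    problems = []
    if claims.get("iss") != EXPECTED_ISSUER:
        problems.append("unexpected issuer")
    if claims.get("client_id") != EXPECTED_CLIENT_ID:
        problems.append("unexpected client id")
    if claims.get("token_use") != "access":
        problems.append("not an access token")
    if claims.get("exp", 0) <= now:
        problems.append("token has expired")
    return problems


claims = {
    "iss": EXPECTED_ISSUER,
    "client_id": EXPECTED_CLIENT_ID,
    "token_use": "access",
    "username": "ian",
    "iat": 1527959817,
    "exp": 1527963417,
}

# A moment inside the token's one-hour lifetime: no problems reported.
print(claim_problems(claims, now=1527960000))
# Checked against the current time, the 2018 token has long since expired.
print(claim_problems(claims))
```

The gap between iat and exp is 3600 seconds, which matches the expires_in=3600 parameter in the URL.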
Finally, let’s look at the signature:
This is not actually useful for us, since our validation has failed, but it shows us how the signature works. The issuer takes the base64url-encoded header and payload, concatenates them (with a dot separator), hashes the result with SHA-256, and signs that hash with its RSA private key. To verify, we use the matching public key to check that the signature from the JWT corresponds to the header and payload we received. If it does, we know that the JWT hasn’t been tampered with.
As we saw earlier, we can’t validate this message (because we don’t have the public key), so we get a warning:
If you paste in the appropriate public key, you will get a success message:
I won’t discuss how to do this here, but if you understand the following article, you should be able to work out what you need to paste into the signature box to get this verification message.
But before we go any further, it’s worth giving a brief description of how signatures work. If you already know this, feel free to skip this section.
Public Key Signatures
The signature for a JWT is similar in intent to signing a letter: it lets the person reading it confirm the identity of the sender. It also has an additional purpose in a JWT, which is to confirm that nobody has tampered with the contents. You can see why this is important for a JWT. I want to confirm that the user logged in via one of my approved identity services, and I want to confirm that nobody has intercepted the token and modified it to give themselves permissions they shouldn’t have. To make this workable in practice, we use something called public/private key signing.
Digital signatures typically use some sort of hashing. They take the message, combine it with a key (think of it like a password), and then apply a mathematical operation. This gives a result (a number or a string), which is the signature. When the message is sent, both the message and the signature are transmitted. In JWT terms, the message is the header and payload, and the signature is the third part of the token. We can then apply the same mathematical operation to the message to generate the signature for ourselves. If the signature we generate matches the signature we received with the message, then we can have confidence that the message was sent by the person claiming to send it, and that it hasn’t been tampered with.
The only problem with this approach (as I’ve described it) is that we need the password to generate the signature. And obviously, the password is meant to be a secret. This is where the public/private key comes in. There are mathematical operations which can be applied using one key, then validated using a second key. This allows someone to keep their key secret (i.e. it is a private key) so they can continue using it to sign things, but they can publish the second key (the public key) so that anyone can verify their signature.
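The idea of signing with one key and checking with another can be shown with a toy example in Python. To be clear about the assumptions: this is textbook RSA with a deliberately tiny key built from two well-known Mersenne primes, with no padding scheme, so it is not RS256 and must never be used for real tokens. Real RS256 uses RSASSA-PKCS1-v1_5 padding and keys of 2048 bits or more. The sketch needs Python 3.8 or later for the modular inverse.

```python
import hashlib

# Toy key pair. P and Q are known primes; a real key is far larger.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # part of both keys
E = 65537                            # public exponent (published)
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (kept secret)


def digest_as_int(message: bytes) -> int:
    """Hash the message with SHA-256 and reduce it to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone with the public values (N, E) can check the signature."""
    return pow(signature, E, N) == digest_as_int(message)


signing_input = b"header.payload"    # stands in for the first two JWT parts
signature = sign(signing_input)

print(verify(signing_input, signature))        # True
print(verify(b"header.tampered", signature))   # False
```

Notice that verify never touches D. That is the whole point: the public key is enough to check a signature, but not to create one.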
This is all good, but there is one final problem to solve — how do we know that we have the correct public key? If an adversary can give us a fake public key (that matches their own private key), then they can still send us messages that have been tampered with. So, to make this all work, we also need a way of being sure that the public keys are authentic.
So, to summarise, we have four parts we need to worry about:
- The mathematical algorithm used (e.g. RS256)
- The private key — only the originator of the token should know this
- The public key — this should be available to anyone who wants to verify the token
- Authenticity of the public key — we need a way of knowing we are getting a valid public key
In our example, these are handled as follows:
- The algorithm is specified in the JWT header
- The private key is somewhere within the internals of our AWS Cognito service
- The public key is available using an http request — we will see this in the next section
- We know that we have a valid public key, because we GET it from a sub-directory of our AWS Cognito identity provider
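The last two points can be sketched in Python without making any network calls. Cognito user pools publish their public keys as a JSON Web Key Set at /.well-known/jwks.json underneath the issuer URL, and the kid from the token header picks out the right key. The function names and the sample key set below are hypothetical (the kid values and the n field are made up for illustration); a real key set would be fetched over HTTPS from the URL that jwks_url returns.

```python
def jwks_url(issuer: str) -> str:
    """Build the key set URL from the 'iss' value found in the token."""
    return issuer.rstrip("/") + "/.well-known/jwks.json"


def find_key(jwks: dict, kid: str) -> dict:
    """Pick the key whose 'kid' matches the one named in the JWT header."""
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    raise KeyError("no key with kid " + repr(kid))


issuer = "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_MsHScNijB"
print(jwks_url(issuer))

# Hypothetical key set with made-up values, shaped like a real JWKS document.
sample_jwks = {
    "keys": [
        {"kid": "key-for-id-tokens", "kty": "RSA", "alg": "RS256", "use": "sig",
         "n": "placeholder-modulus", "e": "AQAB"},
        {"kid": "key-for-access-tokens", "kty": "RSA", "alg": "RS256", "use": "sig",
         "n": "placeholder-modulus", "e": "AQAB"},
    ]
}
key = find_key(sample_jwks, "key-for-access-tokens")
print(key["alg"])
```

Because the URL is derived from an issuer we already trust, and is served over HTTPS, we have reasonable grounds to believe the keys are authentic.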
Summary
That concludes a very brief overview of JWTs. The next article will get back to practical concerns, and will look at how we can use JavaScript to parse the JWT.