In the last part of the blog series we have discussed about the basic concepts of cryptography. It includes Hashing, Symmetric and Asymmetric encryption and so on. I haven’t spoken anything about SSL or TLS except their history. I hope we are done with the foundation. So let’s dig up the real stuff. In this blog post we will discuss about the key exchange algorithm in terms of Diffie-Hellman key exhange.
Understanding the type of encryptions in SSL
From the last part of the blog series, we know that both Symmetric and Asymmetric encryption are used in SSL. We will see which, where and why.
Imagine that you are browsing Facebook. Facebook by default reroutes all your traffic via https. Since you are using a TLS(I will use TLS instead of SSL from now on most places, because it is the standard now) connection, you would see a Green box on the URL bar confirming that the connection is secure. During a single session, you will be doing multiple activities such as commenting, chating, navigating between pages, scrolling news feed,etc. Each time you do this, multiple requests and responses are shared between the client and server. All of this communication has to go through https in order to ensure data security. That means the server and client browser are encrypting and decrypting the data packets 100’s of times for a single Facebook session.
We know that Public Key encryption is more secure than Symmetric key encryption since the Decryption key is never shared with anyone. But if we know, in Public Key Infrastructure (PKI) the overhead is very high. It uses more CPU and takes more time to encrypt and decrypt as compared to symmetric key cryptography. As a result, the browser(and the application server) start to eat your CPU resources. Moreover, the browser will take more time to serve the content since it has to undergo the hectic cryptographic steps each time.
What is the solution?
So, we use symmetric encryption to achieve this, which is more faster and less resource consuming. All good. But the client and server has to agree upon a single secret key before starting encryption, right? How would they do this? While sharing the unique secret key, an attacker sitting in between the client and server can capture it and Kaboom! All your data is gone. So, there has to be a workaround to share the secret key and there we make use of Public Key cryptography.
Agreeing upon a single secret key and sharing it between the client and server is called the Handshake, and it is the first step in TLS. There are multiple processes involved in a Handshake. The whole process together is called Public Key Infrastructure. Remember we used the term in the last part of the series? PKI includes the Certificate Authority (CA), Digital signing, etc. We will discuss the infrastructure in depth below.
Algorithm of key exchange
So, it is clear that asymmetric encryption is used to exchange the keys, but which algorithm? Many algorithms were proposed since the invention of Asymmetric cryptography. During the time of writing this post TLS 1.2 is the commonly used standard and RSA, Diffie-Hellman key exchange ,ECDH(Elliptic Curve Diffie-Hellman), SRP(Secure Remote Password), PSK(Pre Shared Key) are the key exchange algorithms supported by TLS 1.2.
It would probably a bad idea to discuss all of the algorithms here. Instead we will discuss the most common and easily understandable Diffie-Hellman key exchange algorithm.
Diffie-Hellman Key Exchange explained
I am not directly going to the math, because I’m weak in it. Instead, let’s try to understand the concept in terms of a color analogy. Imagine that Alice and Bob were doing some poster work. Their rival Mallory is also sitting next to the bench. Alice and Bob wants to agree upon a single color to design the poster. They can’t discuss it loudly because Mallory would hear it. Then how would they reach upon a common color? The solution is the simplest form of Diffie-Hellman key exchange algorithm. Let’s see how.
- First of all, Alice will choose a common color,say Yellow and tell Bob that she’s going to use Yellow for this session. Obviously, Mallory would hear it, never mind.
- Alice and Bob now choose their own secret colors and they will not share this between each other. So Mallory will never know the secret colors. For example, Alice choose the secret color Orange, and Bob choose Green.
- In this step, Alice will mix her secret color Orange and the common color Yellow to produce a new color. Sandal color it is right? (My color sense is not that good, pardon me.)
- Similarly Bob will also mix his secret color in Yellow to generate the new color Blue.
- Alice and Bob will share these new colors between them. Mallory can see the Sandal color and the Blue color but not their secret colors.
- Once the exchange is completed, Alice will mix her secret color (Orange) into the mixture sent by Bob. And Bob will mix his secret color (Green) to the mixture sent by Alice.
- Now both Alice and Bob reached a mixture of a common secret color. Please refer to the figure below. Mallory will be stuck with Sandal and Blue and it can never reach the common secret color since Mallory doesn’t know the Secret colors of Alice and Bob.
Here, the Common paint(Yellow) can be considered as the Public key of the server, which will be available to everyone. The common secret obtained at the end can be considered as the symmetric key which is used to encrypt the data in further sessions. This is not exactly true, but for the basic understanding let’s keep it as such. You would understand the precise logic if you dig deeper.
Math behind Diffie-Hellman key exchange
Let’s find out the basic mathematics behind the above explained algorithm. We need to have an idea of Modular Arithmetic to better understand the concept of Diffie-Hellman. Those who don’t want the math can skip this section, others please follow me.
You know that, when you add 7 and 8 you get 15. It is normal arithmetic. But this is not true in case of a 12 Hr clock. If the time is 7 o clock now, then 8 hours later, the time will be 3 o clock. So, we can say that a clock is the simplest example of modular arithmetic with arithmetic modulo 12. In this case, we know that 12:00 is similar to 00:00 and hence we can say 12 is congruent to 0 and vice versa.
A = b(mod p)
If we take the value of p as 12 and b as 21. Then,
21 (mod 12) = 9
Let’s convert this to Diffie-Hellman example. Keep the colour analogy in mind while reading the below stuff. Imagine that both Alice and Bob knows the values of g and p or Alice previously decided these values and send it over to Bob. In other words, these values are public. Now,
Alice (a is private) calculates her public value:
A = ga(mod p)
Bob (b is private) calculates his public value:
B =gb(mod p)
They exchange A and B (both public).
SA=ga×B =ga×gb(mod p) =gab(mod p)
SB=gb× A = gb× ga(mod p) = gab(mod p)
Observe that SA= SB=K. This is the session key which will be used to encrypt the session.
Chances of Mallory getting the secret key
In the whole process, note that the secret of Alice(a) and secret of Bob(b) are never shared between each other. So, Mallory will know only g,p, A and B. To get the value of K, Mallory first needs to compute a & b from A =ga(mod p) & B =gb(mod p). It is called Discrete Logarithm problem. For large values of p, this mechanism is found to be nearly 100% infeasible. In a practical TLS implementation, the length of p would be in the range of 1024 or 2048 bits. That is, a 2048 bit key will have a length between 22047and 22048. Hope you know that a 23 length key can have maximum value of 8. Imagine the complexity of a 2048 bit key then.
When such a key is used, the largest supercomputers in the world would take 100s of years to compute a & b. But, these values changes with each session. So, even if an attacker calculated this value, he cannot use them to impersonate the users in the following sessions. That is called Perfect Forward Secrecy.
Are we safe now?
The server and client browser have agreed upon a secret key which is securely shared via a strong key exchange algorithm. Everything looks fine. But wait, am I safe enough? Let’s imagine a scenario where we are trying to connect to facebook.com via https. Assume that an attacker is already sitting between your browser and Facebook server. Your browser will tell the Facebook server to initiate a TLS channel. But the attacker can setup a server of his own and reroute all the communication between you and Facebook.com via his server. So that, when the Facebook server sends its Public key, the attacker can substitute it with his public key and forward it to you. Don’t you still get it?
Get to the next step. You receive the public key thinking that it is actually coming from Facebook.com, your browser will encrypt your secret key with it and send it back to Facebook. Again, the attacker will grab it and guess what? He has the corresponding private key to decrypt the secret key and later encrypts it with the original value of Facebook.com’s public key(which he already have) and forward it back to Facebook.com. He will keep doing this encryption-decryption process and he can see everything that’s being shared between you and Facebook.com.
Damn, what now?
The answer to the problem is CA (Certificate Authority). In simple terms, Certificate Authority was specified by X.509 standard to ensure the data integrity. Data integrity ensures that the data in transit is not tampered by a third party entity. In other words, the CA act as a middle man between your browser and the server. It’s the CA’s job to ensure data integrity.
We will discuss about CAs in depth in the next blog post. If you find the article useful, please consider donating me. To do that, you can simply click on any of the ads shown in the page and browse through the ad for few seconds. Thanks.