In the second part of this series we discussed the SSL/TLS key exchange in terms of the Diffie-Hellman algorithm. We ended by recognising the need for a third party who could verify the authenticity of the server, and the name Certificate Authority was proposed. Key takeaways from the last two parts of the blog series:
- TLS encrypts client-server communication and protects it against man-in-the-middle attackers.
- The difference between encoding, hashing and encryption.
- TLS uses symmetric-key encryption to encrypt data and public-key cryptography to exchange the symmetric key.
- The key exchange itself can be spoofed by an attacker. Hence we need a trustworthy authority to verify the authenticity of the server.
Why do we need a Certificate Authority?
Imagine that a client browser is trying to communicate with a web server and wants to initiate a TLS channel. From the last point above, in order to verify the identity of the server, the client browser must have the server’s public key. However, it’s impossible to store the public keys of all the websites we visit in the browser. Since new websites are born every minute, the browser would need an update every minute to keep up.
The solution is the Certificate Authority. When we install a browser or OS, it ships with a set of trusted Certificate Authorities. One such example is DigiCert. When I say DigiCert is shipped with the browser, it means that the browser has DigiCert’s public key. Websites can request a signed certificate from DigiCert. DigiCert then puts a cryptographic signature on the server certificate using DigiCert’s private key. The server sends this certificate, with its public key embedded, when we initiate a connection. Since the browser has DigiCert’s public key, it can verify DigiCert’s signature on the server certificate. And so the server’s public key written on the certificate is also trusted.
Don’t worry if you didn’t completely understand the concept. Let’s break the process into pieces and analyse them one by one.
What is a Digital Signature?
To understand the concept of a Certificate Authority, let’s go back a couple of decades and consider the analogy of the traditional postal system. Imagine that Alice owns a company and Bob is an employee of the company. Alice wants to send Bob a confidential letter. Alice, as the CEO, drafts it and drops it into the post box. It travels through several post offices and postmen and finally reaches Bob, who can now open and read it. But how can Bob be sure that the letter actually came from Alice? There are two ways this can go wrong:
- An attacker, say Eve, can draft a letter with bogus contents, set the From address to look like Alice’s office and send it to Bob.
- Eve can be a middleman, for example an employee of an intermediate post office, and open the letter before it reaches Bob. Eve can even rewrite the contents, seal the letter again and send it on to Bob.
In both cases, Bob has no way to tell whether the letter he received is legitimate. What would we do in such cases? Yes, a signature. Alice can apply her seal and signature when posting the letter to Bob. Alice’s company seal can then be used to verify the authenticity and integrity of the letter. Since Alice’s company is a recognised entity, we can trust the letter if it carries their signature. This is exactly what a Certificate Authority does.
Technical aspects of a Certificate Authority
We know that public-key cryptography is used to exchange the session key in the TLS protocol. This step can be called the authentication process. To carry it out, the server needs to send its public key to the client. But an attacker in the middle can grab this public key and replace it with his own. That is dangerous, because the client will never know that the public key was tampered with in transit. The client would unknowingly encrypt the symmetric key with the attacker’s public key and forward it. Since the attacker holds the corresponding private key, he can decrypt it and steal the data.
For the client to be able to trust the public key it receives, the concept of a CA was introduced. It works as follows. Imagine that the server https://example.com needs a TLS certificate.
- The server example.com requests a TLS certificate from a CA, for example DigiCert.
- DigiCert creates a certificate for example.com. The certificate contains the necessary data such as the server name, the public key of the server, etc.
- DigiCert creates a hash of the certificate data and signs it (conceptually, encrypts the hash) with its own private key.
- Browsers and operating systems come shipped with the public keys of authorities such as DigiCert.
- When the browser receives the signed certificate, it uses the CA’s public key to recover the hash from the signature. It also computes the hash of the certificate data itself, using the hashing algorithm specified in the certificate. If both hashes match, signature verification succeeds and the certificate is trusted.
- The browser can now continue the authentication process with the public key of example.com specified in the certificate.
Here, we can call DigiCert a Root CA.
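The signing and verification steps above can be sketched in a few lines of Python. This is only a toy model under loud assumptions: the key is a textbook RSA key built from two small known primes, the certificate is a plain dictionary, the hash is reduced modulo the toy modulus, and there is no padding. The names `ca_sign` and `browser_verify` are made up for illustration; real CAs and browsers work on X.509 structures with proper padding schemes.

```python
import hashlib
import json

# Toy RSA key for the CA (assumption: two small Mersenne primes, never
# acceptable in practice). Requires Python 3.8+ for pow(x, -1, m).
CA_E = 65537
CA_P, CA_Q = 2**61 - 1, 2**89 - 1
CA_N = CA_P * CA_Q                                   # public modulus
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))        # private exponent


def cert_digest(cert_data: dict) -> int:
    """Hash the certificate data and reduce it to fit the toy modulus."""
    encoded = json.dumps(cert_data, sort_keys=True).encode()
    digest = hashlib.sha256(encoded).digest()
    return int.from_bytes(digest, "big") % CA_N


def ca_sign(cert_data: dict) -> int:
    """CA side: sign the digest with the CA's private exponent."""
    return pow(cert_digest(cert_data), CA_D, CA_N)


def browser_verify(cert_data: dict, signature: int) -> bool:
    """Browser side: recover the digest with the CA's public key and
    compare it with a freshly computed digest of the certificate data."""
    return pow(signature, CA_E, CA_N) == cert_digest(cert_data)


# Hypothetical certificate contents, for illustration only.
certificate = {"server_name": "example.com", "public_key": "SERVER-PUBLIC-KEY"}
signature = ca_sign(certificate)
print(browser_verify(certificate, signature))
```

The browser only ever needs `CA_N` and `CA_E`, which is what "shipping the public key of the CA" amounts to in this model.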
What happens if the attacker tampers with the certificate?
When the certificate is received, the browser verifies data such as the server name, the validity period of the certificate, the signature, etc. Imagine that the attacker presents his own certificate, issued for his own domain, instead of example.com’s certificate. Then the server name validation will fail and the browser will refuse the connection with a certificate error.
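For a feel of these checks outside the browser, here is a minimal sketch using Python's standard `ssl` module, whose default client context performs the same kind of validation (server name, validity period, signature chain). The helper `fetch_peer_subject` is a hypothetical name chosen for this example and it needs network access, so it is defined but not called here.

```python
import socket
import ssl

# The default client context already enables the checks described above.
context = ssl.create_default_context()
hostname_checked = context.check_hostname               # server name check
chain_required = context.verify_mode == ssl.CERT_REQUIRED  # signature chain


def fetch_peer_subject(hostname: str, port: int = 443):
    """Open a verified TLS connection and return the certificate subject.

    Raises ssl.SSLCertVerificationError if the server name, validity
    period or signature chain does not check out.
    """
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["subject"]


print(hostname_checked, chain_required)
```

Calling `fetch_peer_subject("example.com")` on a machine with internet access would return the subject of the certificate only if all the checks pass.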
In another case, what happens if the attacker keeps all the other data as it is and replaces just the public key with his own? When the browser recomputes the hash from the certificate data, it gets a different hash value since the data was tampered with. So the hash computed from the data and the hash recovered from the signature will not match.
To bypass this mechanism, the attacker would need to make the signature match the data. To do that he needs the private key of DigiCert (who originally issued and signed the certificate for example.com). The attacker fails at this point, since the only signature he can create is one made with his own private key. Our browser will not trust it: the browser’s certificate store does not have the attacker’s public key, so it shows a certificate error when such an attack occurs, such as the one shown below.
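Both tampering attempts can be replayed with the same kind of toy model. Everything here is an assumption made for illustration: textbook RSA without padding, keys built from small known primes, certificates as plain strings, and made-up names such as `verify_with_ca_key`. The point is only to show why neither a swapped key nor a re-signed certificate passes verification against the CA's public key.

```python
import hashlib

E = 65537  # public exponent used by both toy keys


def make_key(p: int, q: int):
    """Build a toy RSA key (modulus, private exponent). Python 3.8+."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))


CA_N, CA_D = make_key(2**61 - 1, 2**89 - 1)        # stands in for the CA
EVE_N, EVE_D = make_key(2**107 - 1, 2**127 - 1)    # the attacker's own key


def digest(cert: str, modulus: int) -> int:
    """SHA-256 of the certificate text, reduced to fit the toy modulus."""
    raw = hashlib.sha256(cert.encode()).digest()
    return int.from_bytes(raw, "big") % modulus


def verify_with_ca_key(cert: str, signature: int) -> bool:
    """What the browser does: it only trusts the CA's public key."""
    return pow(signature, E, CA_N) == digest(cert, CA_N)


genuine = "name=example.com;key=SERVER_PUBLIC_KEY"
ca_signature = pow(digest(genuine, CA_N), CA_D, CA_N)
genuine_ok = verify_with_ca_key(genuine, ca_signature)

# Case 1: the attacker swaps the public key but keeps the CA's signature.
tampered = "name=example.com;key=ATTACKER_PUBLIC_KEY"
swapped_key_ok = verify_with_ca_key(tampered, ca_signature)

# Case 2: the attacker re-signs the tampered data with his own private key.
eve_signature = pow(digest(tampered, EVE_N), EVE_D, EVE_N)
resigned_ok = verify_with_ca_key(tampered, eve_signature)

print(genuine_ok, swapped_key_ok, resigned_ok)
```

Only the untouched certificate verifies; the attacker would need `CA_D`, the CA's private exponent, to produce a signature that does.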
You have probably noticed this while setting up an intercepting proxy for your browser. The privacy error appears because the proxy tool acts as a man in the middle and presents its own certificate to the browser. If you trust the certificate, you can either click proceed to accept it, or download the proxy tool’s certificate and add it to the trusted authorities list in your browser. That way, you can see the otherwise encrypted data in plain text inside the proxy tool.
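The same "add the proxy to the trusted authorities" step exists for scripts. As a hedged sketch with Python's standard `ssl` module: the function name `context_trusting_proxy` and the file name in the usage note are assumptions for this example, and the PEM file would be whatever CA certificate your proxy tool lets you export.

```python
import ssl


def context_trusting_proxy(proxy_ca_file: str) -> ssl.SSLContext:
    """Return a client context that also trusts the proxy's CA certificate."""
    context = ssl.create_default_context()               # system trust store
    context.load_verify_locations(cafile=proxy_ca_file)  # plus the proxy CA
    return context
```

A call such as `context_trusting_proxy("proxy-ca.pem")` (hypothetical file name) gives a context that accepts certificates signed by the proxy while keeping all other checks in place, which is safer than switching verification off.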
Chain of Trust
We know that the Certificate Authority creates and signs certificates for servers. Only a small number of organisations do this job, for example DigiCert, GeoTrust and Comodo. If they signed the certificates of all servers directly, the same root private key would have to be used for every signature. If it were stolen, all trust would be lost. To solve this and add another layer of protection, the concept of an intermediate CA was introduced.
The idea is simple. Charles is a trusted person who signs the mails from Alice. Bob trusts a mail if he sees Charles’s signature on it. Now, Smith is another person who is trusted by Charles. If Smith signs a mail from Alice on behalf of Charles, Bob would not normally trust it. But here comes the idea of a chain of trust: Bob trusts Charles and Charles trusts Smith, hence Bob can trust Smith. Similarly, an intermediate CA is a Certificate Authority trusted by the Root CA. The certificate for example.com is issued by the intermediate CA. The intermediate CA in turn has a certificate signed by the Root CA. Only the Root CA details are stored in the browser’s certificate store.
So during certificate validation, the browser trusts the DigiCert Root CA. The DigiCert Root CA trusts the intermediate CA, and hence the browser can trust the intermediate CA. In the image below you can see the hierarchy: DigiCert SHA2 High Assurance Server CA is the intermediate Certificate Authority and DigiCert High Assurance EV Root CA is the Root CA.
Another advantage of this hierarchy is that the Root CA need not always be online.
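The chain of trust can also be walked in a toy model. As before, this is a sketch under stated assumptions, not how real X.509 path validation is implemented: textbook RSA without padding, keys derived from small known primes, certificates as dictionaries, a placeholder integer as the server's public key, and invented helper names (`issue`, `signed_by`, `chain_is_trusted`).

```python
import hashlib

E = 65537  # public exponent shared by both toy keys


def make_key(p: int, q: int):
    """Return (modulus, private exponent) for a toy RSA key. Python 3.8+."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))


def digest(subject: str, public_key: int, modulus: int) -> int:
    data = f"{subject}|{public_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


def issue(subject: str, public_key: int, issuer_n: int, issuer_d: int) -> dict:
    """Issuer signs (subject, public key) with its private exponent."""
    signature = pow(digest(subject, public_key, issuer_n), issuer_d, issuer_n)
    return {"subject": subject, "public_key": public_key, "signature": signature}


def signed_by(cert: dict, issuer_n: int) -> bool:
    """Check a certificate's signature against an issuer's public modulus."""
    expected = digest(cert["subject"], cert["public_key"], issuer_n)
    return pow(cert["signature"], E, issuer_n) == expected


def chain_is_trusted(server_cert: dict, intermediate_cert: dict,
                     trusted_root_n: int) -> bool:
    # Step 1: the root (already in the trust store) vouches for the intermediate.
    if not signed_by(intermediate_cert, trusted_root_n):
        return False
    # Step 2: the now-trusted intermediate key vouches for the server.
    return signed_by(server_cert, intermediate_cert["public_key"])


ROOT_N, ROOT_D = make_key(2**61 - 1, 2**89 - 1)       # kept offline in practice
INTER_N, INTER_D = make_key(2**107 - 1, 2**127 - 1)

intermediate_cert = issue("Intermediate CA", INTER_N, ROOT_N, ROOT_D)
server_cert = issue("example.com", 1234567, INTER_N, INTER_D)  # placeholder key

print(chain_is_trusted(server_cert, intermediate_cert, ROOT_N))
```

Note that the root's private exponent `ROOT_D` is used exactly once, to sign the intermediate; day-to-day issuing only needs `INTER_D`, which is why the root key can stay offline.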
Mathematical algorithm of digital signature
We discussed the Diffie-Hellman algorithm while studying the key exchange process. Similarly, there are many algorithms available for digital signatures as well. The one in use is specified in the server certificate. See example.com’s certificate below.
I am not going into the core math since it is boring and, moreover, I am weak in it. The certificate shows SHA-256 with RSA encryption. RSA is a popular signing algorithm, which we will discuss here. Just like any other asymmetric algorithm, RSA has public-private key pairs. The difference here is that signing (think of it as encryption) is done with the private key of the intermediate CA, and signature verification (think of it as decryption) is done by the browser with the corresponding public key. Keep in mind that this is only a mental model: strictly speaking, RSA signing is not RSA decryption. If you are interested in making a practical RSA signature, refer here.
The certificate is hashed before it is signed with RSA, and there is a significant reason for that. If you understand the algorithm in depth, you will know that RSA cannot process data longer than its key length. Suppose we use a 2048-bit key; then the data to be signed must not exceed 2048 bits, i.e. 256 bytes (and slightly less in practice, because padding takes up part of that space). This isn’t always feasible, since the certificate contains a lot of information. So before signing, a hash function is applied to the certificate, which produces a fixed-length digest. In the case of example.com, the SHA-256 hashing algorithm is used for this, giving a 32-byte digest. You can research more on this limitation of RSA if you are interested.
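The size argument is easy to check for yourself. The sketch below uses Python's `hashlib` for the digest lengths and, as an assumption for illustration, the well-known textbook RSA key (n = 3233, e = 17, d = 2753) to show what goes wrong when the input is larger than the modulus. The helper name `round_trip` is made up for this example.

```python
import hashlib

KEY_BITS = 2048
max_input_bytes = KEY_BITS // 8          # 256 bytes, before any padding

short_digest = hashlib.sha256(b"tiny certificate").digest()
long_digest = hashlib.sha256(b"x" * 10_000).digest()   # far above 256 bytes

# SHA-256 always yields 32 bytes, no matter how large the input is.
print(max_input_bytes, len(short_digest), len(long_digest))  # 256 32 32

# Toy textbook RSA key, only to illustrate the "input < modulus" rule.
N, E, D = 3233, 17, 2753


def round_trip(message: int) -> int:
    """Apply the public then the private exponent (no padding)."""
    return pow(pow(message, E, N), D, N)


print(round_trip(65))     # 65: fits below the modulus and survives
print(round_trip(5000))   # 1767: larger than N, so only 5000 mod N comes back
```

A 32-byte digest fits comfortably inside a 256-byte RSA block, whereas a full certificate generally would not.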
How does the browser actually verify the validity of a given server certificate?
We know that the server’s certificate carries a signature from the intermediate Certificate Authority. So, when communicating with the browser, the server shares two certificates. The first is the actual server certificate, which contains the public key of the server. The second is the certificate of the intermediate CA, which is issued by the Root CA. Here is a pictorial representation of the verification chain.
During signature verification, the browser first verifies the digital signature on the intermediate certificate using the public key of the Root CA, which is already stored in the browser. If that succeeds, the browser can trust the intermediate certificate and its public key. Using this public key, the browser then verifies the signature on the original server certificate. An organization can also operate an intermediate CA to sign certificates for its own domains. One such example is Google.
Google Internet Authority G3 is an intermediate CA trusted by GlobalSign Root CA – R2. This means Google can issue certificates for its domains with this intermediate CA. The browser will trust them since browsers are shipped with the GlobalSign Root CA. Note that Google is authorised to sign for its own domains alone. This prevents Google from signing certs for Microsoft.
So far we have discussed the theoretical working of Certificate Authorities and the TLS protocol. In the next part of the series we will actually inspect an entire TLS communication.