I've spent quite some time talking about optimal TLS setup in the past. Let's use some of this Coronavirus lockdown time productively, and gain an overall view of the modern Transport Layer Security (TLS) landscape. This article will focus on TLS 1.2.
As more and more people started using the Internet, a need arose for securing communications happening over the TCP protocol. TCP is the building block used by many other higher-level protocols (such as HTTP) to provide end-users with common capabilities such as:
Allowing a browser to load and display content from a remote server
Allowing a mobile app to communicate with an API Backend
In some cases, the communication needs to be secured from prying eyes. For example, when you want to buy an item online. In this case, you will have to submit your payment details to the retailer and you will want to keep that communication confidential for obvious reasons.
In 1994, Netscape introduced the Secure Sockets Layer (SSL) protocol to address this. It was officially released in 1995 (version 2.0), and its final version (v3.0) was available at the end of 1996. This protocol lasted until 2015, when it was officially deprecated by the Internet Engineering Task Force (IETF) in RFC 7568. The reason for that can be found in section 4 of the RFC, which is aptly titled "SSLv3 Is Comprehensively Broken".
Another protocol had been around since 1999: Transport Layer Security (TLS), which was intended as an upgrade from SSLv3. This protocol evolved, with TLS v1.1 in 2006 and TLS 1.2 in August 2008. The latest available version, TLS v1.3, was defined in August 2018 and represents the current "state of the art" solution.
As of today, only TLS 1.2 and TLS 1.3 are recommended, whereas all other protocol versions were formally deprecated in 2018 by Apple, Google, Microsoft and Mozilla. Their end of support was planned for March 2020. However, due to the Coronavirus pandemic, Mozilla has temporarily re-enabled support for TLS 1.0 and 1.1 in Firefox 74 and 75 Beta, and Google similarly postponed releasing Chrome 81. This is because, sadly, many governmental sites have still not been upgraded to support TLS 1.2, and it was felt that dropping the older versions could have hindered people from finding useful information during the pandemic. What a time to be living in...
That's the end of my brief historic digression. Hopefully, it makes clear why the terms SSL and TLS are often used interchangeably. In a sense, TLS is a direct evolution of SSL. I will only talk about TLS going forward.
Transport Layer Security - How does it work?
Let's now go back to the original problems these protocols were set to solve in the first place. The scenario is the following: there is a client, and it has established a TCP connection to a server. Client and server now want to exchange data: for example, the web page your browser is trying to load, the data your mobile app is displaying back to you, the form with sensitive information you are about to submit, and so on. We want this data exchange to be secured.
To achieve security at the transport level, we need to address the following things:
Confidentiality: client and server want to ensure that the data exchanged is readable by them, and them only. In other words, a third party cannot eavesdrop on their conversation.
Authentication: either one or both parties want to ensure they know who is on the other side of the communication.
Integrity: either party wants the ability to detect data communication errors or data tampering attempts by malicious parties.
TLS is the protocol that describes how to achieve these objectives for a communication happening over a TCP connection: it uses several cryptographic algorithms and techniques in a concerted effort to deliver a secure communication channel between a client and a server.
Interestingly, there is also a separate protocol for UDP datagrams (called DTLS), but we will keep this one for another time.
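To make the integrity goal listed above a bit more concrete, here is a minimal sketch using a keyed hash (HMAC-SHA256) from the Python standard library. It is only an illustration of the general idea: the key, the messages and the helper names `protect` and `verify` are made up for this example, and modern TLS cipher suites such as AES-GCM build integrity into the encryption itself rather than using a separate HMAC like this.

```python
import hashlib
import hmac
import secrets

# Hypothetical shared key; in TLS it would be derived during the handshake.
shared_key = secrets.token_bytes(32)


def protect(message: bytes) -> bytes:
    """Return the HMAC-SHA256 tag of the message under the shared key."""
    return hmac.new(shared_key, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(protect(message), tag)


original = b"pay 10 GBP to Alice"
tag = protect(original)

intact = verify(original, tag)                      # message as sent
tampered = verify(b"pay 99 GBP to Mallory", tag)    # message altered in transit
```

Anyone without the shared key cannot produce a valid tag for a modified message, so the receiver can detect the change.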
We have all learnt in recent times that we must be very careful with handshakes: we are being told that we must thoroughly wash our hands at all times and avoid practising this social rite when meeting someone. Most of us have been already mandated to avoid meeting anyone altogether, or keeping our distances when we do!
Fortunately for us, a TLS handshake 🤝 does not involve any physical hand contact so we can safely dissect its inner workings without any fear.
In a nutshell, a TLS handshake works in this way:
Cipher Suite negotiation: the client and the server negotiate the protocol version and cipher suite to be used for secure communication.
Authentication: during the handshake, the client (and sometimes the server) validate that they are establishing a connection with the intended recipient of the messages.
Key Exchange: the client and the server derive a session key that will be used for the symmetric encryption of the data, once the TLS connection is established.
As a reminder, there are two types of approaches available when deploying encryption algorithms.
Asymmetric cryptography: based on a key pair (public & private), computationally expensive
Symmetric cryptography: based on a shared secret (the session key), faster to run encryption and decryption operations
TLS is a mixed-mode protocol, in other words, it uses both asymmetric and symmetric encryption techniques to achieve its goals.
Cipher Suite negotiation
Before we progress further, let's focus on the initial step and discuss what a Cipher Suite is.
In any form of communication, the parties must first agree on how to communicate. For example, this is implicit for two people who speak the same language.
If, however, one person speaks English and Italian, and the other speaks only Italian, then they both need to settle on using Italian, otherwise, the communication wouldn't go very far.
In a very similar way, the client and the server must agree on which cryptographic algorithms and techniques they will use at the very start of the "TLS conversation". In this context, a cipher suite is a manifest describing a collection of cryptographic functions and techniques.
Each peer in the communication can list the cipher suites that they support, and advertise them during the negotiation phase of the communication protocol, for both parties to pick a common communication method.
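The negotiation described above can be sketched in a few lines of Python. This is a simplified illustration rather than how any real TLS library implements it: the function name `negotiate`, the `prefer_server` flag and the two suite lists are hypothetical, and the suite names are written in OpenSSL notation.

```python
from typing import Optional, Sequence


def negotiate(client_suites: Sequence[str],
              server_suites: Sequence[str],
              prefer_server: bool = True) -> Optional[str]:
    """Return the first mutually supported suite, or None if there is none.

    The list of the preferred side is walked in order, so its ordering
    decides which common suite wins.
    """
    if prefer_server:
        preferred, other = server_suites, client_suites
    else:
        preferred, other = client_suites, server_suites
    allowed = set(other)
    for suite in preferred:
        if suite in allowed:
            return suite
    return None


# Hypothetical lists, each in order of preference.
client = ["ECDHE-ECDSA-AES128-GCM-SHA256",
          "ECDHE-RSA-AES128-GCM-SHA256",
          "AES128-SHA"]
server = ["ECDHE-RSA-AES256-GCM-SHA384",
          "ECDHE-RSA-AES128-GCM-SHA256",
          "ECDHE-ECDSA-AES128-GCM-SHA256"]

chosen_by_server = negotiate(client, server)
chosen_by_client = negotiate(client, server, prefer_server=False)
no_overlap = negotiate(["AES128-SHA"], ["ECDHE-RSA-AES256-GCM-SHA384"])
```

If the two lists have nothing in common, there is no way to continue, which is the equivalent of our two speakers not sharing any language.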
As the client and server have now agreed on how to communicate, they can begin the tasks required for the secure communication.
The first step is around authentication. Going back to our example, now that both people have agreed to talk in Italian, they need to be sure that they are talking to the correct person and not to someone completely random.
In everyday life, this is not a big problem if you are speaking to someone face to face, but it can be a problem if you are communicating over the phone, or online. If person A wants to disclose their trade secrets to person B and person B only, then they must be 100% sure that they are talking with person B before spilling any beans!
These authentication techniques ensure that one or both parties (in case of mutual TLS authentication configurations) can verify who is on the other side of the communication. This is typically achieved with certificates and digital signature verification.
The Key Exchange process is the central moment of establishing an encrypted communication. So far, the two persons have agreed to talk in the same language, and have verified that they are indeed whom they purport to be.
They now need a way to hide the content of their communication from others, preventing third parties from being able to snoop on the conversation.
Imagine being in the London Tube, and wanting to talk about your sex life in Italian with your friend sitting next to you. You might think that no one will understand you since we are in the United Kingdom and people here normally speak English. However, if another Italian Technical Architect happens to be nearby in the same carriage, they may end up being the unintended recipient of the conversation - based on a true story 🙉
Back to our scenario, the server and the client have already agreed on how they will communicate securely during the cipher negotiation phase. They have, for example, decided that they will encrypt all the data messages using AES-256 GCM (Advanced Encryption Standard Algorithm). Cool!
But, AES-256 GCM requires a secret key that is known to both parties. The client uses the shared key to encrypt the data. This is then sent on the network, received by the server who decrypts it using the same key, ultimately recovering the original message. The same happens when the server responds to the client.
How do we get the client and the server to agree on the same, shared key, without sending it over the public network where it can be stolen by others?
The Key Exchange algorithms are used to accomplish exactly that. The two main ones are the following, although TLS 1.3 only allows methods based on the second one.
RSA: Rivest-Shamir-Adleman Algorithm (from the names of the designers)
DH: Diffie-Hellman Algorithm
These algorithms allow the two parties to reach consensus around a secret session key to be used for the encrypted communication, without disclosing it to everyone else.
We will now focus on TLS 1.2, one of the two main versions of TLS currently in use. TLS 1.3 and the differences between the two are left for the next article.
Here is what a TLS 1.2 handshake looks like:
If you are struggling with understanding any part of my explanation, I strongly recommend looking at the following website which shows exactly what is going on in a very clear way
TLS 1.2 Cipher Suite Negotiation
In a TLS 1.2 handshake, the following algorithms (the cipher suite) must be agreed upon:
Key Exchange Algorithms: the method used to securely exchange an encryption key between client and server. The key will be used by the Data Encryption algorithm during actual communication.
Authentication / Digital Signature Algorithms: the mechanism used to authenticate the other party using asymmetric cryptography and certificates
Data Encryption Algorithms: the method used to encrypt and decrypt the data to be secured, using the exchanged key.
Data Integrity Algorithms: the method used to detect data errors or data tampering attempts for the encrypted messages exchanged, as well as for key derivation.
The client will advertise to the server the combination of algorithms it supports in its initial "ClientHello" message. The server can then pick its preferred algorithm from that list and notify the client of the choice in the "ServerHello" message (or adhere to the choice indicated by the client, if it supports it). Both parties are now aligned on how to perform the rest of the handshake and establish secure communication.
We can look at the recommended cipher suites from Mozilla and pick one as an example. Let's pick the first one from the intermediate compatibility list which is supported in TLS v1.2, ECDHE-ECDSA-AES128-GCM-SHA256:
ECDHE: Elliptic Curve Diffie-Hellman Ephemeral (Key Exchange)
ECDSA: Elliptic Curve Digital Signature algorithm (Authentication)
AES128-GCM: Advanced Encryption Standard in Galois Counter Mode, using a key of 128-bits (Data Encryption)
SHA256: Secure Hashing Algorithm, 256-bit digest (PRF / Data Integrity)
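As a small illustration of how a suite name maps onto the four roles listed earlier, here is a sketch that splits an OpenSSL-style name into its parts. I am assuming the suite in question is written ECDHE-ECDSA-AES128-GCM-SHA256 in that notation, with ECDHE as the key exchange part. The `CipherSuite` type and `parse_suite` helper are hypothetical, and the parsing only works for names of this exact shape (many older suite names leave some parts implicit).

```python
from typing import NamedTuple


class CipherSuite(NamedTuple):
    key_exchange: str
    authentication: str
    encryption: str
    digest: str


def parse_suite(name: str) -> CipherSuite:
    """Split a KX-AUTH-CIPHER...-DIGEST style name into its four roles."""
    parts = name.split("-")
    if len(parts) < 4:
        raise ValueError(f"unsupported suite name format: {name!r}")
    # First two parts are key exchange and authentication, the last one is
    # the digest, and everything in between describes the bulk cipher.
    return CipherSuite(parts[0], parts[1], "-".join(parts[2:-1]), parts[-1])


suite = parse_suite("ECDHE-ECDSA-AES128-GCM-SHA256")
```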
Now, explaining the specifics of the above algorithms is beyond the scope of this article. A typical exchange would look like:
In the above example, the server decides which cipher suite will be used based on the list sent by the client (the other way around is also possible). This is why it is really important to maintain up-to-date configurations and remove suites that are deprecated or not recommended, unless there is a very strong reason not to. One such reason would be a legacy client that must be supported, can only use a weak implementation of TLS, and cannot be upgraded.
Allowing the client to choose the cipher suite can also be a good option, provided that every suite the server accepts is secure: a client may have dedicated hardware that makes certain algorithms faster than others. For example, a processor with AES instruction sets (AES-NI) will be able to perform AES computations much faster than one that doesn't have them.
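For reference, this is roughly how the two behaviours could be configured on a server using Python's standard `ssl` module. Treat it as a sketch under a few assumptions: the helper name `make_server_context` and the cipher string "ECDHE+AESGCM" are my own choices for illustration, and a real server would also need to load its certificate and key with `load_cert_chain`. Note that `set_ciphers` only affects TLS 1.2 and earlier suites.

```python
import ssl


def make_server_context(honour_server_order: bool) -> ssl.SSLContext:
    """Build a server-side context restricted to TLS 1.2+ and ECDHE/AES-GCM."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # refuse TLS 1.0 and 1.1
    ctx.set_ciphers("ECDHE+AESGCM")                # illustrative cipher string
    if honour_server_order:
        # The server's ordering decides which common suite is used.
        ctx.options |= ssl.OP_CIPHER_SERVER_PREFERENCE
    else:
        # The client's ordering decides instead.
        ctx.options &= ~ssl.OP_CIPHER_SERVER_PREFERENCE
    return ctx


server_picks = make_server_context(True)
client_picks = make_server_context(False)
```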
TLS 1.2 Authentication
In TLS, authentication schemes have historically hinged on three types of digital signatures:
DSA: Digital Signature Algorithm. No longer allowed in TLS 1.3, and not available by default in TLS 1.2 - we will ignore this.
RSA: The RSA algorithm can be used for both key exchange and authentication.
ECDSA: Elliptic Curve Digital Signature Algorithm. This is a variant of DSA which uses elliptic curve cryptography.
After the server responds (ServerHello), it also sends a "Certificate" message. In this message, it provides its certificate to the client. The certificate contains the identity information (describing the owner of the certificate and other important aspects such as the domains the certificate is valid for), a public key, and a digital signature from a trusted Certificate Authority (CA), along with the algorithm used to generate it. The signature is created by the CA using the data in the certificate, the CA's private key and the chosen signature algorithm.
In other words, the CA has "observed" that the certificate belongs to a certain person or organisation, has been issued for a particular domain, and has enshrined these facts within the digital signature of the certificate itself.
The client can validate the digital signature by using that CA's public key and the digital signature algorithm. Other checks are also performed, such as the verification of the certificate chain of trust, as well as other aspects such as the certificate revocation status or the expiration status. If all goes well, then the client knows that the certificate contains valid information.
However, all it knows up until here is that the server has provided a valid certificate, but doesn't know if the server has the private key paired with the public key seen in the message. This would be what tells the client that the server is indeed the owner of the credentials and public key provided.
How this is achieved is different, depending on whether we are in the context of an RSA Key Exchange or a Diffie-Hellman (DH) Key Exchange. More on the key exchange later in this article, but for now let's focus on this:
In the RSA Key Exchange, the client will send a value (used to compute the session key) that is encrypted using the RSA public key received from the server. It follows that the server can successfully decrypt and use that value only if it truly possesses the matching private key. If it can't, then the whole handshake will fail.
In a DH Exchange, the server uses its private key to sign its own DH parameters together with the random values exchanged in the ClientHello and ServerHello messages. This signature is sent to the client, and the client can verify it like any other digital signature, using the public key that it received from the server.
Once the above steps are done, the client has authenticated the server successfully.
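On the client side, most of these checks are delegated to the TLS library. As a minimal sketch with Python's standard `ssl` module, the default client context already requires a valid certificate chain and a matching hostname. The `connect` helper below is hypothetical and is not called here, since it needs network access. As far as I am aware, revocation checking is not switched on by these defaults.

```python
import socket
import ssl

# Client-side defaults: trusted CA store, chain validation, hostname check.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # refuse TLS 1.0 and 1.1

checks = {
    "certificate_required": ctx.verify_mode == ssl.CERT_REQUIRED,
    "hostname_checked": ctx.check_hostname,
}


def connect(host: str, port: int = 443) -> str:
    """Open a verified TLS connection and return the negotiated version.

    The handshake raises ssl.SSLCertVerificationError if the certificate
    cannot be validated for the given host.
    """
    with socket.create_connection((host, port), timeout=5) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```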
With regard to the certificate itself, the choice between RSA and ECDSA is purely down to security and performance. The security of a certificate depends on the characteristics of the keypair underpinning it. A key is more or less "safe" depending on the key size and the algorithm used to create it.
As the key size increases, so does the computational effort required to run the mathematical operations underpinning these algorithms. RSA relies on the difficulty of factoring the product of two large prime numbers, whereas ECDSA relies on the difficulty of solving an elliptic curve discrete logarithm problem. Without going into specifics, the latter is believed to be a much harder problem for a given key size, meaning we can achieve comparable levels of security with a smaller key.
For further details, the following article provides significant insight (as well as a quick intro to Elliptic Curve Cryptography).
RSA Key Exchange (TLS v1.2 only)
In the RSA key exchange, the client uses the information received from the server. As we remember, a Certificate including the server's Public Key was sent to the client.
The client, therefore, generates a pre-secret (formally the pre-master secret: a random string of bytes) and encrypts it using the server's public key. This message is sent to the server as the Key Exchange message. The client uses the pre-secret to generate the session key and initialize the chosen encryption algorithm (for example, AES-256 GCM).
The client also sends a "Change Cipher Spec" message, indicating that it is now ready to switch to encrypted communication and a "Finished" message which is already encrypted - in our example using AES-256 GCM and the chosen session key.
Now the server receives the key exchange message and decrypts it using its private key. At this point, both the client and the server share a value (the pre-secret) that can be used by the server to generate the same session key that the client derived on its side, for symmetric encryption.
The server also responds with "Change Cipher Spec" and (encrypted) "Finished" message to complete the handshake.
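The whole RSA exchange can be sketched with nothing but the Python standard library, under some loud assumptions. The RSA key below is a toy: it is built from two publicly known Mersenne primes and uses no padding, so it offers no security whatsoever and only shows the mechanics. The key derivation uses the TLS 1.2 pseudo-random function with HMAC-SHA256 as I understand it from RFC 5246; it produces the master secret, from which the actual session keys are then expanded in a further step not shown here.

```python
import hashlib
import hmac
import secrets

# Toy RSA key built from two well-known Mersenne primes: NOT secret, NOT safe.
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537                      # public key, sent in the certificate
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent, kept by the server


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """TLS 1.2 PRF based on HMAC-SHA256 (P_SHA256 expansion)."""
    seed = label + seed
    a, out = seed, b""
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()           # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


# Random values exchanged in the clear in ClientHello and ServerHello.
client_random = secrets.token_bytes(32)
server_random = secrets.token_bytes(32)

# Client: 48-byte pre-secret (TLS 1.2 version bytes + 46 random bytes),
# encrypted with the server's public key and sent as the Key Exchange message.
pre_secret = b"\x03\x03" + secrets.token_bytes(46)
encrypted = pow(int.from_bytes(pre_secret, "big"), E, N)

# Server: only the holder of the private key can recover the pre-secret.
recovered = pow(encrypted, D, N).to_bytes(48, "big")

# Both sides derive the same 48-byte master secret independently.
randoms = client_random + server_random
client_master = prf(pre_secret, b"master secret", randoms, 48)
server_master = prf(recovered, b"master secret", randoms, 48)
```

Real implementations use proper padding and much larger, secret keys, but the flow is the same: the only secret input travels encrypted under the server's long-lived public key.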
RSA lacks Forward Secrecy
One of the main problems with the RSA exchange is that it does not provide forward secrecy. The pre-secret, from which the session key (for symmetric encryption) is derived, is protected only by the server's long-lived RSA key pair. This means that someone could do the following:
Record encrypted traffic between A (client) and B (server) on day X
Get hold of B's private key on day X + Y
Use the private key to decrypt the "Key Exchange" message, and obtain the pre-secret.
Use the pre-secret to derive the symmetric encryption key, and decrypt all the recorded data.
In other words, if the private key is leaked, then that key could be used by an attacker not only to decrypt future messages using it but also to decrypt past encrypted traffic which relied on that key-pair. This is because the key-pair is static, as it's also used for server authentication and cannot be changed every time.
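The attack above can be played out with a toy example. The numbers here are tiny textbook RSA values chosen purely for illustration, and the recorded pre-secrets are hypothetical integers rather than real 48-byte values.

```python
# Toy RSA key pair with tiny textbook primes (illustration only).
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))   # the server's long-lived private exponent

# Day X: the attacker records the encrypted Key Exchange message of
# several separate sessions. The plaintext values are hypothetical.
session_pre_secrets = [65, 1234, 2020]
recorded = [pow(m, E, N) for m in session_pre_secrets]

# Day X + Y: the private key leaks. Every recorded session falls at once,
# because they were all protected by the same static key pair.
leaked_d = D
recovered = [pow(c, leaked_d, N) for c in recorded]
```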
That is one of the main reasons why RSA is being phased out. In TLS 1.3, it cannot be used as the Key Exchange algorithm, as only Diffie-Hellman based methods are allowed.
Diffie-Hellman (DH) Key Exchange
Diffie-Hellman is a different algorithm for exchanging keys which has gained much prominence due to the above limitations with RSA.
In this algorithm, both the client and the server create a public/private key pair. The public portion of each key is then sent over the network to the other party. Finally, each party combines its private key with the other party's public key to derive the same pre-master secret. The rest works pretty much in the same way as before.
There are a couple of main differences here:
The pre-secret is never sent over the network. Each party computes it locally.
The key pair can be regenerated on each data exchange session - in which case we call it an Ephemeral Diffie-Hellman key exchange algorithm (DHE or ECDHE if based on Elliptic Curve algorithms).
The last point is particularly important as it provides forward secrecy. If the key is changed every time, then the blast radius of a key leak is contained to the single communication session which used it. We will see that TLS 1.3 only accepts Ephemeral mode DH to force forward secrecy on the implementation.
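Here is a minimal sketch of an ephemeral Diffie-Hellman exchange using plain modular arithmetic. The modulus and generator are toy values I picked for readability (real deployments use standardised groups or elliptic curves), and the helper names `ephemeral_keypair` and `run_session` are hypothetical.

```python
import secrets

# Toy group parameters, far too small and simple for real use.
P = 2**127 - 1   # a prime modulus
G = 3            # generator, assumed for illustration


def ephemeral_keypair() -> tuple[int, int]:
    """Create a fresh private value and the matching public value."""
    private = secrets.randbelow(P - 3) + 2      # in the range [2, P - 2]
    public = pow(G, private, P)
    return private, public


def run_session() -> tuple[int, int]:
    """Simulate one handshake; return the secret each side computed."""
    client_priv, client_pub = ephemeral_keypair()
    server_priv, server_pub = ephemeral_keypair()
    # Only the public values cross the network. Each side combines its
    # own private value with the other side's public value.
    client_secret = pow(server_pub, client_priv, P)
    server_secret = pow(client_pub, server_priv, P)
    return client_secret, server_secret


first = run_session()
second = run_session()
```

Both sides end up with the same secret within a session, while each new session starts from fresh key pairs, so leaking the values of one session reveals nothing about another.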
We should have now gained a solid understanding of TLS and in particular of the TLS 1.2 version. To be frank, there are many more aspects that we could dive into, however, I believe we have already seen enough for one article!
In the next article, we will review the capabilities introduced by TLS 1.3 and compare them with what we have learnt in this article: continue to Part 2 here.