Skip to Content [alt-c]

May 18, 2022

Parsing a TLS Client Hello with Go's cryptobyte Package

In my original post about SNI proxying, I showed how you can parse a TLS Client Hello message (the first message that the client sends to the server in a TLS connection) in Go using an amazing hack that involves calling tls.Server with a read-only net.Conn wrapper and a GetConfigForClient callback that saves the tls.ClientHelloInfo argument. I'm using this hack in snid, and if you accessed this blog post over IPv4, it was used to route your connection.

However, it's pretty gross, and only gives me access to the parts of the Client Hello message that are exposed in the tls.ClientHelloInfo struct. So I've decided to parse the Client Hello properly, using the golang.org/x/crypto/cryptobyte package, which is a great library that makes it easy parse length-prefixed binary messages, such as those found in TLS.

cryptobyte was added to Go's quasi-standard x/crypto library in 2017. Since then, more and more parts of Go's TLS and X.509 libraries have been updated to use cryptobyte for parsing, often leading to significant performance gains.

In this post, I will show you how to use cryptobyte to parse a TLS Client Hello message, and introduce https://tlshello.agwa.name, an HTTP server that returns a JSON representation of the Client Hello message sent by your client.

Using cryptobyte

The cryptobyte parser is centered around the cryptobyte.String type, which is just a slice of bytes that points to the message that you are parsing:

type String []byte

cryptobyte.String contains methods that read a part of the message and advance the slice to point to the next part.

For example, let's say you have a message consisting of a variable-length string prefixed by a 16-bit big-endian length, followed by a 32-bit big-endian integer:

00 06 'A' 'n' 'd' 'r' 'e' 'w' 00 5A F6 0C
len(name) name id

First, you create a cryptobyte.String variable, message, which points to the above bytes.

Then, to read the name, you use ReadUint16LengthPrefixed:

var name cryptobyte.String message.ReadUint16LengthPrefixed(&name)

ReadUint16LengthPrefixed reads two things. First, it reads the 16-bit length. Second, it reads the number of bytes specified by the length. So, after the above function call, name points to the 6 byte string "Andrew", and message is mutated to point to the remaining 4 bytes containing the ID.

To read the ID, you use ReadUint32:

var id uint32 message.ReadUint32(&id)

After this call, id contains 5961228 (0x5AF60C) and message is empty.

Note that cryptobyte.String's methods return a bool indicating if the read was successful. In real code, you'd want to check the return value and return an error if necessary.

It's also a good idea to call Empty to make sure that the string is really empty at the end, so you can detect and reject trailing garbage.

cryptobyte.String's methods are generally zero-copy. In the above example, name will point to the same memory region which message originally pointed to. This makes cryptobyte very efficient.

Parsing the TLS Client Hello

Let's write a function that takes the bytes of a TLS Client Hello handshake message as input, and returns a struct with info about the TLS handshake:

func UnmarshalClientHello(handshakeBytes []byte) *ClientHelloInfo

We start by constructing a cryptobyte.String from handshakeBytes:

handshakeMessage := cryptobyte.String(handshakeBytes)

For guidance, we turn to Section 4 of RFC 8446, which describes TLS 1.3's handshake protocol.

Here's the definition of a handshake message: struct { HandshakeType msg_type; /* handshake type */ uint24 length; /* remaining bytes in message */ select (Handshake.msg_type) { case client_hello: ClientHello; case server_hello: ServerHello; case end_of_early_data: EndOfEarlyData; case encrypted_extensions: EncryptedExtensions; case certificate_request: CertificateRequest; case certificate: Certificate; case certificate_verify: CertificateVerify; case finished: Finished; case new_session_ticket: NewSessionTicket; case key_update: KeyUpdate; }; } Handshake;

The first field in the message is a HandshakeType, which is an enum defined as:

enum { client_hello(1), server_hello(2), new_session_ticket(4), end_of_early_data(5), encrypted_extensions(8), certificate(11), certificate_request(13), certificate_verify(15), finished(20), key_update(24), message_hash(254), (255) } HandshakeType;

According to the above definition, a Client Hello message has a value of 1. The last entry of the enum specifies the largest possible value of the enum. In TLS, enums are transmitted as a big-endian integer using the smallest number of bytes needed to represent the largest possible enum value. That's 255, so HandshakeType is transmitted as an 8-bit integer. Let's read this integer and verify that it's 1:

var messageType uint8 if !handshakeMessage.ReadUint8(&messageType) || messageType != 1 { return nil }

The second field, length, is a 24-bit integer specifying the number of bytes remaining in the message.

The third and last field depends on the type of handshake message. Since it's a Client Hello message, it has type ClientHello.

Let's read these two fields using ReadUint24LengthPrefixed and then make sure there are no bytes remaining in handshakeMessage:

var clientHello cryptobyte.String if !handshakeMessage.ReadUint24LengthPrefixed(&clientHello) || !handshakeMessage.Empty() { return nil }

clientHello now points to the bytes of the ClientHello structure, which is defined in Section 4.1.2 as follows:

struct { ProtocolVersion legacy_version; Random random; opaque legacy_session_id<0..32>; CipherSuite cipher_suites<2..2^16-2>; opaque legacy_compression_methods<1..2^8-1>; Extension extensions<8..2^16-1>; } ClientHello;

The first field is legacy_version, whose type is defined as a 16-bit integer:

uint16 ProtocolVersion;

To read it, we do:

var legacyVersion uint16 if !clientHello.ReadUint16(&legacyVersion) { return nil }

Next, random, whose type is defined as:

opaque Random[32];

That means it's an opaque sequence of exactly 32 bytes. To read it, we do:

var random []byte if !clientHello.ReadBytes(&random, 32) { return nil }

Next, legacy_session_id. Like random, it is an opaque sequence of bytes, but this time the RFC specifies the length as a range, <0..32>. This syntax means it's a variable-length sequence that's between 0 and 32 bytes long, inclusive. In TLS, the length is transmitted just before the byte sequence as a big-endian integer using the smallest number of bytes necessary to represent the largest possible length. In this case, that's one byte, so we can read legacy_session_id using ReadUint8LengthPrefixed:

var legacySessionID []byte if !clientHello.ReadUint8LengthPrefixed((*cryptobyte.String)(&legacySessionID)) { return nil }

Now we're on to cipher_suites, which is where things start to get interesting. As with legacy_session_id, it's a variable-length sequence, but rather than being a sequence of bytes, it's a sequence of CipherSuites, which is defined as a pair of 8-bit integers:

uint8 CipherSuite[2];

In TLS, the length of the sequence is specified in bytes, rather than number of items. For cipher_suites, the largest possible length is just shy of 2^16, which means a 16-bit integer is used, so we'll use ReadUint16LengthPrefixed to read the cipher_suites field:

var ciphersuitesBytes cryptobyte.String if !clientHello.ReadUint16LengthPrefixed(&ciphersuitesBytes) { return nil }

Now we can iterate to read each item:

for !ciphersuitesBytes.Empty() { var ciphersuite uint16 if !ciphersuitesBytes.ReadUint16(&ciphersuite) { return nil } // do something with ciphersuite, like append to a slice }

Next, legacy_compression_methods, which is similar to legacy_session_id:

var legacyCompressionMethods []uint8 if !clientHello.ReadUint8LengthPrefixed((*cryptobyte.String)(&legacyCompressionMethods)) { return nil }

Finally, we reach the extensions field, which is another variable-length sequence, this time containing the Extension struct, defined as:

struct { ExtensionType extension_type; opaque extension_data<0..2^16-1>; } Extension;

ExtensionType is an enum with maximum value 65535 (i.e. a 16-bit integer).

As with cipher_suites, we read all the bytes in the field into a cryptobyte.String:

var extensionsBytes cryptobyte.String if !clientHello.ReadUint16LengthPrefixed(&extensionsBytes) { return nil }

Since this is the last field, we want to make sure clientHello is now empty:

if !clientHello.Empty() { return nil }

Now we can iterate to read each Extension item:

for !extensionsBytes.Empty() { var extType uint16 if !extensionsBytes.ReadUint16(&extType) { return nil } var extData cryptobyte.String if !extensionsBytes.ReadUint16LengthPrefixed(&extData) { return nil } // Parse extData according to extType }

And that's it! You can see working code, including parsing of several common extensions, in my tlshacks package.

tlshello.agwa.name

To test this out, I wrote an HTTP server that returns a JSON representation of the Client Hello. This is rather handy for checking what ciphers and extensions a client supports. You can check out what your client's Client Hello looks like at https://tlshello.agwa.name.

Making the Client Hello message available to an HTTP handler required some gymnastics, including writing a net.Conn wrapper struct that peeks at the first TLS handshake message and saves it in the struct, and then a ConnContext callback that grabs the saved message out of the wrapper struct and makes it available in the request's context. You can read the code if you're curious.

I'm happy to say that deploying this HTTP server was super easy thanks to snid. This service cannot run behind an HTTP reverse proxy - it has to terminate the TLS connection itself. Without snid, I would have needed to use a dedicated IPv4 address.

Comments

Reader Chris on 2023-09-21 at 10:12:

Thank you so much for this blog post and the related code. It has been instrumental in me learning how to extract data from a specific extension that our clients inject in to the Client Hello message. I also see what you mean by gymnastics in order to place the TeeReader in at the right point to get the bytes available for storing.

My next task is to inject an extension from the client side in order to replicate my client's messages as part of a test bed.

Thanks once again

Reply

Andrew Ayer on 2023-09-21 at 15:23:

Thanks for your comment Chris! I'm very glad my post was useful.

Reply

Post a Comment

Your comment will be public. To contact me privately, email me. Please keep your comment polite, on-topic, and comprehensible. Your comment may be held for moderation before being published.

(Optional; will be published)

(Optional; will not be published)

(Optional; will be published)

  • Blank lines separate paragraphs.
  • Lines starting with > are indented as block quotes.
  • Lines starting with two spaces are reproduced verbatim (good for code).
  • Text surrounded by *asterisks* is italicized.
  • Text surrounded by `back ticks` is monospaced.
  • URLs are turned into links.
  • Use the Preview button to check your formatting.