Skip to Content [alt-c]

December 3, 2019

Programmatically Accessing Your Customers' Google Cloud Accounts (While Avoiding the Confused Deputy Problem)

SaaS applications often need to access their customers' cloud resources at providers like Amazon Web Services and Google Cloud Platform. For instance, a monitoring service might require read-only access to their customers' AWS accounts so it can inventory resources. At SSLMate, we request access to our customers' DNS zones so we can publish DNS records to automatically validate the certificates that they request.

Doing this with AWS was easy, thanks to their detailed documentation for precisely this use case. However, when it came to Google Cloud, the only hint of a solution that I could find was buried deep in this FAQ:

How can I access data from my users' Google Cloud Platform project using Cloud APIs?

You can access data from your users' Google Cloud Platform projects by creating a service account to represent your service, and then having your customers grant that service account appropriate access to their cloud data using IAM policies. Note that you might want to create a service account per customer if you need to avoid confused deputy problems.

The first part sounds pretty easy: create a service account which SSLMate uses when making DNS changes and ask customers to grant this account access to Cloud DNS in their Google Cloud project. What isn't as easy, unfortunately, is solving the confused deputy problem. Contrary to the FAQ, this isn't something that we "might" want to do - it's something we absolutely must do. If SSLMate used just a single service account and all of our customers authorized it, then a malicious customer would be able to request a certificate for any of our customer's domains. SSLMate would access the victim's project using the single SSLMate service account and publish the validation record for the attacker's certificate request. This would succeed since the victim had authorized the single SSLMate service account. I can't think of any application that would not also be vulnerable if it did not address the confused deputy problem.

AWS provides a nice solution to the problem. The customer doesn't just authorize SSLMate's AWS account; they authorize the AWS account plus an "external ID", which for SSLMate is the same as their SSLMate customer ID. When SSLMate connects to AWS to add a DNS record, it sends the ID of the customer on whose behalf it is acting. If it doesn't match the authorized external ID, AWS blocks the request.

But with Google Cloud we're stuck creating a different service account for each SSLMate customer. The customer would authorize only that service account, and SSLMate would use it to access their Google Cloud project. Creating all these service accounts isn't hard - there's an API for that - but what's hard is giving each of these service accounts their own key. I'd have to securely store all those keys somewhere, and rotate them periodically, which is a pain.

Fortunately, Google Cloud has an API to generate short-lived OAuth2 access tokens for a service account. SSLMate invokes this API from a single master service account to get an access token for the customer-specific service account, and uses the access token to make the DNS change requests. When it's done, SSLMate discards the credentials. The customer-specific accounts have no long-term keys associated with them; only the single master service account does, making key management significantly easier, and equivalent to AWS.

Here's how you can set this up for your SaaS application...

1. Create the Master Service Account

This service account will be used by your app to create customer-specific service accounts, and to create short-lived access credentials for accessing those accounts.

  1. Create the service account, giving it a name of your choosing.

  2. Assign the service account the following roles:

    • Service Account Admin - this is needed to create customer-specific service accounts
    • Service Account Token Creator - this is needed to get the short-lived access credentials
  3. Create a key for this service account and save a copy. This key will be used by your app.

2. Onboarding a Customer

Your app needs to do this every time you onboard a new customer (or an existing customer wants to integrate their Google Cloud account with your service).

  1. Create a service account for the customer using the IAM API, authenticating with the key for your master service account. I suggest deriving the service account ID from the customer's internal ID. For instance, if the customer ID is 1234, use customer-1234 as the service account ID.

  2. Instruct your customer to authorize this service account as follows:

    1. Visit the IAM page for their project.
    2. Click Add.
    3. In the New Member box, your customer must enter the email address of the service account created in step 1. The email address will look like: SERVICE_ACCOUNT_ID@YOUR_PROJECT_ID.iam.gserviceaccount.com
    4. In the Role box, choose the roles necessary for your app to function (in SSLMate's case, this is "DNS Administrator").
    5. Click Save.
  3. After your customer authorizes the account, I recommend testing the access by making a simple read-only request as described in part 3 below and displaying an error if it doesn't work.

3. Accessing Your Customer's Account

  1. Use the generateAccessToken API to create an access token for the customer-specific service account. Authenticate to this API using the master service account's key. The name parameter must contain the email address of the customer-specific service account, which you can derive from the customer ID if you follow the naming convention suggested above. The scope array must contain the OAuth scopes you need to access. (In SSLMate's case, this is https://www.googleapis.com/auth/ndev.clouddns.readwrite)

  2. Use the token returned by the API call as a Bearer token for accessing your customer's account.

If you're using Go, it's helpful to put the above logic in a type that implements the oauth2.TokenSource interface, which you can pass to oauth2.NewClient to create the http.Client that you use with the Google Cloud API library. Here's a drop-in token source type you can use, and here's an adaptable example of how to use it.

Conclusion

While it's more complicated to set up than AWS, in the end this solution has the same desirable properties as AWS: protection against confused deputy attacks, and just one long term credential for your application. If only Google Cloud's documentation for this was as helpful as AWS's!

Addendum: Why I Didn't Use Three-Legged OAuth

My first attempt at implementing this feature used three-legged OAuth. SSLMate would redirect the customer to Google, they'd click a single button to authorize SSLMate for read/write access to Google Cloud DNS, and then Google would redirect back to SSLMate with an access code. This worked well and provided a great user experience since the customer didn't need to do any configuration. Unfortunately, the access was linked to the customer's Google account, rather than their Google Cloud project. I thought this was a bad idea because if the Google Account ever lost access to the project, it would break the SSLMate integration. The integration needs to continue working even if the employee who set it up leaves the company which owns the Google Cloud project.

Another thing which turned me off three-legged OAuth was that Google wanted to subject my integration to a lengthy review because accessing DNS is considered a "sensitive scope." This sounded like a hassle, and I assume it is only going to get more restrictive in the future as Google tries to stop malicious OAuth apps like last year's viral Google Docs attack. For example, if they decided one day to classify DNS access as a "restricted scope," I would be subjected to a 5 figure security audit.

Consequentially, I ditched three-legged OAuth and turned to the solution described above. The user experience is not quite as nice but it's much more robust. SSLMate still uses three-legged OAuth with other DNS providers which support it, like DNSimple and Digital Ocean.

Comments

April 15, 2019

MTA-STS is Hard. Here's how DNS Providers Can Make it Awesome With Automation...

Last week, Gmail became the first major email provider to enable the new MTA-STS standard, which will prevent attackers from intercepting email sent to and from Gmail. If you operate a domain which receives email, you should be looking into enabling MTA-STS too, even if you've out-sourced operation of the actual mail servers to a third party provider.

Unfortunately, MTA-STS introduces several new moving parts to operating a domain which I anticipate will cause operational problems. However, a smart DNS provider can offer automation that makes MTA-STS as easy for domain owners as checking a box. Let me explain how...

First, some background. MTA-STS can be distilled into two parts:

  1. Your domain's mail servers need to support modern TLS (TLS 1.2 or higher) and present a publicly-trusted certificate that's valid for the MX server hostname (that is, the hostname which you put in the MX record - not the domain name which receives email). Since this part is straightforward and is taken care of by your mail server provider (who might even be in compliance already), I will not be focusing on it in this post.

  2. You need to duplicate the contents of your domain's MX records in a text file which you serve over HTTPS at https://mta-sts.YOURDOMAIN/.well-known/mta-sts.txt. The reason for the duplication is that DNS is not authenticated, but HTTPS is. By requiring the publication of your MX records at an HTTPS URL which is derived from your domain name, MTS-STS prevents an attacker from swapping out your MX servers with their own. (DNSSEC would accomplish the same thing, but DNSSEC hasn't worked out very well in practice, which is why MTA-STS exists.)

    In addition, you need to publish a TXT record at _mta-sts.YOURDOMAIN indicating that your domain uses MTA-STS. Any time you change your mta-sts.txt file, you need to update the ID value in this TXT record. The reason for the TXT record is that retrieving a web page over HTTPS is expensive compared to a DNS lookup. The TXT record allows mail servers to skip retrieving mta-sts.txt if a domain doesn't use MTA-STS or if mta-sts.txt hasn't changed since the last retrieval.

The consequence of the above is that any time you update your domain's mail servers, you have to make a change in three places: the MX record itself (as you do now), the mta-sts.txt file on your web server, and finally the _mta-sts TXT record. If you forget one of the updates or mess one up, you risk losing mail.

Unfortunately, my experience indicates that humans are quite bad at remembering this type of thing. Two common failures which I've seen are forgetting to keep a domain's NS records in sync with the NS records in the parent zone, and forgetting to update a zone's SOA serial number after making changes. Therefore, I anticipate DNS administrators forgetting to keep their MX records in sync with their mta-sts.txt file, and forgetting to update the ID in the _mta-sts TXT record. Ironically, one of the motivations for MTA-STS is that DNSSEC is too difficult to deploy correctly, but I think it's premature to say that MTA-STS will cause fewer outages.

Generally, the way to cope with error-prone, repetitive tasks like this is to automate them away, and DNS providers are in a perfect position to automate MTA-STS policy maintenance for their customers.

DNS providers should offer a checkbox to enable MTA-STS on any domain with an MX record. If you check this box, the DNS provider should do the following:

  1. Automatically publish an A/AAAA/CNAME record at mta-sts.YOURDOMAIN that points to a web server operated by the DNS provider. The DNS provider should automatically obtain a certificate for this hostname. Their web server should respond to requests for /.well-known/mta-sts.txt by consulting YOURDOMAIN's MX records and dynamically generating an mta-sts.txt file containing each MX server. (Rather than looking up the MX records over the open Internet, which would be insecure, they should directly consult the data source for the domain's records, which they can do since they are the DNS provider.)

  2. Automatically publish a TXT record at _mta-sts.YOURDOMAIN containing an automatically-generated ID value. According the MTA-STS spec, the ID needs to “uniquely identify” a particular instance of the mta-sts.txt file. The ID can be up to 32 alphanumeric characters, so the easiest way to automatically generate a unique ID is to generate the mta-sts.txt file as described in the previous paragraph, hash it with SHA-256, and take the first 32 characters of the hex-encoded hash.

I hope DNS providers implement this. If domain owners have to manage MTA-STS manually, I anticipate a lot of “human error” that could hamper MTA-STS' adoption. I put “human error” in quotes because although it is a very common expression, it often indicates a problem not with the human who made the error, but with software failing to automate tasks that computers can do easily. Let's make sure MTA-STS can't have human error!

Comments

April 13, 2018

Making Certificates Easier and Helping the Ecosystem: Four Years of SSLMate

I'm not actually sure when SSLMate was born. I got the idea, registered the domain name, and wrote the first lines of code in August 2013, but I put it on the backburner until March 2014. I think I "launched" in early April, but since I thought of SSLMate as a side project mainly for my own use, I didn't do anything special.

I do know that I sold my first certificate on April 13, 2014, four years ago to this day. I sold it to a friend who needed to replace his certificates after Heartbleed and was fed up with how hard his certificate authority was making it. It turns out a lot of people were generally fed up with how hard certificate authorities made things, and in the last four years, SSLMate has exceeded my wildest expectations and become my full-time job.

Sadly, certificate resellers have a deservedly bad reputation, which has only gotten worse in recent months thanks to the likes of Trustico and other resellers with terrible security practices. But it's not entirely a hellscape out there, and for SSLMate's birthday, I thought it would be nice to celebrate the ways SSLMate has been completely unlike a typical certificate reseller, and how SSLMate has allowed me to pursue work over the last four years that lifts up the Web PKI as a whole.

SSLMate was different from the beginning. When SSLMate launched in 2014, it was the first, and only, way you could get a publicly-trusted SSL certificate entirely from the command line. SSLMate made automated certificate issuance accessible to anyone, not just large customers of certificate authorities. More importantly, it significantly improved the usability of certificate issuance at a time when getting a certificate meant running long OpenSSL commands, copy-and-pasting PEM blobs to and from websites, and manually extracting a Zip file and assembling the correct certificate chain. The state of the art in usability was resellers generating, and possibly storing, your private key on their servers. SSLMate showed that certificate issuance could be easy without having to resort to insecure practices.

In September 2014, SSLMate stopped selling multi-year certificates. This was unusual at a time when certificates could be valid for up to five years. The change was unpopular among a handful of customers, but it was unquestionably the right thing to do. Shorter-lived certificates are better for the ecosystem, since they allow security standards to advance more quickly, but they are also better for customers. I learned from the SHA-1 deprecation that a multi-year certificate might not remain valid for its entire term, and it felt wrong to sell a product that I might not be able to deliver on. Sure enough, the five year Symantec certificates that SSLMate was reselling at the time will all become prematurely invalid as of the next Chrome release. (The affected customers all got free replacements, but it's still not what they were expecting.)

The industry is following SSLMate's lead: the CA/Browser Forum limited certificate lifetimes to three years beginning in 2015, and further limited lifetimes to two years beginning last month.

In April 2015, SSLMate released its first public REST API. (While we already had an API for use by the command-line tool, it wasn't previously documented.) As far as I know, this was the world's first fully self-service API for the automated issuance of publicly-trusted SSL certificates. Although major certificate authorities, and some resellers, had APIs (indeed, SSLMate used them under-the-hood), every one of the many APIs I looked at required you to ask a human to enable API access for your account. Some even required you to get on the phone to negotiate prices. With SSLMate's API, you could sign up yourself and immediately start issuing publicly-trusted certificates.

In June, SSLMate published a blog post explaining how to set up OCSP Stapling in Apache and nginx. Resources at the time were pretty bad, so I had to dive into the source code for Apache and nginx to learn how stapling really worked. I was rather horrified at what I saw. The worst bug was that nginx would sometimes staple an expired OCSP response, which would cause Firefox to reject the certificate. So, I submitted a patch fixing it.

In August, several folks asked me to review the ACME specification being worked on at the IETF and provide feedback based on my experience with automated certificate issuance APIs. While reading the draft, I was very bothered by the fact that RSA and ECDSA signatures were being used without any associated message. I had never heard of duplicate signature key selection attacks, but I knew that crypto wasn't being used properly, and when crypto isn't used properly, bad things tend to happen. So I dusted off my undergraduate number theory textbook and came up with an attack that broke ACME, allowing attackers to get unauthorized certificates. After my disclosure, ACME was fixed, before it was deployed in the Web PKI.

In March 2016, it occurred to me that signing OCSP responses with a weak hash function such as SHA-1 could probably lead to the forgery of a trusted certificate. It was already known that signing a certificate with a weak hash function could lead to the forgery of another certificate using a chosen-prefix collision attack, so CAs were forbidden from signing certificates using SHA-1. However, no one had demonstrated a collision attack against OCSP responses, and CAs were allowed to sign OCSP responses with SHA-1.

I figured out how to execute a chosen-prefix attack against OCSP responses, rented a GPU instance in EC2 to make a proof-of-concept with MD5, and scanned every OCSP responder I could find to see which ones could be used to forge a certificate with a SHA-1 collision attack. I reported my findings to the mozilla.dev.security.policy mailing list. This led to a change in Mozilla's Root Store Policy to forbid CAs from signing OCSP responses with SHA-1 except under safe conditions.

Since then, I've periodically scanned OCSP responders to ensure they remain in compliance.

In July 2016, SSLMate launched Cert Spotter, a Certificate Transparency monitor. The core of Cert Spotter is open source, because I wanted non-profits to be able to easily use Certificate Transparency without depending on a commercial service. I'm proud to say that the Wikimedia Foundation uses the open source Cert Spotter to watch for unauthorized certificates for wikipedia.org and their other domains.

Certificate Transparency was designed to be verifiable, but this only matters if a diverse set of people bother to actually do the verification. Cert Spotter has always verified log behavior, and it has detected log misbehavior that was missed by other monitors.

In March 2017, SSLMate started operating the world's second Certificate Transparency gossip endpoint (Graham Edgecombe gets credit for the first) to provide further resiliency to the Certificate Transparency ecosystem. SSLMate also released ct-honeybee, a lightweight program that queries each Certificate Transparency log for its current state and uploads it to Graham's and SSLMate's gossip endpoints. People are now running ct-honeybee on devices all around the world, helping ensure that logs do not present different views to different parts of the Internet.

In 2017, I attended both Certificate Transparency Policy Days hosted by Google to help hash out policy for the burgeoning Certificate Transparency ecosystem.

In September, to help with the upcoming CAA enforcement deadline, I released a free CAA Test Suite for CAs to use to test their implementations.

What's next for SSLMate? The biggest change over the last four years is that the price of certificates as individual goods has gone to zero. But SSLMate has never really been about selling certificates, but about selling easy-to-use software, good support, and a service for managing certificates. And I still see a lot of work to be done to make certificates even easier to work with, particularly with all the new ways certificates are going to be used in the future. I'm pleased to be kicking off SSLMate's fifth year with the release of SSLMate for SaaS, a new service that provides an easy, high-level way for SaaS companies to get certificates for the customer domains they host. This is the first of many exciting announcements in store for this year.

Comments

March 29, 2018

These Three Companies Are Doing the Internet a Solid By Running Certificate Transparency Logs

When we use the Internet, we rely on the security of the certificate authority system to ensure we are talking with the right people. Unfortunately, the certificate authority system is a bit of a mess. One of the ways we're trying to clean up the mess is Certificate Transparency, an effort to put all SSL certificates issued by public certificate authorities in public, verifiable, append-only logs. Domain owners can monitor the logs for unauthorized certificates, and web browsers can monitor for compliance with the rules and take action against non-compliant certificate authorities. After ramping up for the last four years, Certificate Transparency is about to enter prime time: Google Chrome is requiring that all certificates issued on or after April 30, 2018 be logged.

But who is supposed to run these Certificate Transparency logs? Servers, electricity, bandwidth, and system administrators cost money. Although Google is spearheading Certificate Transparency and operates nine logs that are recognized by Chrome, Certificate Transparency is supposed to benefit everyone and it would be unhealthy for the Internet if Google ran all the logs. For this reason, Chrome requires that certificates be included in at least one log operated by an organization besides Google.

So far, three organizations have stepped up and are operating Certificate Transparency logs that are recognized by Chrome and are open to certificates from any public certificate authority:

DigiCert was the first non-Google organization to set up a log, and they now operate several logs recognized by Chrome. Their DigiCert 2 log accepts certificates from all public certificate authorities. They are also applying for recognition of their Nessie and Yeti log sets, which accept certificates from all public certificate authorities and are each split into five shards based on the expiration year of the certificate. (They also operate DigiCert 1, which only accepts certificates from some certificate authorities, and have three logs acquired from Symantec which they are shutting down later this year.)

DigiCert is notable because they've written their own Certificate Transparency log implementation instead of using an open source one. This is helpful because it adds diversity to the ecosystem, which ensures that a bug in one implementation won't take out all logs.

Comodo Certification Authority (which is thankfully no longer owned by the blowhard who thinks he invented 90 day certificates) operates two logs recognized by Chrome: Mammoth and Sabre. Both logs accept certificates from all public certificate authorities, and run SuperDuper, which is Google's original open source log implementation.

In addition to operating two open logs, Comodo CA runs crt.sh, a search engine for certificates found in Certificate Transparency logs. crt.sh has been an invaluable resource for the community when investigating misbehavior by certificate authorities.

Cloudflare is the latest log operator to join the ecosystem. They operate the Nimbus log set, which accepts certificates from all public certificate authorities and is split into four shards based on the expiration year of the certificate. Nimbus runs Trillian, Google's latest open source implementation, with some Cloudflare-specific patches.

Cloudflare is unique because unlike DigiCert and Comodo CA, they are not a certificate authority. DigiCert and Comodo have an obvious motivation to run logs: they need somewhere to log their certificates so they will be trusted by Chrome. Cloudflare doesn't have such a need, but they've chosen to run logs anyways.

DigiCert, Comodo CA, and Cloudflare should be lauded for running open Certificate Transparency logs. None of them have to do this. Even DigiCert and Comodo could have adopted the strategy of their competitors and waited for someone else to run a log that would accept their certificates. Their willingness to run logs shows that they are invested in improving the Internet for everyone's benefit.

We need more companies to step up and join these three in running public Certificate Transparency logs. How about some major tech companies? Although we all benefit from the success of Certificate Transparency, large tech companies benefit even more: they are bigger targets than the rest of us, and they have more to gain when the public feels secure conducting business online. Major tech companies are also uniquely positioned to help, since they already run large-scale Internet infrastructure which could be used to host Certificate Transparency logs. And what kind of tech company doesn't want the cred that comes from helping the Internet out?

If you're a big tech company that knows how to run large-scale infrastructure, why aren't you running a Certificate Transparency log too?

Comments

January 21, 2018

Google's Certificate Revocation Server Is Down - What Does It Mean?

Earlier today, someone reported to the mozilla.dev.security.policy mailing list that they were unable to access any Google websites over HTTPS because Google's OCSP responder was down. David E. Ross says the problem started two days ago, and several Tweets confirm this. Google has since acknowledged the report. As of publication time, the responder is still down for me, though Ross reports it's back up. (Update: a fix is being rolled out.)

What's an OCSP Responder?

OCSP, which stands for Online Certificate Status Protocol, is the system used by SSL/TLS clients (such as web browsers) to determine if an SSL/TLS certificate is revoked or not. When an OCSP-using TLS client connects to a TLS server such as https://www.google.com, it sends a query to the OCSP responder URL listed in the TLS server's certificate to see if the certificate is revoked. If the OCSP responder replies that it is, the TLS client aborts the connection. OCSP responders are operated by the certificate authority which issued the certificate. Google has its own publicly-trusted certificate authority (Google Internet Authority G2) which issues certificates for Google websites.

I thought Chrome didn't support OCSP?

You're correct. Chrome famously does not use OCSP. Chrome users connecting to Google websites are blissfully unaware that Google's OCSP responder is down.

But other TLS clients do use OCSP, and as a publicly-trusted certificate authority, Google is required by the Baseline Requirements to operate an OCSP responder. However, certificate authorities have historically done a bad job operating reliable OCSP responders, and firewalls often get in the way of OCSP queries. Consequentially, although web browsers like Edge, Safari, and Firefox do contact OCSP responders, they use "soft fail" and allow a connection if they don't get a well-formed response from the responder. Otherwise, they'd constantly reject connections that they shouldn't. (This renders OCSP almost entirely pointless from a security perspective, since an attacker with a revoked certificate can usually just block the OCSP response and the browser will accept the revoked certificate.) Therefore, the practical impact from Google's OCSP responder outage is probably very small. Nearly all clients are going to completely ignore the fact that Google's OCSP responder is down.

That said, Google operates some of the most heavily trafficked sites on the Internet. A wide variety of devices connect to Google servers (Google's FAQ for Certificate Changes mention set-top boxes, gaming consoles, printers, and even cameras). Inevitably, at least some of these devices are going to use "hard fail" and reject connections when an OCSP responder is down. There are also people, like the mozilla.dev.security.policy poster, who configure their web browser to use hard fail. Without a doubt, there are people who are noticing problems right now.

I thought Google had Site Reliability Engineers?

Indeed they do, which is why this incident is noteworthy. As I mentioned, certificate authorities tend to do a poor job operating OCSP responders. But most certificate authorities run off-the-shelf software and employ no software engineers. Some regional European certificate authorities even complain when you report security incidents to them during their months-long summer vacations. So no one is surprised when those certificate authorities have OCSP responder outages. Google, on the other hand, sets higher expectations.

Comments

Older Posts Newer Posts