Between 00:00 UTC and 05:00 UTC on May 25th, PagerDuty experienced an incident affecting web application logins in the EU Service Region. During this time, customers encountered server errors when trying to log in to accounts in the EU Service Region. Existing sessions were not affected, and logins to US accounts for users located outside of the EU were also not affected. Users located in the EU would have seen certificate errors when attempting any login process, including login through the Android mobile app.
We preemptively renewed the SSL certificate for identity.pagerduty.com, as it was due to expire at 00:00 UTC on May 25th, and deployed the renewed certificate to our production infrastructure. At 00:00 UTC on May 25th, the old certificate expired. At approximately 03:30 UTC, we received reports from customers that they were unable to log in to their accounts via the web in the EU Service Region. Our initial investigation determined that the renewed certificate had not been properly deployed to the EU Service Region. Our team then deployed the renewed certificate to the EU Service Region. This was completed successfully at approximately 05:00 UTC, at which point the incident was resolved.
To prevent future incidents of this nature, we will be adding more specific monitoring for expiring certificates in each PagerDuty service region, as well as additional monitoring for errors in our login flows.
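As an illustration only, a per-region certificate expiry check of the kind described above might look like the following Python sketch. It is not PagerDuty's actual monitoring: the function names, the 14-day alert threshold, and the idea of probing each region by a separate endpoint address are all assumptions made for the example.

```python
import socket
import ssl
from datetime import datetime, timedelta, timezone


def seconds_until_expiry(not_after: str, now: datetime) -> float:
    """Seconds between `now` and a certificate's notAfter time.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    e.g. "May 25 00:00:00 2030 GMT". A negative result means expired.
    """
    return ssl.cert_time_to_seconds(not_after) - now.timestamp()


def needs_alert(not_after: str, now: datetime,
                threshold: timedelta = timedelta(days=14)) -> bool:
    """True when the certificate expires within `threshold` (assumed: 14 days)."""
    return seconds_until_expiry(not_after, now) < threshold.total_seconds()


def fetch_not_after(host: str, address: str, port: int = 443,
                    timeout: float = 5.0) -> str:
    """Read notAfter from the certificate served at one regional endpoint.

    `address` is a hypothetical per-region IP or hostname; `host` is the
    name sent via SNI and verified against the certificate. An already
    expired certificate makes the handshake raise
    ssl.SSLCertVerificationError, which a monitor should also alert on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((address, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Running such a check separately against each service region's endpoint, rather than once against the public hostname, is what would surface a certificate that was renewed in one region but not another.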