Login - EU subdomains
Incident Report for PagerDuty
Postmortem

Summary

Between 00:00 UTC and 05:00 UTC on May 25th, PagerDuty experienced an incident affecting web application logins in the EU Service Region. During this time, customers encountered server errors when trying to log in to accounts in the EU Service Region. Existing sessions were not affected, and logins to US accounts by users located outside of the EU were also not affected. Users located in the EU would have seen certificate errors when attempting any login process, including login through the Android mobile app.

What Happened

We preemptively renewed the SSL certificate for identity.pagerduty.com, as it was due to expire at 00:00 UTC on May 25th. We deployed the renewed certificate to our production infrastructure. At 00:00 UTC on May 25th, the old certificate expired. At approximately 03:30 UTC, we received reports from customers that they were unable to log in to their accounts via the web in the EU region. Our initial investigation determined that the certificate had not been properly deployed to the EU Service Region. Our team then deployed the renewed certificate to the EU Service Region. This was completed successfully at approximately 05:00 UTC, at which point the incident was resolved.

What Are We Doing About This

To prevent future incidents of this nature, we will be adding more specific monitoring for expiring certificates in each PagerDuty service region, as well as additional monitoring for errors in our login flows.
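As an illustration only, per-region certificate expiry monitoring of the kind described above could be sketched as follows. This is not PagerDuty's actual monitoring; the regional hostnames, the 14-day warning threshold, and all function names are assumptions made for the example. The check connects to each region's endpoint separately, so a certificate that was renewed in one region but not deployed to another would show up as a per-region difference.

```python
import socket
import ssl
from datetime import datetime, timezone

# Assumed threshold: warn when fewer than this many days remain.
WARN_DAYS = 14

# Hypothetical per-region endpoints; real hostnames would differ.
REGION_ENDPOINTS = {
    "us": "identity.us.example.com",
    "eu": "identity.eu.example.com",
}


def days_until_expiry(not_after, now):
    """Days between `now` (timezone-aware) and a certificate's notAfter string.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    e.g. "May 25 00:00:00 2023 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    return (expires - now.timestamp()) / 86400


def classify(days_left, warn_days=WARN_DAYS):
    """Map remaining days to a coarse status."""
    if days_left <= 0:
        return "expired"
    if days_left <= warn_days:
        return "expiring"
    return "ok"


def fetch_not_after(host, port=443, timeout=5.0):
    """Return the notAfter field of the certificate served by `host`."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def check_regions(endpoints, now=None):
    """Check every region independently and return {region: status}."""
    now = now or datetime.now(timezone.utc)
    results = {}
    for region, host in endpoints.items():
        try:
            not_after = fetch_not_after(host)
            results[region] = classify(days_until_expiry(not_after, now))
        except ssl.SSLCertVerificationError:
            # The handshake rejects expired or otherwise untrusted certificates.
            results[region] = "invalid"
        except OSError:
            results[region] = "unreachable"
    return results
```

With the assumed 14-day threshold, a certificate whose notAfter is "May 25 00:00:00 2023 GMT" checked on May 18th would be classified as "expiring", and the same certificate checked at 03:30 UTC on May 25th would be classified as "expired". Calling check_regions(REGION_ENDPOINTS) on a schedule and alerting on any status other than "ok" is one possible way to wire this into alerting.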

Posted Jun 04, 2023 - 22:28 UTC

Resolved
We have resolved an incident where PagerDuty customers in the EU service region experienced issues with login. The incident is now resolved, and there is no ongoing impact to customers. Please reach out to support@pagerduty.com if you have any concerns.
Posted May 25, 2023 - 04:56 UTC
Monitoring
We have identified the issue affecting user logins in the EU region. We are deploying a fix, and we expect systems to continue to improve. We currently expect that full resolution will require approximately 30 minutes, and will provide an update within that time.
Posted May 25, 2023 - 04:38 UTC
Update
We continue to investigate an incident where PagerDuty customers in the EU service region are experiencing issues logging in to the PagerDuty website. Existing sessions and mobile logins should not be affected. We will provide further updates within 20 minutes.
Posted May 25, 2023 - 04:15 UTC
Update
We are continuing to investigate an incident where PagerDuty customers in the EU are experiencing issues with logging in to the PagerDuty website. Existing sessions and mobile logins should not be affected. We will provide further updates within 20 minutes.
Posted May 25, 2023 - 03:56 UTC
Identified
We are investigating an incident where PagerDuty customers in the EU service region are experiencing issues with login. Impacted customers may see a 500 error when trying to log in to their subdomains. We will provide further updates within 20 minutes.
Posted May 25, 2023 - 03:34 UTC
Investigating
We are investigating a potential issue within PagerDuty. If we confirm an impact, we will update within 15 minutes. If there is no impact, this notification will be removed.
Posted May 25, 2023 - 03:29 UTC
This incident affected: Log In and SSO (Log In and SSO (EU)).