On August 3rd at 20:06 UTC, we suffered a partial disruption of our iOS mobile application that lasted 6 hours and 35 minutes. Push notification delivery was not affected. During this time, the PagerDuty iOS mobile app continually displayed red error banners to any user who opened the Open Incidents screen with a filter (Mine, My Teams, All) that had no open incidents.
A deployment to our API codebase accidentally introduced a bug. The bug caused our API to return an unexpected value (null) for one of the pagination properties in the response schema whenever it returned an empty list. The iOS mobile app was not able to handle this case, so it showed an error message on any view that would normally list data but had no data to display.
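The failure mode described above, a client that rejects a null pagination value on an empty list, can be illustrated with a minimal sketch of tolerant parsing. This is not the iOS app's code and not our API's actual schema: the field names `items`, `offset`, and `more`, and the `Page` and `parse_page` names, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Page:
    items: List[Any]
    offset: int
    more: bool


def parse_page(payload: Dict[str, Any]) -> Page:
    """Parse a list response, treating null or missing pagination fields as defaults.

    The field names used here are hypothetical, not the real response schema.
    """
    # `or` maps both a missing key and an explicit null to the fallback value.
    items = payload.get("items") or []
    offset = payload.get("offset") or 0
    # bool(None) is False, so a null `more` means "no further pages".
    more = bool(payload.get("more"))
    return Page(items=list(items), offset=offset, more=more)
```

With this approach, an empty list that arrives with null pagination values parses into an empty page, which a client could render as an empty state instead of an error.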
Our monitoring solutions did not automatically detect this issue, so the initial investigation did not begin until 01:24 UTC the following day, when PagerDuty engineers became aware of the error. Shortly thereafter, the issue was identified and the change needed to fix the API bug was deployed. Full functionality was restored at 02:41 UTC.
Our resolution time was delayed by two factors. First, the issue was not automatically identified by our mobile app monitoring tool. Second, it took a long time for PagerDuty engineers to recognize that the customer experience was disrupted, so there was a long delay before we initiated our incident response process. We will be improving the coverage of our automatic monitoring and testing to detect these cases in the future. We have also put in place safeguards against making such API changes in the future.
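As one illustration of the kind of automated check that could catch this class of bug, the sketch below flags pagination fields that are missing or null in a response. It is an assumption-laden example, not a description of the safeguards we actually put in place: the field names and the `null_pagination_fields` helper are hypothetical.

```python
from typing import Any, Dict, List, Sequence

# Hypothetical pagination field names, chosen for illustration only.
REQUIRED_PAGINATION_FIELDS = ("offset", "limit", "more")


def null_pagination_fields(
    payload: Dict[str, Any],
    required: Sequence[str] = REQUIRED_PAGINATION_FIELDS,
) -> List[str]:
    """Return the required pagination fields that are missing or null."""
    return [name for name in required if payload.get(name) is None]


# Example check against an empty-list response like the one in this incident.
empty_response = {"items": [], "offset": None, "limit": 25, "more": False}
print(null_pagination_fields(empty_response))  # prints ['offset']
```

A test suite could run a check like this against the empty-list case of every list endpoint and fail the build when the returned list is not empty.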
We would like to again apologize for any inconvenience this issue caused. If you have any questions, please do not hesitate to contact us at firstname.lastname@example.org.