Delays in push notifications
Incident Report for Zulip Cloud
Postmortem

Push notifications in Zulip are sent by a background worker, which talks to Apple and Google servers to notify applications on users' mobile devices. On Monday, December 4, the rate of push notifications generated by the Zulip Cloud service surpassed the rate at which a single worker could send them, leading to a backlog of notifications. Other load on the system compounded the backlog, leading to delays of up to 10 minutes between when a push notification was triggered and when it was sent to users' mobile devices.

We have since split the workers that deliver these notifications, allowing us to process many more notifications in parallel.
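
The relationship between send rate, backlog, and delay described above can be illustrated with a small back-of-the-envelope model. This is only a sketch under simplifying assumptions (steady traffic, identical workers), not Zulip's actual worker code; the function names and all of the rates below are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def backlog_after(arrival_rate, per_worker_rate, workers, seconds, initial_backlog=0.0):
    """Notifications still queued after `seconds` of steady traffic.

    arrival_rate and per_worker_rate are in notifications per second.
    The backlog grows whenever arrivals outpace the combined send rate
    of all workers, and it can never drop below zero.
    """
    service_rate = per_worker_rate * workers
    backlog = initial_backlog + (arrival_rate - service_rate) * seconds
    return max(backlog, 0.0)


def queueing_delay(backlog, per_worker_rate, workers):
    """Seconds a newly queued notification waits behind the existing backlog."""
    return backlog / (per_worker_rate * workers)


if __name__ == "__main__":
    # Hypothetical figures: 120 notifications/s arriving, 100/s per worker.
    single = backlog_after(arrival_rate=120, per_worker_rate=100, workers=1, seconds=3000)
    print("backlog with one worker:", single)
    print("delay in minutes:", queueing_delay(single, 100, 1) / 60)

    # Adding workers raises the combined send rate above the arrival rate,
    # so the same backlog drains instead of growing.
    drained = backlog_after(120, 100, workers=4, seconds=300, initial_backlog=single)
    print("backlog after five minutes with four workers:", drained)
```

In this model, a single worker that is only slightly slower than the incoming traffic still accumulates a delay that grows for as long as the overload lasts, while adding workers both stops the growth and drains what has already queued up.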

Posted Dec 05, 2023 - 20:17 UTC

Resolved
This incident has been resolved.
Posted Dec 04, 2023 - 22:24 UTC
Monitoring
The delay in mobile notifications is now down to 2.5 minutes, and we expect to clear the remaining backlog shortly. We will continue to monitor the situation.
Posted Dec 04, 2023 - 18:20 UTC
Update
We are working on a fix for the issue. Notifications are now backlogged by 10 minutes.
Posted Dec 04, 2023 - 16:30 UTC
Identified
We are currently experiencing a backlog in our push notifications service, which is causing delays of up to 5 minutes in devices receiving push notifications from Zulip Cloud. Push notifications from self-hosted Zulip servers using our push bouncer services are not currently affected.
Posted Dec 04, 2023 - 15:42 UTC
This incident affected: Push notifications.