Workflow delays for some customers
Incident Report for Iterable
Resolved
This incident has been resolved.
Posted Jul 09, 2019 - 16:03 PDT
Update
We have not encountered any further issues with workflow delays, so we are now closing this incident. Thank you for your patience if you were adversely affected. If you have any questions or concerns, please open a ticket by emailing support@iterable.com.
Posted Jul 09, 2019 - 10:14 PDT
Monitoring
Workflow queues have now cleared and we are back to operating as normal. We will continue to monitor this through the night and will take further action tomorrow to continue addressing the root cause of these delays. Next update by 8:30am Pacific Time, unless more information is identified prior.
Posted Jul 08, 2019 - 19:37 PDT
Update
Workflow queues for impacted customers continue to improve, shrinking by about 35% in the last hour. We are continuing to make adjustments to improve throughput. Next update by 7:30pm Pacific Time.
Posted Jul 08, 2019 - 17:28 PDT
Update
We are down to about ten customers still seeing workflow backups. Those queues are recovering and the engineering team is continuing to work on options for speeding up that recovery. Next update by 5pm Pacific Time.
Posted Jul 08, 2019 - 15:32 PDT
Update
Most workflow queues are now processing successfully, although there is still some residual backup. We're continuing to work on fixing a root cause of some of the backup to ensure the situation does not recur. Next update by 3:30pm Pacific Time.
Posted Jul 08, 2019 - 14:00 PDT
Identified
We've made some good progress in the last 30 minutes, hence the slight delay in updates. The workflow queue backups are continuing to process, although there are a lot of events to get through. We have identified why some workflows were delayed for other reasons (as mentioned in the previous update) and are working on deploying new code to fix that root cause. Next update by 1:45pm Pacific Time.
Posted Jul 08, 2019 - 12:19 PDT
Update
The team is making progress on identifying root causes for the delayed workflows. Most workflow delays are due to yesterday's ingestion issues, and the system is working through that backlog. A few workflows are delayed for other reasons, and those cases require additional investigation, which is ongoing. Next update by 12pm Pacific Time.
Posted Jul 08, 2019 - 10:34 PDT
Investigating
We're seeing some lingering issues from yesterday's data ingestion delays that are causing workflow processing delays for some customers. The engineering team is working on resolving the backups right now. We'll post another update by 10:30am Pacific Time.
Posted Jul 08, 2019 - 09:12 PDT