Slow journey performance and user API endpoints impacted on Cluster 6
Incident Report for Iterable
Resolved
After a period of monitoring, the C6 cluster has reached a stable state, and this issue has been resolved.
Posted Oct 05, 2023 - 15:40 PDT
Monitoring
The team has identified the root cause of the high CPU usage on the C6 cluster: the performance of certain queries. We have addressed the issue by excluding those queries from being used for insights, and we have follow-up action items to optimize the queries as a long-term resolution. As a result, the C6 cluster is healthy again and has recovered from the high CPU usage. We will continue to monitor the cluster for some time to ensure its stability.
Posted Oct 05, 2023 - 13:08 PDT
Update
The engineering team is actively investigating the root cause of the high CPU load on C6, which is causing slow performance in Journeys and user data API calls. This slowness affects customers on the C6 cluster. Our next update will be at 1:30 PM PDT.
Posted Oct 05, 2023 - 12:27 PDT
Investigating
Customers on Cluster 6 may be experiencing delays with journey performance, as well as 5XX errors on API calls to user endpoints. Our engineering team is working to identify the cause. Emails are still being processed. The next update will be at 12:30 PM PDT.
Posted Oct 05, 2023 - 11:24 PDT
This incident affected: Cluster 6 (Workflow Processing).