25th of April - 15:00 UTC
The problem has been identified and mitigated. Our network is now stable, and services have resumed normal operation.
We apologize for any inconvenience and are taking steps to prevent future occurrences.
We are working on an RCA report, which we aim to share by the middle of next week.
If you have any questions, please reach out to support@redwood.com
24th of April - 23:30 UTC
The instability affecting RMJ/RMF environments and the Cloud Portal environment management console has been mitigated. Redwood will investigate the root cause and aims to provide a customer-facing summary of the parallel efforts that were undertaken and the actions that will be taken to prevent a recurrence.
24th of April - 15:30 UTC
We are planning to isolate and stabilize the Cloud Portal tomorrow (UK time). We will finalize our testing today to ensure a smooth transition for our customers with minimal business disruption.
24th of April - 10:00 UTC
At the moment we are investigating stabilization of the Cloud Portal as a critical global service across all products. We are working on a plan to migrate the Cloud Portal into a separate infrastructure. We are working diligently today to gain the highest confidence level to make this transition with minimal impact to customers.
23rd of April - 21:30 UTC
We have established a holding pattern for the night (UK). In the morning (UK), our first step will be to isolate the portal infrastructure in order to remove the portal from the transient instability within the Dublin cluster. We will evaluate the stability of the cluster after migrating the test portal and plan for a clean transition for the prod portal. Further updates will follow as we progress and gain more data and metrics.
23rd of April - 19:00 UTC
We are shutting down university environments in Dublin. This is to reduce the load on the network, which is struggling due to the issue we are seeing. These are the only environments we currently plan to shut down. All environments are still stable. We are just proactively reducing environments and network load where we can.
23rd of April - 17:00 UTC
Environments are still stable within Dublin with processes operating as normal.
We are still experiencing network complications, including cloud-portal monitoring, which occasionally shows environments as down when they are up.
We will post another update at 19:00 UTC at the latest.
23rd of April - 15:00 UTC
Environments are still stable within Dublin with processes operating as normal.
We are still experiencing network complications, including cloud-portal monitoring, which occasionally shows environments as down when they are up.
We will post another update at 17:00 UTC at the latest.
23rd of April - 14:00 UTC
Environments are still stable within Dublin with processes operating as normal.
We are still experiencing network complications, including cloud-portal monitoring, which occasionally shows environments as down when they are up.
We will post another update at 15:00 UTC at the latest.
23rd of April - 13:00 UTC
Environments are stable within Dublin with processes operating as normal.
We are experiencing network complications, including cloud-portal monitoring, which occasionally shows environments as down when they are up.
We will post another update at 14:00 UTC at the latest.
23rd of April - 12:00 UTC
We are reviewing whether any recently updated monitoring tools could be the cause, but we need to test before we bring them down. Specifically, we are investigating a CloudWatch monitoring container.
We will post another update at 13:00 UTC at the latest.
23rd of April - 11:00 UTC
We are still working on the issue and monitoring the network closely. Unfortunately, we still see intermittent connection issues.
We will post another update at 12:00 UTC at the latest.
23rd of April - 10:00 UTC
We continue to investigate the networking issue in the Dublin region, which is causing the intermittent connection issues. Although environments are sometimes shown as red in the dashboard, they are not actually down; rather, the status check to the environment intermittently fails.
We will post another update at 11:00 UTC at the latest.
23rd of April - 09:00 UTC
We are investigating a potential gradual overload of the network in the Dublin region and are taking steps to reduce load on the network.
We will post another update at 10:00 UTC at the latest.
23rd of April - 08:00 UTC
Our Cloud infrastructure team is still troubleshooting the issue, working to stabilize connectivity. We will post another update at 09:00 UTC at the latest.
23rd of April - 07:20 UTC
Dear customer,
We are currently investigating a problem that is causing intermittent connection issues to the Cloud Dashboard and/or your environments.
The next update to this KBA will be posted by 08:00 UTC at the latest.
Kind regards,
Redwood Customer Support.
Comments
16 comments
KBA updated
KBA updated
KBA updated
KBA updated
KBA updated
KBA updated
KBA updated
KBA updated
KBA updated
KBA updated
Hello RW team, our Prod environment access is still very unstable (Bad Gateway 502 issue).
Any fresh updates from your side here?
KBA updated
KBA updated
@Alexandra Gruneberg
Is there any update about when this will be resolved?
Will there be another maintenance window similar to last weekend's, and if so, on what day and at what time will it occur?
KBA updated
KBA updated