S2 - We are investigating an issue with Eptura Workplace SSO Timeouts
Incident Report for Eptura Workplace
Postmortem

Eptura Workplace detailed Root Cause Analysis | 05/08/2024 

S2 - Eptura Workplace SSO Timeouts 

 

We are truly grateful for your continued support and loyalty. We value your feedback and appreciate your patience as we worked to resolve this incident. 

 

Description: 

When accessing Eptura Workplace, both internal and external users received the following error: 502 A timeout occurred on federation.api.iofficeconnect.com

 

Type of Event: 

Outage for Customers who use Single Sign On (SSO) 

 

Services/Modules impacted: 

Production / SSO 

 

Timeline (Reported in MST):  

11:30am – Multiple customers reported being unable to access Eptura Workplace.  

11:59am – After initial investigation, the support team escalated a ticket to CloudOps for further troubleshooting. All customers were made aware of the S2 incident via the Status Page.  

12:16pm – The CloudOps team identified the issue and began working on a resolution.  

12:59pm – The fix was released to production. The Status Page was updated from Investigating to Monitoring.  

1:59pm – During the monitoring period, no additional reports were received, and customers began to confirm the fix.   

 

Total Duration of Event: 

2hrs 29mins 

 

Root Cause:  

Mesos Marathon experienced temporary downtime during a recent restart of the NGINX server. 

Remediation: 

Mesos Marathon was restored, and the NGINX server was restarted promptly once the recovery was complete. 

Preventative Action:  

The proxy settings that previously pointed to the Mesos Marathon service, which has since been decommissioned, have been updated. This change is already in place.

Posted May 17, 2024 - 15:39 UTC

Resolved
As we have not seen further service disruptions after the fix was implemented, we have moved to the Resolved Phase.
A Preliminary RCA will be posted to this incident within 2 business days. Please stay subscribed to the page to receive the post automatically.
Posted May 08, 2024 - 19:59 UTC
Monitoring
A fix has been implemented. We are moving into the Monitoring Phase for the next 60 minutes.
Posted May 08, 2024 - 19:00 UTC
Update
We are continuing to investigate this issue.
Posted May 08, 2024 - 18:03 UTC
Investigating
We are currently investigating an issue with iOffice. We will update you when we have more information.
Posted May 08, 2024 - 18:00 UTC
This incident affected: System Status.