Stash Platform unavailable
Resolved
Mar 4, 2026 at 12:14am UTC
Root Cause Analysis (RCA)
- Network Failure: A network disruption isolated the cluster, preventing it from processing requests.
- Detection Failure: The system's monitoring failed to trigger alerts because the Health Check Endpoint remained in a "Green/Healthy" state despite the underlying cluster isolation. This prevented the load balancer from rerouting traffic and notifying on-call engineers.
Remediation & Mitigation Actions
To prevent recurrence and ensure platform stability, the following actions are being implemented:
- Endpoint Calibration: We are refactoring the health check logic to ensure it accurately reflects network reachability.
- Cluster Leader Rotation: We are rotating the current cluster leaders to ensure no lingering state corruption exists and to decouple the affected nodes from the rest of the production environment.
Affected services
Created
Mar 3, 2026 at 10:01pm UTC
Sumary
The Stash Platform (https://stash.st-one.io/) experienced a total service outage from 2026-03-03 07:01 GMT to 2026-03-03 09:14 GMT.
Affected services