AWS CloudWatch + yCrash = Monitoring + RCA

We had an outage in our online application, GCeasy, on the morning of Monday, Oct 11, 2021 (PST). When customers uploaded their Garbage Collection logs for analysis, the application returned an HTTP 504 error. The HTTP 504 (Gateway Timeout) status code indicates that a gateway or proxy did not receive a timely response from an upstream server; in other words, transactions were timing out. In this post, we document our journey to identify the root cause of the problem.

Application Stack

Here are the primary components of the application's technology stack: