Distributed Saga and Resiliency of Microservices

Sagas are typically used for modeling long-lived transactions like those involved in workflows. It is not advisable to use two-phase transaction protocols to control long-lived transactions since the locking of resources for prolonged durations across trust boundaries is not practical, rather is not at all advisable. Sagas are similar to nested transactions. In a nested transaction, atomic transactions are embedded in other transactions. In sagas, each of these transactions has a corresponding compensating transaction. While a Saga proceeds with its steps, if any of the transactions in a saga fails, the compensating actions for each transaction that was successfully run previously will be invoked so as to nullify the effect of the previously successful transactions.

Modeling a Saga and setting up a Saga Infrastructure is rather straight forward in Local and simple deployments, however when we want to scale out in public cloud environments, and that too with multiple instances of the same type of microservice, there comes a new list of challenges. We will look at few of them in this discussion.