Accumulator and Broadcast Variables in Spark

At a high level, accumulators and broadcast variables are both Spark shared variables. To see why they exist, it helps to understand closures in distributed computing. The scope and life cycle of variables and methods when code runs on a cluster is a common source of confusion for programmers. A closure that references an object Spark cannot serialize typically fails with:

org.apache.spark.SparkException: Job aborted due to stage failure: 
Task not serializable: java.io.NotSerializableException: ...
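
The error above can be reproduced in miniature without a cluster. Spark serializes a task's closure with Java serialization before sending it to the executors, so the sketch below serializes two lambdas by hand using only the JDK. It is an illustration under stated assumptions, not Spark code: the class ClosureSerializationDemo, the helper Multiplier, and the method names are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class ClosureSerializationDemo {

    // Hypothetical helper that does NOT implement Serializable.
    public static class Multiplier {
        final int factor;
        Multiplier(int factor) { this.factor = factor; }
        int apply(int x) { return x * factor; }
    }

    // A function type that is also Serializable, so lambdas targeting it
    // can be written with Java serialization.
    public interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    // Returns "ok" if the object serializes, otherwise a short description
    // of the exception that was thrown.
    public static String trySerialize(Object closure) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(closure);
            return "ok";
        } catch (NotSerializableException e) {
            return "NotSerializableException: " + e.getMessage();
        } catch (IOException e) {
            return "IOException: " + e.getMessage();
        }
    }

    // The lambda captures the Multiplier object, so serializing the lambda
    // also tries to serialize the Multiplier.
    public static SerializableFunction<Integer, Integer> capturingObject() {
        Multiplier m = new Multiplier(3);
        return x -> m.apply(x);
    }

    // The lambda captures only an int copied out of the object.
    public static SerializableFunction<Integer, Integer> capturingPrimitive() {
        int factor = new Multiplier(3).factor;
        return x -> x * factor;
    }

    public static void main(String[] args) {
        System.out.println(trySerialize(capturingObject()));
        System.out.println(trySerialize(capturingPrimitive()));
    }
}
```

The first call reports a NotSerializableException that names the Multiplier class, and the second succeeds. The same reasoning applies to a Spark closure: either make the captured object serializable, or copy the values the task needs into local variables so that only those are captured.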

Accumulator 101

Motivation

GSQL is a Turing-complete graph database query language. Compared to other graph query languages, its biggest advantage is its support for accumulators, which can be global or attached to each vertex.

In addition to the classic pattern-match syntax, which is easy to master, GSQL supports powerful runtime vertex attributes (a.k.a. local accumulators) and global state variables (a.k.a. global accumulators). I have seen users learn and adopt the pattern-match syntax within ten minutes. However, I have also seen beginners struggle to learn and adopt accumulators.