parallel programming | The Blog Pros

April 22, 2022

Speeding Up Large Collections Processing in Java

According to The Britannica Dictionary, the term collection designates

a group of interesting or beautiful objects brought together in order to show or study them.

January 3, 2021

Scala Futures: Concurrency Interpreted!

Futures let us run computations off the main thread and work with values that are still being computed in the background, or have yet to be computed, by attaching callbacks to them.

If you come from a Java background, you might be aware of java.util.concurrent.Future. There are several challenges in using this:
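One well-known limitation can be sketched in Java: a plain java.util.concurrent.Future offers no way to register a callback, so the caller obtains the result through a blocking get(). The sketch below contrasts that with CompletableFuture, which does allow mapping with a callback. The class name FutureSketch and the helper slowSquare are hypothetical, chosen only for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureSketch {
    // Hypothetical stand-in for a slow computation.
    static int slowSquare(int x) {
        return x * x;
    }

    // Plain Future: the follow-up step can only run after a blocking get().
    static int withPlainFuture(int x) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> pending = pool.submit(() -> slowSquare(x));
            return pending.get() + 1; // blocks the calling thread until done
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // CompletableFuture: the follow-up step is attached as a callback
    // and runs once the value is available, without blocking the caller.
    static CompletableFuture<Integer> withCallback(int x) {
        return CompletableFuture.supplyAsync(() -> slowSquare(x))
                                .thenApply(v -> v + 1);
    }

    public static void main(String[] args) {
        System.out.println(withPlainFuture(4));
        // join() is used here only so the demo can print the final value.
        System.out.println(withCallback(4).join());
    }
}
```

Both calls compute the same value; the difference is that the second expresses "square, then add one" as a chain of callbacks rather than as code that waits.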

October 30, 2020

Deep Dive Into Join Execution in Apache Spark

Join operations are often used in a typical data analytics flow to correlate two data sets. Apache Spark, being a unified analytics engine, provides a solid foundation for executing a wide variety of Join scenarios.

At a very high level, a Join operates on two input data sets by matching each record of one input data set against every record of the other. On finding a match or a non-match (as per a given condition), the Join operation outputs either an individual record from one of the two data sets or a joined record. The joined record is the combination of the individual matched records from both data sets.
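The description above can be sketched in plain Java as a naive nested loop. This is only a conceptual illustration, not how Spark executes joins. The class name JoinSketch, the sample data, and the choice of the first array element as the join key are all assumptions. The inner join outputs a joined record for each match, while the left anti join outputs an individual left-side record on a non-match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JoinSketch {
    // Hypothetical sample data: each record is {key, value}.
    static final List<String[]> CUSTOMERS = Arrays.asList(
        new String[] {"1", "alice"},
        new String[] {"2", "bob"},
        new String[] {"3", "carol"});
    static final List<String[]> ORDERS = Arrays.asList(
        new String[] {"1", "book"},
        new String[] {"1", "pen"},
        new String[] {"3", "lamp"});

    // Inner join: for every matching pair, output a joined record.
    static List<String> innerJoin(List<String[]> left, List<String[]> right) {
        List<String> out = new ArrayList<>();
        for (String[] l : left) {
            for (String[] r : right) {
                if (l[0].equals(r[0])) {
                    out.add(l[1] + ":" + r[1]);
                }
            }
        }
        return out;
    }

    // Left anti join: output a left record only when no right record matches.
    static List<String> leftAntiJoin(List<String[]> left, List<String[]> right) {
        List<String> out = new ArrayList<>();
        for (String[] l : left) {
            boolean matched = false;
            for (String[] r : right) {
                if (l[0].equals(r[0])) {
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                out.add(l[1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(innerJoin(CUSTOMERS, ORDERS));    // [alice:book, alice:pen, carol:lamp]
        System.out.println(leftAntiJoin(CUSTOMERS, ORDERS)); // [bob]
    }
}
```

Comparing every record with every other record costs time proportional to the product of the two input sizes, which is why a real engine would normally prefer a cheaper strategy when the join condition allows one.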