Apache Spark for the Impatient

Below are the most important Spark topics to work through before you start. The list is for anyone who lacks the time to read an entire book but still wants to discover the power of this distributed computing framework.

Architecture

Spark Architecture Diagram

How to Unit Test Classes Which Create New Objects

Learn how to conduct effective unit tests.

First, a disclaimer: I am a strong proponent of the simple factory idiom and, by extension, the Factory Method pattern, rather than creating objects inside classes. The factory idiom helps insulate your code from change, which keeps it in line with the open/closed principle of object-oriented programming: open for extension, closed for modification.


Also, the idea is to write testable code, not to write a hack to test code.
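
To make the factory idea concrete, here is a minimal sketch in Python. Every name in it (`WelcomeService`, `SmtpMailer`, `FakeMailer`, `mailer_factory`) is hypothetical and chosen only for illustration; none of them comes from the article. The class under test asks a factory for its collaborator instead of constructing it directly, so a test can supply a fake without any mocking hack.

```python
from typing import Callable, Protocol


class Mailer(Protocol):
    """Interface the service depends on (hypothetical, for illustration)."""

    def send(self, to: str, body: str) -> bool: ...


class SmtpMailer:
    """Stand-in for a real collaborator that would talk to a mail server."""

    def send(self, to: str, body: str) -> bool:
        raise RuntimeError("no network access in unit tests")


class WelcomeService:
    """Obtains its mailer from a factory instead of creating one itself."""

    def __init__(self, mailer_factory: Callable[[], Mailer] = SmtpMailer) -> None:
        # Assumption: any zero-argument callable returning a Mailer will do.
        self._mailer_factory = mailer_factory

    def welcome(self, user: str) -> bool:
        mailer = self._mailer_factory()  # creation is delegated to the factory
        return mailer.send(user, f"Welcome, {user}!")


class FakeMailer:
    """Test double that records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, to: str, body: str) -> bool:
        self.sent.append((to, body))
        return True


if __name__ == "__main__":
    fake = FakeMailer()
    service = WelcomeService(mailer_factory=lambda: fake)
    assert service.welcome("ada")
    assert fake.sent == [("ada", "Welcome, ada!")]
```

Had `WelcomeService.welcome` called `SmtpMailer()` directly, the test would need to patch the class to avoid the real collaborator. With the factory as a constructor parameter, the production default stays in place and the test simply passes a different factory.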

Git Branching: Don’t (Always) Follow the Best Practices

A debatable headline, I agree, but I will explain what it means. I hope that by the end of this short article you will have enough criteria to choose your own branching and versioning strategy, rather than basing the decision purely on best practices.

Gitflow has become the de facto branching strategy for most, if not all, modern companies, and there is not a shred of doubt that it works wonders if properly adhered to. Gitflow is an exhaustive model that encompasses the branching needs of products following varied software development lifecycles, be it a bi-weekly release cycle or a half-yearly one. That is where my concern lies: it covers everything. In my opinion, this 'one size fits all' branching model does not work in every situation; it can create process barriers and actually slow the team down.