How to Not Leap in Time Using Python

If you want to display the time to a user of your application, you query the time of day. However, if your application needs to measure elapsed time, you need a timer that will give the right answer even if the user changes the time on the system clock.

The system clock, which tells the time of day, is referred to as a real-time clock or a wall clock. The time on such a clock will jump when changed. Relying on the wall clock to find out how much time has passed since a previous event is a bug waiting to happen.
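
As a minimal sketch of the alternative, Python's standard library offers `time.monotonic()`, a clock that cannot go backwards and is not affected by changes to the system clock. The helper name `measure_elapsed` and the 50 ms sleep are illustrative assumptions, not something taken from the original post.

```python
import time


def measure_elapsed(work):
    """Return the seconds `work()` took, measured on the monotonic clock.

    `measure_elapsed` is a hypothetical helper used only for illustration.
    """
    start = time.monotonic()  # reference point; only differences are meaningful
    work()
    return time.monotonic() - start


# Example workload (an assumption): sleep for roughly 50 milliseconds.
elapsed = measure_elapsed(lambda: time.sleep(0.05))
print(f"elapsed: {elapsed:.3f} s")

# The clock reports whether it can be adjusted by the system or an admin.
print("adjustable:", time.get_clock_info("monotonic").adjustable)
```

Taking the same difference with `time.time()` could give a wrong or even negative result if the wall clock were changed between the two readings.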

Big Data File Formats Explained

Apache Spark supports many different data formats, such as the ubiquitous CSV format and web-friendly JSON format. Common formats used primarily for big data analytical purposes are Apache Parquet and Apache Avro.

In this post, we're going to cover the properties of these four formats (CSV, JSON, Parquet, and Avro) and how each is used with Apache Spark.