Book Review: Designing Data-Intensive Applications (Part 3)

This is part 3 of a three-part review. You can find part 1 here and part 2 here.

10. Batch Processing

This is the first chapter in the part of the book that deals with derived data. The book distinguishes between systems of record, which hold the authoritative version of the data, and derived data systems. Data in a derived data system is existing data that has been transformed or processed in some way; a cache and a search index are two examples.
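To make the distinction concrete, here is a minimal sketch in Python. It is not taken from the book; the `records` data and the `build_search_index` function are hypothetical names made up for illustration. The dictionary plays the role of the system of record, and the inverted index is derived data that can be thrown away and rebuilt from it at any time.

```python
from collections import defaultdict

# System of record: the authoritative copy of the data (made-up example data).
records = {
    1: "batch processing with mapreduce",
    2: "stream processing with logs",
}

def build_search_index(records):
    """Derive an inverted index (word -> set of record ids) from the records."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in text.split():
            index[word].add(record_id)
    return dict(index)

# Derived data: losing it costs only the time needed to recompute it.
index = build_search_index(records)
```

If the index is lost or becomes stale, calling `build_search_index(records)` again recreates it, which is the property that separates derived data from the system of record.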

Book Review: Designing Data-Intensive Applications (Part 2)

This is part 2 of a three-part review. You can find part 1 here.

5. Replication

This is the first chapter in the Distributed Data section. Replication means that the same data is stored on multiple machines. Some reasons for replication are: to keep the system working even if some parts of it fail (increased availability), to keep data geographically close to the users (reduced latency), and to increase read throughput by serving the same data from many machines.
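As a rough illustration of two of these benefits, availability and read throughput, here is a toy sketch in Python. It is not from the book: the `ReplicatedStore` class and its methods are hypothetical, the "machines" are just in-memory dictionaries, and writes are copied to every live replica synchronously. It also ignores the hard part, which is how a failed replica catches up on the writes it missed.

```python
import itertools

class ReplicatedStore:
    """Toy store keeping a full copy of the data on every replica (in-memory dicts)."""

    def __init__(self, replica_count):
        self.replicas = [dict() for _ in range(replica_count)]
        self.alive = [True] * replica_count
        self._next = itertools.cycle(range(replica_count))

    def write(self, key, value):
        # Copy the write to every live replica (synchronous, for simplicity).
        for i, replica in enumerate(self.replicas):
            if self.alive[i]:
                replica[key] = value

    def read(self, key):
        # Spread reads round-robin over live replicas to share the read load.
        for _ in range(len(self.replicas)):
            i = next(self._next)
            if self.alive[i]:
                return self.replicas[i][key]
        raise RuntimeError("no live replica available")

store = ReplicatedStore(3)
store.write("user:1", "Alice")
store.alive[0] = False  # simulate one machine failing
value = store.read("user:1")  # still served by one of the remaining replicas
```

Because every replica holds the same data, the read succeeds even though one replica is marked as failed, and successive reads rotate across the replicas that are still alive.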