Drilling Into Big Data — Data Preparation

We have already introduced you to Hadoop and clusters in our previous article. At this point, you may be wondering why we need a multi-node cluster. Since most of the services will be running on the master, why don't we just create a single-node cluster instead?

There are two main reasons for using a multi-node cluster. First, the amount of data to be stored and processed can be too large for a single node to handle. Second, a single-node cluster is limited to the computational power of one machine, whereas a multi-node cluster can spread the work across several.