Capacity and Compliance in Hybrid Cloud, Multi-Tenant Big Data Platforms

As organizations realize how data-driven insights can empower their strategic decisions and increase their ROI, the focus is on building data lakes and data warehouses where all of their big data can be safely archived. That data can then power data engineering, data science, business analytics, and operational analytics initiatives that benefit the business by improving operational efficiency, reducing operating costs, and supporting better strategic decisions. However, the exponential growth in the data we consume and generate every day makes a well-structured approach toward capacity governance in the big data platform a necessity.

Introduction

Capacity governance and scalability engineering are interrelated disciplines: developing an appropriate scalability strategy for the big data platform requires a comprehensive understanding of compute and storage capacity demands, infrastructure supply, and the dynamics between them. Technical risk resolution and security compliance are equally important aspects of capacity governance.

How (and Why) to Move from Spark on YARN to Kubernetes

Apache Spark is among the most widely used open-source distributed computing frameworks because it allows data engineers to parallelize the processing of large amounts of data across a cluster of machines.

When it comes to data operations, Spark provides a tremendous advantage because its strengths align with what makes DataOps valuable: it is well suited to machine learning and AI workloads, it handles both batch and real-time processing at scale, and it is adept at operating within different types of environments.
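To make that concrete, here is a minimal PySpark sketch of a typical batch job. The file names and columns (events.json, user_id, timestamp) are illustrative assumptions, not taken from the article; the point is only how a few lines of DataFrame code are parallelized across a cluster.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Start (or reuse) a Spark session; on a cluster this is configured by spark-submit.
spark = SparkSession.builder.appName("batch-aggregation").getOrCreate()

# Load semi-structured event data into a distributed DataFrame.
events = spark.read.json("events.json")

# Aggregate in parallel across the cluster: events per user per day.
daily_counts = (
    events
    .groupBy("user_id", F.to_date("timestamp").alias("event_date"))
    .count()
)

# Persist the result for downstream analytics or ML feature pipelines.
daily_counts.write.mode("overwrite").parquet("daily_counts.parquet")

spark.stop()
```

The same application code can generally be submitted unchanged to a YARN or Kubernetes cluster via spark-submit, which is why the move discussed here is largely a scheduling and operations concern rather than a rewrite of Spark jobs.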

MapReduce and Yarn: Hadoop Processing Unit Part 1

In my previous article, HDFS Architecture and Functionality, I described Hadoop's filesystem. Today, we will be learning about its processing unit. There are two main mechanisms by which processing takes place in a Hadoop cluster: MapReduce and YARN. In a traditional system, the major focus is on bringing data to the processing unit. In Hadoop, the focus shifts toward bringing the processing power to the data to enable parallel processing. So, here, we will go through MapReduce and, in part two, YARN.

MapReduce

As the name suggests, processing mainly takes place in two steps: mapping and reducing. There is a single master (the Job Tracker) that controls job execution on multiple slaves (the Task Trackers). The Job Tracker accepts MapReduce jobs submitted by the client, pushes map and reduce tasks out to the Task Trackers, and monitors their status. The Task Trackers' major function is to run the map and reduce tasks; they also manage and store the intermediate output of those tasks.
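To make the two phases concrete, here is a minimal word-count sketch in Python in the spirit of a Hadoop Streaming job. The local shuffle simulation below is an illustrative assumption standing in for the framework's shuffle-and-sort step; it is not part of the original article.

```python
import sys
from itertools import groupby

def mapper(lines):
    # Map phase: emit a (word, 1) pair for every word in the input split.
    for line in lines:
        for word in line.strip().split():
            yield word, 1

def reducer(pairs):
    # Reduce phase: pairs arrive grouped and sorted by key; sum the counts per word.
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    # Local stand-in for the shuffle-and-sort the framework performs between phases.
    mapped = sorted(mapper(sys.stdin), key=lambda kv: kv[0])
    for word, total in reducer(mapped):
        print(f"{word}\t{total}")
```

Run locally by piping a text file into the script; on a real cluster, the mapper and reducer would be packaged as separate scripts and the Job Tracker would distribute the map and reduce tasks to the Task Trackers, with the framework handling the intermediate shuffle.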

Node.js and Yarn for Happy Local Package Development

This is not another praise piece for npm package management with Yarn, but rather a concise recipe for working with locally developed packages.

npm modules begin their lives when you init them on your local dev machine, but there comes a point when you want to test them out or simply use them with other Node.js projects you have.