Making Your Data Flow Resiliently With Apache NiFi Clustering

Introduction 

In a previous article, we covered the factors, relevant to both the infrastructure and the application, that need to be taken into account when evaluating the placement and performance of a workload within an edge computing environment. These data points included standard measurements such as network bandwidth, CPU and RAM utilization, and disk I/O performance, as well as more transient items such as adjacent services and resource availability.

Each of these data points is a critical input for operating an efficient edge computing cloud environment and ensuring the overall health of its applications. In this article, we’ll touch on some of the challenges encountered when collecting data and transforming it into a format suitable for analytics, and on how to construct a resilient data flow that ensures data continuity.