Preview and Snapshot Features in StreamSets Data Collector

Hello from your newly appointed community champion and technical evangelist here at StreamSets! My name is Dash Desai, and you will find me writing blog posts and cruising the community forums, answering questions about StreamSets Data Collector as well as learning from community members. I will also be presenting at meetups and conferences, so if you happen to be attending, please stop by and say hi. My first post for StreamSets, explaining the powerful Preview and Snapshot features in Data Collector, was inspired by one of the community members (thank you, Edward).

Introduction

When creating data pipelines for big data projects and working with a diverse set of structured, semi-structured, and unstructured data sources, it is imperative that you get a true sense of the data transformations at every stage, not just to ensure data integrity and data quality but also for debugging and audit trail purposes. So phrases like "Garbage in, garbage out", "Fail fast, fail often", and "Agile and iterative development" are also applicable to creating dataflow pipelines.