8 Best Big Data Tools in 2020

Today, the data a company gathers is a fundamental source of information for any business. Unfortunately, it is not easy to derive valuable insights from it.

The problems every data scientist deals with are the volume of data and its structure. Data has no value unless we process it. To do so, we need big data software that helps us transform and analyze that data.

Devs and Data, Part 1: Big Data on the Rise

This article is part of the Key Research Findings from the new DZone Guide to Big Data: Volume, Variety, and Velocity. 

Introduction

For this year’s big data survey, we received 459 responses with a 78% completion rate. Based on this sample size, we calculated the margin of error for the survey to be 5%. Using the data from these responses, we've put together an article on how various sub-fields of big data are on the rise and how devs are becoming more data-driven.
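As a rough sanity check (not part of the original guide), the 5% figure is consistent with the standard margin-of-error formula for a simple random sample, assuming a 95% confidence level and a worst-case proportion of 0.5; neither assumption is stated in the survey writeup.

```python
import math

# Rough margin-of-error check for the survey figure quoted above.
# Assumptions (not stated in the article): 95% confidence level (z = 1.96)
# and worst-case proportion p = 0.5; n is the 459 responses received.
n = 459
z = 1.96
p = 0.5

margin_of_error = z * math.sqrt(p * (1 - p) / n)
print(f"Margin of error: {margin_of_error:.1%}")  # ~4.6%, which rounds to the reported 5%
```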

Use Materialized Views to Turbo-Charge BI, Not Proprietary Middleware

Query performance has always been an issue in the world of business intelligence (BI), and many BI users would be happy to have their reports load and render more quickly. Traditionally, the best way to achieve this performance (short of buying a bigger database) has been to build and maintain aggregate tables at various levels that intercept certain groups of queries and avoid repeated queries over the same raw data. Many BI tools also pull data out of databases into their own memory, into “cubes” of some sort, and run analyses off of those extracts.
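As a minimal sketch of the aggregate-table approach, the example below uses Python's built-in sqlite3 module with a hypothetical sales fact table and a sales_daily_agg roll-up; both names are illustrative, not from the article. The idea carries over to any warehouse: pre-compute the summary once so dashboard queries read a handful of aggregated rows instead of re-scanning the raw data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw fact table that a BI dashboard would otherwise scan repeatedly.
conn.execute("CREATE TABLE sales (sale_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("2019-06-01", "EMEA", 120.0), ("2019-06-01", "AMER", 80.0),
     ("2019-06-02", "EMEA", 50.0)],
)

# Pre-aggregate once at the day/region grain; BI queries read this table instead.
conn.execute("""
    CREATE TABLE sales_daily_agg AS
    SELECT sale_date, region, SUM(amount) AS total_amount, COUNT(*) AS order_count
    FROM sales
    GROUP BY sale_date, region
""")

# The dashboard query now touches a few summary rows, not the raw fact table.
for row in conn.execute("SELECT * FROM sales_daily_agg ORDER BY sale_date, region"):
    print(row)
```

A real deployment would typically maintain one such roll-up per common query grain (daily, monthly, per region), which is exactly the maintenance burden discussed next.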

Downsides of Aggregates and Cubes

Both of these approaches have the major downside that the aggregate or cube must be maintained as new data arrives. In the past, that refresh was a daily event, but most warehouses are now stream-fed in near real time. It’s not practical to continuously rebuild aggregate tables or in-memory cubes every time a new row arrives or a historical row is updated.
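To make that maintenance cost concrete, here is a small self-contained sketch (again using sqlite3 and the same hypothetical table names) of the naive refresh strategy: every new row forces a full rebuild of the aggregate, which is workable as a nightly batch but untenable when rows stream in continuously.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, region TEXT, amount REAL)")

def refresh_daily_agg(conn):
    # Naive full rebuild: drop and recompute the entire aggregate.
    # Acceptable for a nightly batch load; far too expensive to run for
    # every row that arrives from a near-real-time stream.
    conn.execute("DROP TABLE IF EXISTS sales_daily_agg")
    conn.execute("""
        CREATE TABLE sales_daily_agg AS
        SELECT sale_date, region, SUM(amount) AS total_amount
        FROM sales
        GROUP BY sale_date, region
    """)

# Simulate a stream: each new row triggers another full recomputation.
for row in [("2019-06-01", "EMEA", 120.0), ("2019-06-01", "AMER", 80.0),
            ("2019-06-02", "EMEA", 50.0)]:
    conn.execute("INSERT INTO sales VALUES (?, ?, ?)", row)
    refresh_daily_agg(conn)  # the whole GROUP BY re-runs for a single new row

print(list(conn.execute("SELECT * FROM sales_daily_agg ORDER BY sale_date, region")))
```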