data mining | The Blog Pros

March 17, 2022

Unsupervised Learning in Data Mining: Apriori Algorithm

This post shares what I know about unsupervised learning in data mining through the simplest algorithm, Apriori, which we used to generate association rules that identify related grocery items customers bought through our e-commerce application and retail stores.

Before jumping ahead, let’s go over a few terms that I will be using in this article.
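As a rough illustration of the kind of terms an association-rule post usually defines, here is a minimal Python sketch of support and confidence. The excerpt does not name these terms or show any data, so the baskets and the function names `support`, `confidence`, and `frequent_pairs` are assumptions made up for this example. It only counts item pairs; a full Apriori implementation grows itemsets level by level and prunes candidates that contain an infrequent subset.

```python
from itertools import combinations

# Hypothetical grocery baskets, invented purely for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def frequent_pairs(transactions, min_support):
    """Item pairs whose support is at least min_support (pairs only, no pruning)."""
    items = sorted(set().union(*transactions))
    return {
        pair: support(pair, transactions)
        for pair in combinations(items, 2)
        if support(pair, transactions) >= min_support
    }

pairs = frequent_pairs(transactions, min_support=0.5)
butter_to_bread = confidence({"butter"}, {"bread"}, transactions)
```

With these made-up baskets, every basket containing butter also contains bread, so the rule "butter implies bread" has confidence 1.0, while the reverse rule is weaker.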

May 7, 2021

Data Mining: Use Cases, Benefits, and Tools

In the last decade, advances in processing power and speed have allowed us to move from tedious and time-consuming manual practices to fast and easy automated data analysis. The more complex the data sets collected, the greater the potential to uncover relevant information. Retailers, banks, manufacturers, healthcare companies, etc., are using data mining to uncover the relationships between everything from price optimization, promotions, and demographics to how economics, risk, competition, and online presence affect their business models, revenues, operations, and customer relationships. Today, data scientists have become indispensable to organizations around the world as companies seek to achieve bigger goals than ever before with data science. In this article, you will learn about the main use cases of data mining and how it has opened up a world of possibilities for businesses.

Today, organizations have access to more data than ever before. However, making sense of these huge volumes of structured and unstructured data in order to implement improvements across the organization can be extremely difficult.

April 9, 2021

Top 21 Data Mining Tools

Data mining is a world in itself, which is why it can easily get very confusing. There is an incredible number of data mining tools available on the market. However, while some might be more suitable for handling data mining in Big Data, others stand out for their data visualization features.

As explained in this article, data mining is about discovering patterns in data and predicting trends and behaviors. Simply put, it is the process of converting vast sets of data into relevant information. There is not much use in having massive amounts of data if we do not actually know what it means.

February 2, 2021

What Happened When PVS-Studio Checked ELKI in January

If you feel like the New Year just came, and you missed the first half of January, then all this time you've been busy looking for tricky bugs in the code you maintain. It also means that our article is what you need. PVS-Studio has checked the ELKI open source project to show you errors that may occur in the code, how cunningly they can hide there, and how you can deal with them.

What Kind of Library Is ELKI?

The abbreviation ELKI stands for Environment for DeveLoping KDD-Applications Supported by Index-Structures. This project is written in Java and is designed for data mining. Most users of this library are students, researchers, data scientists, and software engineers. No wonder, since this library was developed for research only.

January 27, 2020

Data Exploration and Data Preparation for Business Insights

What Is Data Exploration?

Data exploration, or exploratory data analysis (EDA), provides a simple set of tools that bring a basic understanding of real-time data into data analytics. The outcomes of data exploration can be a powerful aid in understanding the structure of the data, the distribution of values, and interrelationships. Data exploration can also help data scientists gain insights into business data that were not easily seen previously.

January 2, 2020

Listener Log Data Mining With SQL

If you take a look at the log files created by the listener, there is obviously a nice wealth of information in there. We get service updates, connections, etc., all of which might be useful, particularly in terms of auditing security.

However, it also is in a fairly loose text format, which means ideally I’d like to utilize the power of SQL to mine the data.
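To make the idea concrete, here is a small Python sketch that loads a few log lines into an in-memory SQLite table and queries them with SQL. This swaps in SQLite for illustration and is not necessarily the approach the post takes. The sample lines are invented, the asterisk-separated layout is an assumption about the log format, and `parse_line`, `load`, and the `listener_log` table are hypothetical names.

```python
import sqlite3

# Hypothetical log lines, invented for illustration; real listener logs
# vary by version and configuration.
LOG_LINES = [
    "02-JAN-2020 10:15:42 * (CONNECT_DATA=(SERVICE_NAME=sales)) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.0.0.5)(PORT=51234)) * establish * sales * 0",
    "02-JAN-2020 10:16:03 * (CONNECT_DATA=(SERVICE_NAME=sales)) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.0.0.7)(PORT=51240)) * establish * sales * 0",
    "02-JAN-2020 10:17:20 * (CONNECT_DATA=(SERVICE_NAME=hr)) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.0.0.5)(PORT=51251)) * establish * hr * 12514",
]

def parse_line(line):
    """Split one asterisk-delimited line into (timestamp, event, service, return_code)."""
    fields = [f.strip() for f in line.split(" * ")]
    # Assumed layout: timestamp first; event, service, return code last.
    return fields[0], fields[-3], fields[-2], int(fields[-1])

def load(lines):
    """Create an in-memory table and insert one row per parsed line."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE listener_log "
        "(ts TEXT, event TEXT, service TEXT, return_code INTEGER)"
    )
    conn.executemany(
        "INSERT INTO listener_log VALUES (?, ?, ?, ?)",
        [parse_line(line) for line in lines],
    )
    return conn

conn = load(LOG_LINES)

# Connections per service.
per_service = conn.execute(
    "SELECT service, COUNT(*) FROM listener_log GROUP BY service ORDER BY service"
).fetchall()

# Attempts that ended with a non-zero return code.
failed = conn.execute(
    "SELECT COUNT(*) FROM listener_log WHERE return_code <> 0"
).fetchone()[0]
```

Once the loose text is in a table, auditing questions such as "which services are being requested" or "how many attempts failed" become ordinary queries.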

December 27, 2019

How AI Is Saving Lives and Stopping Human Trafficking

Human trafficking is a horrific crime that impacts between two and four million victims. The impact of modern-day slavery is far-reaching and affects families across the globe. Social scientists, developers, and law enforcement are working together to cut down on the number of people victimized by human trafficking.

Everyone involved was at a loss when trying to track down and arrest these criminals. Worse yet, the trafficker would likely only get charged with one crime, even if they were running a trafficking empire.

November 1, 2019

Splitting Lines and Numbering the Pieces

As I mentioned in my computational survivalist post, I’m working on a project where I have a dedicated computer with little more than basic Unix tools, ported to Windows. It’s given me a new appreciation for how the standard Unix tools fit together; I’ve had to rely on them for tasks I’d usually do a different way.

I’d seen the nl command before for numbering lines, but I thought, “Why would you ever want to do that? If you want to see line numbers, use your editor.” That way of thinking looks at the tools one at a time, asking what each can do, rather than thinking about how they might work together.
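As a rough analogue of combining a splitting step with a numbering step, the Python sketch below splits one line on a delimiter and numbers the pieces. The excerpt does not show the actual commands, so the function name `number_pieces`, the comma delimiter, and the sample input are all assumptions for illustration.

```python
def number_pieces(line, sep=","):
    """Split line on sep and pair each piece with a 1-based index."""
    return list(enumerate(line.split(sep), start=1))

# Hypothetical input line, invented for this example.
for index, piece in number_pieces("alpha,beta,gamma"):
    print(f"{index}\t{piece}")
```

The numbering step knows nothing about how the pieces were produced, which is the same separation of concerns the paragraph above describes for tools that work together.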

April 19, 2019 | August 12, 2019

What Are the Major Advantages of Using a Graph Database?

A graph database is a data management system whose building blocks are vertices and edges. To put it in a more familiar context, a relational database is also a data management system, one in which the building blocks are tables. Both require loading data into the software and using a query language or APIs to access the data.

Relational databases boomed in the 1980s. Many commercial companies (e.g., Oracle, Ingres, IBM) backed the relational model (tabular organization) of data management. In that era, the main data management need was to generate reports.

February 22, 2019

Text Mining 101: What it Is and How it Works

The modern world generates enormous amounts of data, and the volume is growing year by year. Data has become the most valuable managerial resource for gaining a competitive edge and driving knowledge management initiatives. Manual data processing and classification have become costly and ineffective, so they must either be automated entirely or applied only after the important data has been selected automatically from the total.

Text mining is essentially the automated process of deriving high-quality information from text. Its main difference from other types of data analysis is that the input data is not formalized in any way, which means it cannot be described with a simple mathematical function.

February 1, 2019

What Is Data Mining?

Everyone wants an edge. And in the digital age of business, the greatest strategic advantage comes from slicing, dicing, and analyzing data from every possible angle.

Data mining is the automated process of sorting through huge data sets to identify trends and patterns and establish relationships. And as enterprise data proliferates — now over 2.5 quintillion bytes per day — it'll continue to play an increasingly important role in the way businesses plan their operations and address challenges in the future.