Getting Started With Pandas: Lesson 4

Introduction

We begin the fourth and final article of our Pandas training series. This article summarizes the Pandas functions used to handle missing data. Dealing with missing data is a routine challenge in day-to-day data science work, and it has a direct impact on algorithm performance.

Missing Data

Before we start, let's look at the example dataset we will use to explain the functions. We created it ourselves to cover several use cases, so that every example can be shown clearly. We will call it `uncompleted_data`.
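As a rough illustration of the kind of missing-data treatment the lesson covers, here is a minimal sketch. The contents of `uncompleted_data` below are a hypothetical stand-in (the real columns are not shown in this excerpt); `isna`, `dropna`, and `fillna` are standard Pandas methods.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the article's `uncompleted_data`;
# the column names and values are assumptions for illustration only.
uncompleted_data = pd.DataFrame(
    {
        "name": ["a", "b", None, "d"],
        "score": [1.0, np.nan, 3.0, np.nan],
        "count": [10, 20, 30, 40],
    }
)

# Count the missing values in each column.
missing_per_column = uncompleted_data.isna().sum()

# Keep only the rows that have no missing value at all.
rows_without_gaps = uncompleted_data.dropna()

# Replace missing scores with the mean of the observed scores.
filled_scores = uncompleted_data["score"].fillna(uncompleted_data["score"].mean())

print(missing_per_column)
print(rows_without_gaps)
print(filled_scores)
```

Whether to drop or fill depends on the data: dropping is simple but discards rows, while filling keeps them at the cost of introducing estimated values.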

Getting Started With Numpy

NumPy is a third-party library for numerical computing, optimized for working with single- and multi-dimensional arrays. Its primary type is the array type called ndarray. This library contains many routines for statistical analysis.
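A minimal sketch of what working with an ndarray looks like; the values are made up for illustration.

```python
import numpy as np

# A two-dimensional ndarray with 2 rows and 3 columns (illustrative values).
values = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Basic information about the array.
shape = values.shape   # number of rows and columns
dtype = values.dtype   # element type shared by every entry

# Statistical routines operate on the whole array or along an axis.
column_means = values.mean(axis=0)  # mean of each column
overall_std = values.std()          # standard deviation of all entries

print(shape, dtype)
print(column_means)
print(overall_std)
```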

Creating, Getting Info, Selecting, and Util Functions

The Wine Quality dataset, compiled by Cortez et al. in 2009 and available from the UCI Machine Learning Repository, is a well-known dataset that contains wine quality information. It includes data about the physicochemical properties of red and white wines, along with a quality score.

Cooperative Multi-Agent Reinforcement Learning and QMIX at NeurIPS 2021

Authors: Gema Parreño, David Suarez (Apiumhub), with thanks to: Alberto Hernandez (BBVA Innovation Labs).

This post introduces cooperative multi-agent reinforcement learning (MARL) and walks through the innovations from S. Whiteson's lab, starting with QMIX (2018), along with their current contributions to NeurIPS 2021. The article assumes some familiarity with the fundamentals of reinforcement learning.

Getting Started With Pandas

Today we introduce one of the first chapters of our internal training on the fundamental tools of data science: Pandas, NumPy, and Matplotlib. Pandas is a third-party library for data analysis and manipulation, built on top of NumPy. It excels at handling labeled one-dimensional (1D) data with Series objects and two-dimensional (2D) data with DataFrame objects.
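The two object types mentioned above can be sketched as follows. The labels, column names, and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical row labels for three wines.
labels = ["wine_a", "wine_b", "wine_c"]

# A Series holds labeled one-dimensional data.
acidity = pd.Series([7.4, 7.8, 6.3], index=labels, name="acidity")

# A DataFrame holds labeled two-dimensional data; each column is a Series.
wines = pd.DataFrame(
    {"acidity": [7.4, 7.8, 6.3], "quality": [5, 5, 6]},
    index=labels,
)

one_value = acidity["wine_b"]   # look up a single element by its label
one_column = wines["quality"]   # selecting a column returns a Series

print(one_value)
print(one_column)
```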

NumPy is a third-party library for numerical computing, optimized for working with single- and multi-dimensional arrays. Its primary type is the array type called ndarray. This library contains many routines for statistical analysis.

Getting Started With Pandas – Lesson 2

Introduction

We continue with the second post of our Pandas training series. This article summarizes the Pandas functions used for indexing, selection, and filtering.

Indexing, Selecting, and Filtering

Before we start, let's preview the teaching dataset that we will use for the examples. It is a well-known dataset that contains wine information.
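To give a flavor of indexing, selection, and filtering, here is a small sketch on a tiny made-up table. The column names `alcohol` and `quality` and the row labels are assumptions for illustration, not the actual dataset.

```python
import pandas as pd

# Tiny hypothetical wine table; labels and values are illustrative only.
wines = pd.DataFrame(
    {"alcohol": [9.4, 9.8, 11.2, 12.5], "quality": [5, 5, 6, 7]},
    index=["w1", "w2", "w3", "w4"],
)

# Label-based indexing: row "w2", column "alcohol".
by_label = wines.loc["w2", "alcohol"]

# Position-based indexing: first row, second column.
by_position = wines.iloc[0, 1]

# Filtering with a boolean mask: keep wines with quality of at least 6.
good_wines = wines[wines["quality"] >= 6]

print(by_label, by_position)
print(good_wines)
```

`loc` works with labels and `iloc` with integer positions; mixing them up is a common source of off-by-one mistakes when the index itself is numeric.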

Getting Started With Pandas – Lesson 3

Introduction

We continue with the third post of our data science training series with Pandas. This article summarizes the Pandas functions used for iteration, mapping, grouping, and sorting. These functions let us transform the data to extract useful information and insights.

Iteration, Maps, Grouping, and Sorting

The Wine Quality dataset, compiled by Cortez et al. in 2009 and available from the UCI Machine Learning Repository, is a well-known dataset that contains wine quality information. It includes data about the physicochemical properties of red and white wines, along with a quality score.
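A minimal sketch of iteration, mapping, grouping, and sorting on a wine-like table. The column names (`color`, `alcohol`, `quality`) and values are hypothetical stand-ins rather than the real dataset.

```python
import pandas as pd

# Small hypothetical sample with wine-like columns (illustrative values).
wines = pd.DataFrame(
    {
        "color": ["red", "white", "red", "white"],
        "alcohol": [9.4, 10.1, 11.2, 12.5],
        "quality": [5, 6, 6, 7],
    }
)

# Iteration: walk over the rows as lightweight named tuples.
labels = [f"{row.color}:{row.quality}" for row in wines.itertuples(index=False)]

# Map: translate each value of a column through a dictionary.
is_red = wines["color"].map({"red": True, "white": False})

# Grouping: average quality per wine color.
mean_quality = wines.groupby("color")["quality"].mean()

# Sorting: strongest wines first.
strongest_first = wines.sort_values("alcohol", ascending=False)

print(labels)
print(is_red)
print(mean_quality)
print(strongest_first)
```

Row-by-row iteration is convenient for inspection, but vectorized operations such as `map` and `groupby` are usually the faster choice on larger tables.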