Ajitesh Kumar | The Blog Pros

October 28, 2020

How to Setup/Install MLFlow and Get Started

In this post, you will learn how to set up and install MLFlow right from your Jupyter Notebook and get started tracking your machine learning projects. This is especially helpful if you are running an enterprise-wide AI practice in which several data scientists work on different ML projects. MLFlow helps you track the scores of the different experiments run for each ML project.

Install MLFlow Using Jupyter Notebook

To install MLFlow and do a quick POC, you can get started right from within your Jupyter notebook. MLFlow can be installed with a single command: pip install mlflow. Within a Jupyter notebook, this is what you would do:

October 20, 2020

Adaline Explained With Python Example

In this post, you will learn the concepts of Adaline (ADAptive LInear NEuron), a machine learning algorithm, along with a Python example. As with the Perceptron, it is important to understand the concepts of Adaline because it forms a foundation for learning neural networks. Both the Perceptron and Adaline are useful for understanding how gradient descent can be used to learn the weights which, when combined with the input signals, are used to make predictions based on the output of a unit step function.
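As a rough illustration of the idea, here is a minimal plain-Python sketch of Adaline trained with batch gradient descent on the sum of squared errors. The function names (train_adaline, predict), the learning rate, the number of epochs, and the toy data are assumptions made for this sketch and are not taken from the post itself.

```python
import random


def train_adaline(X, y, eta=0.01, epochs=50, seed=1):
    """Learn Adaline weights with batch gradient descent (illustrative sketch)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    # w[0] is the bias; w[1:] are the feature weights, started near zero.
    w = [rng.gauss(0.0, 0.01) for _ in range(n_features + 1)]
    costs = []
    for _ in range(epochs):
        # Linear activation: the net input itself is the output during training.
        outputs = [w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)) for xi in X]
        errors = [yi - oi for yi, oi in zip(y, outputs)]
        # Record the sum of squared errors measured before this epoch's update.
        costs.append(0.5 * sum(e * e for e in errors))
        # Move every weight against the gradient of the squared-error cost.
        w[0] += eta * sum(errors)
        for j in range(n_features):
            w[j + 1] += eta * sum(e * xi[j] for e, xi in zip(errors, X))
    return w, costs


def predict(w, xi):
    """Apply a unit step function to the net input to get a class label."""
    net = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1 if net >= 0.0 else -1


# Hypothetical, linearly separable toy data with labels -1 and 1.
X = [[-2.0, -1.0], [-1.0, -1.5], [-1.5, -2.0], [1.0, 1.5], [2.0, 1.0], [1.5, 2.0]]
y = [-1, -1, -1, 1, 1, 1]
w, costs = train_adaline(X, y)
print([predict(w, xi) for xi in X])
```

The weights are updated from the continuous linear output, and the unit step function is applied only when making a prediction. That is the main difference from the Perceptron, which updates its weights from the thresholded output.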

Here are the topics covered in this post in relation to Adaline algorithm and its Python implementation:

October 9, 2020

NLTK Hello World Python Example

In this post, you will learn about getting started with natural language processing (NLP) using NLTK (Natural Language Toolkit), a platform for working with human languages in Python. The post is titled hello world because it helps you get started with NLTK while also learning some important aspects of processing language. In this post, the following will be covered:

Install/Set up NLTK
Common NLTK commands for language processing operations

Install/Set up NLTK

This is what you need to do to set up NLTK.

September 10, 2020

What, When, and How of Scatterplot Matrix in Python – Data Analytics

In this post, you will learn about the following in relation to the scatterplot matrix. Note that a scatterplot matrix is also called a pairplot. Later in this post, you will find a Python code example that uses a scatterplot matrix / pairplot (seaborn package).

What is a scatterplot matrix?
When to use a scatterplot matrix/pairplot?
How to use a scatterplot matrix in Python?

What Is a Scatterplot Matrix?

A scatter plot matrix is a matrix (or grid) of scatter plots in which each plot shows a different pair of variables. In other words, a scatter plot matrix represents the bivariate, or pairwise, relationships between variables, laid out in grid form. Here is a sample scatter plot matrix created using the Sklearn Iris dataset.

August 18, 2020

Imputing Missing Data Using Sklearn SimpleImputer

In this post, you will learn how to use Python's Sklearn SimpleImputer for imputing/replacing numerical and categorical missing data using different strategies. A related article posted some time back discusses the fillna method of the Pandas DataFrame. Here is the link: Replace missing values with mean, median and mode. Handling missing values is a key part of data preprocessing, so it is important for data scientists and machine learning engineers to learn different techniques for imputing/replacing numerical or categorical missing values with an appropriate value chosen by an appropriate strategy.
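To make the idea of an imputation strategy concrete, here is a small plain-Python sketch. It is not the Sklearn implementation: the impute function is a hypothetical helper written for illustration, missing values are assumed to be represented by None, and the strategy names are chosen to echo the mean, median, most-frequent, and constant strategies commonly offered by imputers.

```python
from collections import Counter
from statistics import mean, median


def impute(values, strategy="mean", fill_value=None):
    """Replace None entries in one column using the named strategy (sketch)."""
    # Only the observed (non-missing) values are used to compute the fill value.
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    elif strategy == "most_frequent":
        # Works for categorical data as well as numerical data.
        fill = Counter(observed).most_common(1)[0][0]
    elif strategy == "constant":
        fill = fill_value
    else:
        raise ValueError("unknown strategy: " + str(strategy))
    return [fill if v is None else v for v in values]


# Hypothetical columns: one numerical, one categorical.
print(impute([1.0, None, 3.0], strategy="mean"))
print(impute(["red", None, "blue", "red"], strategy="most_frequent"))
```

The mean and median strategies only make sense for numerical columns, while the most-frequent and constant strategies can be used for either numerical or categorical columns.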

The following topics will be covered in this post:

July 28, 2020

Decision Tree Classifier Python Code Example

In this post, you will learn how to train a decision tree classifier machine learning model using Python. The following points will be covered in this post:

What is a decision tree?
Decision tree python code sample

What Is a Decision Tree?

Simply speaking, the decision tree algorithm splits the data points at decision nodes, resulting in a tree structure. Each decision node represents a question on which the data is split further into two or more child nodes. The tree keeps growing until the data points at a given child node are pure (all of them belong to one class). The criterion for choosing the best decision question is the information gain. The diagram below represents a sample decision tree.
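The information gain of a candidate split can be sketched in a few lines of plain Python. This sketch assumes entropy as the impurity measure (Gini impurity is another common choice), and the function names and toy labels are made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical labels at a parent node and two candidate splits of them.
parent = ["yes", "yes", "no", "no"]
pure_split = [["yes", "yes"], ["no", "no"]]
mixed_split = [["yes", "no"], ["yes", "no"]]
print(information_gain(parent, pure_split))
print(information_gain(parent, mixed_split))
```

A split that produces pure child nodes removes all of the parent's entropy and so has the highest possible gain, while a split whose children are as mixed as the parent gains nothing. The tree-building algorithm prefers the question with the larger gain.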

July 10, 2019

The What, When, and Why of Regularization in Machine Learning

In this post, we will try to understand the following in relation to regularizing regression models in machine learning to achieve more accurate and stable models:

Background
What is regularization?
Why and when does one need to adopt/apply the regularization technique?

Background

At times, when you are building a multiple linear regression model, you use the least-squares method to estimate the coefficients (parameters) of the features. As a result, some of the following happens: