XGBoost: A Deep Dive Into Boosting

Every day we hear about breakthroughs in artificial intelligence. But have you ever wondered what challenges it faces?

Challenges arise in highly unstructured data such as DNA sequencing, credit card transactions, and even cybersecurity, the backbone of keeping our online presence safe from fraudsters. Does this make you yearn to know more about the science and reasoning behind these systems? Don't worry, we've got you covered. Machine learning (ML) provides solutions to these problems through Gradient Boosting Machines (GBMs). We have ample algorithms to choose from for gradient boosting on our training data, but we still run into issues such as poor accuracy, high loss, and large variance in the results.

Here, we introduce you to XGBoost, a state-of-the-art machine learning algorithm built by Tianqi Chen that not only overcomes these issues but also performs exceptionally well on regression and classification problems. This blog will help you discover the insights, techniques, and skills with XGBoost that you can then bring to your machine learning projects.
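To make this concrete, here is a minimal sketch of training an XGBoost classifier through its scikit-learn-style API. The dataset (scikit-learn's built-in breast cancer data) and the parameter values are illustrative assumptions, not choices made in the original post.

```python
# Minimal sketch: training an XGBoost classifier via the scikit-learn API.
# The dataset and hyperparameter values are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

# Any tabular classification dataset works; this one ships with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# A few common hyperparameters; the defaults are reasonable starting points.
model = XGBClassifier(
    n_estimators=100,     # number of boosting rounds (trees)
    max_depth=3,          # depth of each tree
    learning_rate=0.1,    # shrinkage applied to each tree's contribution
    eval_metric="logloss",
)
model.fit(X_train, y_train)

preds = model.predict(X_test)
print(f"Test accuracy: {accuracy_score(y_test, preds):.3f}")
```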

Selecting Optimal Parameters for XGBoost Model Training

There is always a bit of luck involved in selecting parameters for machine learning model training. Lately, I have been working with gradient boosted trees, and XGBoost in particular. We use XGBoost in the enterprise to automate repetitive human tasks. While training ML models with XGBoost, I developed a pattern for choosing parameters that helps me build new models more quickly. I will share it in this post, and hopefully you will find it useful too.
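The post does not spell out the pattern at this point, but one common way to systematize parameter selection is a cross-validated grid search. The sketch below shows that approach; the grid values and the synthetic dataset are assumptions for illustration, not the author's exact recipe.

```python
# Hypothetical sketch: selecting XGBoost parameters with a cross-validated
# grid search. The grid values below are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Small synthetic dataset so the example is self-contained and fast.
X, y = make_classification(n_samples=500, n_features=8, random_state=42)

param_grid = {
    "max_depth": [3, 5, 7],          # tree depth
    "learning_rate": [0.01, 0.1, 0.3],
    "n_estimators": [100, 300],      # boosting rounds
    "subsample": [0.8, 1.0],         # row sampling per tree
}

search = GridSearchCV(
    estimator=XGBClassifier(eval_metric="logloss"),
    param_grid=param_grid,
    scoring="accuracy",
    cv=5,        # 5-fold cross-validation
    n_jobs=-1,   # use all available cores
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best CV accuracy:", round(search.best_score_, 3))
```

Grid search is exhaustive and gets expensive quickly; randomized search or Bayesian optimization are common alternatives once the grid grows.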

I'm using the Pima Indians Diabetes Database for training. The CSV data can be downloaded from here.
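Assuming the CSV follows the common layout for this dataset (eight feature columns plus an Outcome label, with no header row), loading it might look like the sketch below. The file name is a placeholder for wherever you saved the download.

```python
# Sketch of loading the Pima Indians Diabetes CSV with pandas.
# Assumes the common layout: eight feature columns plus an outcome label,
# with no header row. The file path is a placeholder.
import pandas as pd

columns = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome",
]
df = pd.read_csv("pima-indians-diabetes.csv", header=None, names=columns)

X = df.drop(columns="Outcome")  # features
y = df["Outcome"]               # 1 = diabetic, 0 = not diabetic
print(df.shape, y.value_counts().to_dict())
```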