GitHub Is Bad for AI: Solving the ML Reproducibility Crisis

There is a crisis in machine learning that is preventing the field from progressing as fast as it could. It stems from a broader predicament surrounding reproducibility that impacts scientific research in general. A Nature survey of 1,500 scientists revealed that 70% of researchers have tried and failed to reproduce another scientist’s experiments, and over 50% have failed to reproduce their own work. Reproducibility, also called replicability, is a core principle of the scientific method and helps ensure the results of a given study aren’t a one-off occurrence but instead represent a replicable observation.

In computer science, reproducibility has a more narrow definition: Any results should be documented by making all data and code available so that the computations can be executed again with the same results. Unfortunately, artificial intelligence (AI) and machine learning (ML) are off to a rocky start when it comes to transparency and reproducibility. For example, take this response published in Nature by 31 scientists that are highly critical of a study from Google Health that documented successful trials of AI that detects signs of breast cancer. 

How to Automate Hyperparameter Optimization

A Beginner’s Guide to Using Bayesian Optimization With Scikit-Optimize

In the machine learning and deep learning paradigm, model “parameters” and “hyperparameters” are two frequently used terms where “parameters” define configuration variables that are internal to the model and whose values can be estimated from the training data and “hyperparameters” define configuration variables that are external to the model and whose values cannot be estimated from the training data (What is the Difference Between a Parameter and a Hyperparameter? ). Thus, the hyperparameter values need to be manually assigned by the practitioner.

Every machine learning and deep learning model that we make has a different set of hyperparameter values that need to be fine-tuned to be able to obtain a satisfactory result. Compared to machine learning models, deep learning models tend to have a larger number of hyperparameters that need optimizing in order to get the desired predictions due to its architectural complexity over typical machine learning models.