How GPT-Neo Can Be Used in Different Tasks

GPT-3 raised the bar for language models and revolutionized AI with its capacity to learn from only a few examples: GPT-3 is a few-shot learner. However, it is not open source, and access to OpenAI’s API is granted only upon request. So EleutherAI has been working on a similar model, named GPT-Neo.

GPT-Neo is a transformer-based language model whose architecture is nearly the same as GPT-3’s, and its results are roughly on par with the smaller GPT-3 variants. GPT-Neo is trained on the Pile dataset. Like GPT-3, GPT-Neo is a few-shot learner. And the advantage GPT-Neo has over GPT-3 is that it is an open-source model.
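As a quick illustration, here is a minimal sketch of generating text with GPT-Neo, assuming the Hugging Face transformers library and the publicly released EleutherAI/gpt-neo-1.3B checkpoint (the prompt is just an example):

```python
# A minimal sketch: few-shot-style text generation with GPT-Neo,
# assuming the Hugging Face `transformers` library is installed.
from transformers import pipeline

# EleutherAI/gpt-neo-1.3B is one of the publicly released checkpoints.
generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")

prompt = "Translate English to French:\nsea otter => loutre de mer\ncheese =>"
outputs = generator(prompt, max_length=50, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```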

How to Utilize Python Machine Learning Models

Ever trained a new model and just wanted to use it through an API straight away? Sometimes you don't want to bother writing Flask code or containerizing your model and running it in Docker. If that sounds like you, you definitely want to check out MLServer. It's a Python-based inference server that recently went GA, and what's really neat about it is that it's a highly performant server designed for production environments. That means that, by serving models locally, you are running them in the exact same environment they will be in once they reach production.

This blog walks you through how to use MLServer by using a couple of image models as examples.
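To give a feel for what that looks like, here is a rough sketch of a custom MLServer runtime wrapping a scikit-learn model. The class name and artifact path are invented for illustration, and the exact API may vary between MLServer versions:

```python
# Hypothetical custom runtime for MLServer, wrapping a pickled
# scikit-learn model. Started locally with: mlserver start .
import joblib
import numpy as np

from mlserver import MLModel
from mlserver.codecs import decode_args


class MyImageClassifier(MLModel):
    async def load(self) -> bool:
        # `model.joblib` is a placeholder path for the trained model artifact.
        self._model = joblib.load("model.joblib")
        return True

    @decode_args
    async def predict(self, features: np.ndarray) -> np.ndarray:
        # MLServer decodes the V2 inference request into a NumPy array for us.
        return self._model.predict(features)
```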

The Three Must-Haves for Machine Learning Monitoring

Machine learning models are not static pieces of code but dynamic predictors that depend on data, hyperparameters, evaluation metrics, and many other variables, so it is vital to have insight into the training and deployment process to prevent model drift and predictive stasis. That said, not all monitoring solutions are created equal. These are the three must-haves for a machine learning monitoring tool, whether you decide to build or buy a solution.

Complete Process Visibility

Many applications involve multiple models working in tandem, and these models serve a higher business purpose that may be two or three steps downstream. Furthermore, a model's behavior will likely depend on data transformations that are multiple steps upstream. A simple monitoring system that focuses on single-model behavior therefore cannot capture the holistic picture of model performance in its global business context.

More profound knowledge of model viability only comes from complete process visibility: insight into the entire data flow, metadata, context, and overarching business processes on which the modeling is predicated.

For example, as part of a credit approval application, a bank may deploy a suite of models that assess creditworthiness, screen for potential fraud, and dynamically allocate trending offers and promos. A simple monitoring system might be able to evaluate any one of these models individually, but solving the overall business problem demands an understanding of the interplay between them. While they may have divergent modeling goals, each model rests on a shared foundation of training data, context, and business metadata.

An effective monitoring solution therefore takes these disparate pieces into account and generates unified insights that harness this shared information. These might include identifying niche and underutilized customer segments in the training data distribution, flagging potential instances of concept and data drift, understanding the aggregate model impact on business KPIs, and more. The best monitoring solutions also work not only on ML models but on generic tabular data, allowing the solution to be extended to all business use cases, not just those involving an ML component.
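As a concrete, if simplified, illustration of the drift-flagging piece mentioned above, a monitoring job might compare a feature's live distribution against its training distribution with a two-sample test. The sketch below uses SciPy's Kolmogorov-Smirnov test; the threshold and the synthetic data are assumptions for illustration only:

```python
# A simplified data-drift check: compare live feature values against the
# training distribution with a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

def drifted(train_values: np.ndarray, live_values: np.ndarray, alpha: float = 0.01) -> bool:
    statistic, p_value = ks_2samp(train_values, live_values)
    # A small p-value suggests the live distribution no longer matches training.
    return p_value < alpha

# Example: an income feature drifts upward in production.
rng = np.random.default_rng(0)
train = rng.normal(50_000, 10_000, size=10_000)
live = rng.normal(58_000, 10_000, size=1_000)
print(drifted(train, live))
```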

Using Unsupervised Learning to Combat Cyber Threats

As the world enters a digital age, cyber threats are on the rise: massive data breaches, hacks into personal and financial data, and attacks on any other digital source that can be exploited. To combat these attacks, security experts are increasingly tapping into AI to stay a step ahead, using every tool in their toolbox, including unsupervised learning methods.

Machine learning in the cybersecurity space is still in its infancy, but since 2020 there has been a lot of traction toward involving more AI in combating cyber threats.
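As one small example of what unsupervised learning can look like here, an anomaly detector such as scikit-learn's IsolationForest can flag unusual network events without any labeled attack data. The feature layout and values below are invented for illustration:

```python
# Unsupervised anomaly detection on (hypothetical) network-traffic features.
import numpy as np
from sklearn.ensemble import IsolationForest

# Columns might be: bytes sent, bytes received, session duration, port count.
normal_traffic = np.random.default_rng(42).normal(size=(5_000, 4))

detector = IsolationForest(contamination=0.01, random_state=0)
detector.fit(normal_traffic)

# predict() returns -1 for events the model considers anomalous, 1 for inliers.
suspicious_event = np.array([[15.0, 0.1, 40.0, 9.0]])
print(detector.predict(suspicious_event))
```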

The Best MLOps Events and Conferences for 2022


Introduction

2021 was, quite rightly, touted as “The Year of MLOps”. The MLOps scene exploded, with thousands of companies adopting practices and tools aimed at helping them get models into production faster and more efficiently. A multitude of new vendors, consultancies, and open-source tools entered the field, making it more important than ever to stay on top of what’s happening.

Throughout January I’ve been asking around to find out the best MLOps events people attended last year. There were loads of great suggestions to go through, but a handful kept coming up over and over again. I’ve combined those with my own experiences to create a list of the events and conferences you definitely don’t want to miss:

Deploying Serverless NER Transformer Model with AWS Lambda

Introduction

With transformers becoming essential for many NLP tasks thanks to their unmatched performance, various useful and impactful NLP models are created every day. However, many NLP practitioners find it challenging to deploy models into production. According to this report, 90% of machine learning models never make it into production.

Model deployment enables you to host your model in a server environment so it can be used to output predictions when called through an API, for example.
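A deployment like the one described in the article typically boils down to a small handler function. Here is a hedged sketch of an AWS Lambda handler wrapping a Hugging Face NER pipeline; the event shape and model choice are assumptions for illustration, not the article's exact code:

```python
# Hypothetical AWS Lambda handler serving a Hugging Face NER pipeline.
import json
from transformers import pipeline

# Loaded once per container, then reused across invocations.
ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

def lambda_handler(event, context):
    body = json.loads(event.get("body", "{}"))
    text = body.get("text", "")
    entities = ner(text)
    return {
        "statusCode": 200,
        "body": json.dumps([
            {"entity": e["entity_group"], "word": e["word"], "score": float(e["score"])}
            for e in entities
        ]),
    }
```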

What Is MLOps?

I recently started a new job at a Machine Learning startup. I’ve given up trying to explain what I do to non-technical friends and family (my mum still just tells people I work with computers). For those of you who at least understand that “AI” is just an overused marketing term for Machine Learning, I can break it down for you using the latest buzzword in the field:

MLOps

The term “MLOps” (a compound of Machine Learning and Operations) refers to the practice of deploying, managing, and monitoring machine learning models in production. It takes the best practices from the field of DevOps and utilizes them for the unique challenges that arise when running machine learning systems in production. 

Is AI Bias an Open-Ended Issue that needs an Unbiased Perspective?

As AI continues its ascent, certain elements of the field keep drawing justifiable criticism. Artificial Intelligence (AI), initially aimed at helping humans make fairer and more transparent calls, has progressively shown signs of bias and flawed decision making. But it isn't the technology itself that should be blamed; what undermines clarity is inadequate data extraction and contextualization techniques, something I shall cover at length later in this discussion.

How Is AI Bias Even a Thing?

 

Using Machine Learning to Detect Dupes: Some Real-Life Examples

As companies collect more and more data about their customers, an increasing amount of duplicate information starts appearing in the data as well, causing a lot of confusion among internal teams. Since it would be impossible to manually go through all of the data and delete the duplicates, companies have turned to machine learning solutions that perform such work for them. Today we would like to look at some interesting uses of machine learning to catch duplicates in all kinds of environments. Before we dive right in, let's take a look at how these machine learning systems work.

How Do Machine Learning Systems Identify Duplicates?

When a person looks at an image or two strings of data, it is fairly easy for them to determine whether or not the images or strings are duplicates. But how would you train a machine to spot such duplicates? Perhaps a good starting point would be to identify all of the similarities, but then you would need to explain exactly what 'similar' means. Are there gradations of similarity? To overcome such challenges, researchers use string metrics to train machine learning models.
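As a toy illustration of a string metric, Python's standard library can already score how "close" two records are. Real deduplication systems use more sophisticated metrics (edit distance, Jaro-Winkler, learned embeddings), but the idea is the same; the records below are invented examples:

```python
# A toy duplicate check using a built-in string similarity metric.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

record_a = "Jon Smith, 42 Baker Street, London"
record_b = "John Smith, 42 Baker St., London"

# A score near 1.0 suggests the records likely refer to the same customer.
print(round(similarity(record_a, record_b), 2))
```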

NLP Features That Are Criminally Overlooked: The Case for SAO

When reading about Natural Language Processing (NLP) applications, we inevitably encounter two main features in action, Categorization and Extraction, and learn how they can be applied in many different ways to effectively address use cases that involve free-form text and the retrieval of information from it. We also hear a lot about Sentiment (which technically is not a separate feature but rather a specialization of the previous two). Finally, we have POS-tagging, which is only occasionally mentioned outside of deeply technical articles for linguists and NLP professionals.

We don’t hear much about other NLP capabilities, mainly because, depending on how an NLP engine is designed, features beyond Categorization and Extraction are often not present at all. Specifically, many NLP solutions today are based on Machine Learning algorithms, and ML rarely delivers great accuracy on problems that require both elevated precision and very fine-grained identification. Yet some of these capabilities are incredibly useful, and sometimes they are the only way to address a particular challenge. In what follows, I make the case for one of these not-so-popular NLP tools: SAO (Subject-Action-Object).
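To make SAO concrete, here is a rough sketch of pulling subject-action-object triples out of text with spaCy's dependency parse. This is a rule-based toy under the assumption that the en_core_web_sm model is installed, not a production SAO engine:

```python
# A rough subject-action-object (SAO) extractor built on spaCy's
# dependency parse. Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_sao(text: str):
    triples = []
    for sent in nlp(text).sents:
        for token in sent:
            if token.pos_ == "VERB":
                subjects = [c for c in token.children if c.dep_ in ("nsubj", "nsubjpass")]
                objects = [c for c in token.children if c.dep_ in ("dobj", "obj")]
                for s in subjects:
                    for o in objects:
                        triples.append((s.text, token.lemma_, o.text))
    return triples

print(extract_sao("The regulator fined the bank after auditors flagged the transactions."))
# e.g. [('regulator', 'fine', 'bank'), ('auditors', 'flag', 'transactions')]
```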

Innovative Algorithms Are Assisting AI Systems To Escape From ‘Adversarial’ Attacks

Introduction

In this modern and innovative world, people tend to trust what they see, and in that case the role of artificial intelligence seems straightforward. Artificial intelligence is one of the most prominent data-driven technologies, emerging at a swift pace and reaching the whole world. It is no surprise that the market for artificial intelligence is growing dramatically and will reshape technological advancement in the years to come. In 2019, the market size of artificial intelligence was estimated at $27.23 billion, and it is projected to reach $266.92 billion by 2027.

Let us consider the collision avoidance system in a self-driving car. If the visual input from the on-board cameras is entirely trusted, an AI system can map that input directly to an appropriate action, i.e. steer left, steer right, or continue straight, in order to dodge any pedestrians the cameras notice on the road. But what if the camera is manipulated, or the image shifts by a few pixels? The car might take unnecessary and dangerous actions if it starts trusting adversarial inputs blindly.
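To ground what an "adversarial input" is, the classic fast gradient sign method (FGSM) perturbs an image by a tiny, almost invisible amount in the direction that most increases the model's loss. Below is a minimal PyTorch sketch of that idea; it is a generic illustration, not tied to any specific system discussed in the article:

```python
# Minimal FGSM-style adversarial perturbation in PyTorch.
import torch

def fgsm_perturb(model, loss_fn, image, label, epsilon=0.01):
    """Return an adversarial copy of `image`, shifted by at most epsilon per pixel."""
    image = image.clone().detach().requires_grad_(True)
    loss = loss_fn(model(image), label)
    loss.backward()
    # Step each pixel in the direction that increases the loss the most.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```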

GIS: A Game Changer For Telecom Providers

GIS can be used to great effect in the telecommunications industry, including designing and deploying efficient infrastructure to power 5G. Throughout the stages of development, GIS makes each step easier.

Every part of a telecom’s business involves location. All the vital information and key data points, including customer information, network ownership, weather forecasts, and competitor information, are often managed in separate systems. Using maps and location is often the most intuitive way to gain real operational awareness about where things are happening and how certain user and network behaviors relate to each other.

Why Python Is Best for Machine Learning

Today, most companies are using Python for AI and Machine Learning. With predictive analytics and pattern recognition becoming more popular than ever, Python development services are a priority for high-scale enterprises and startups. Python developers are in high demand, mostly because of what can be achieved with the language. AI programming languages need to be powerful, scalable, and readable. Python code delivers on all three.

While there are other technology stacks available for AI-based projects, Python has turned out to be the best programming language for this purpose. It offers great libraries and frameworks for AI and Machine Learning (ML), as well as computational capabilities, statistical calculations, scientific computing, and much more. 

5 Papers on Product Classification Every Data Scientist Should Read

Product categorization/product classification is the organization of products into their respective departments or categories. A large part of the process is the design of the product taxonomy as a whole. 

Product categorization was initially a text classification task that analyzed the product’s title to choose the appropriate category. However, numerous methods have been developed which take into account the product title, description, images, and other available metadata. 
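As a reference point for that title-only baseline, a simple pipeline like the sketch below is often where teams start before moving to the multimodal methods these papers describe. The titles, categories, and query are invented examples:

```python
# A bare-bones title-based product classifier: TF-IDF features plus
# logistic regression. Titles and categories here are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

titles = [
    "men's running shoes size 10",
    "stainless steel chef knife 8 inch",
    "wireless bluetooth headphones",
    "women's leather ankle boots",
]
categories = ["Shoes", "Kitchen", "Electronics", "Shoes"]

classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
classifier.fit(titles, categories)
print(classifier.predict(["noise cancelling over-ear headphones"]))
```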

The following papers on product categorization represent essential reading in the field and offer novel approaches to product classification tasks.

1. Don’t Classify, Translate

In this paper, researchers from the National University of Singapore and the Rakuten Institute of Technology propose and explain a novel machine translation approach to product categorization. The experiment uses the Rakuten Data Challenge and Rakuten Ichiba datasets. 

Their method translates or converts a product’s description into a sequence of tokens which represent a root-to-leaf path to the correct category. Using this method, they are also able to propose meaningful new paths in the taxonomy.

The researchers state that their method outperforms many of the existing classification algorithms commonly used in machine learning today.
  • Published/Last Updated – Dec. 14, 2018
  • Authors and Contributors – Maggie Yundi Li (National University of Singapore), Stanley Kok (National University of Singapore), and Liling Tan (Rakuten Institute of Technology)

2. Large-Scale Categorization of Japanese Product Titles Using Neural Attention Models

The authors of this paper propose attention convolutional neural network (ACNN) models over baseline convolutional neural network (CNN) models and gradient boosted tree (GBT) classifiers. 

The study uses Japanese product titles taken from Rakuten Ichiba as training data. Using this data, the authors compare the performance of the three methods (ACNN, CNN, and GBT) for large-scale product categorization. 

While the differences in accuracy between methods can be less than 5%, even minor improvements in accuracy can result in millions of additional correct categorizations. 

Lastly, the authors explain how an ensemble of ACNN and GBT models can further minimize false categorizations.


  • Published/Last Updated – April, 2017 for EACL 2017
  • Authors and Contributors – From the Rakuten Institute of Technology: Yandi Xia, Aaron Levine, Pradipto Das, Giuseppe Di Fabbrizio, Keiji Shinzato, and Ankur Datta

3. Atlas: A Dataset and Benchmark for Ecommerce Clothing Product Classification

Researchers at the University of Colorado and Ericsson Research (Chennai, India) have created a large product dataset known as Atlas. In this paper, the team presents its dataset, which includes over 186,000 images of clothing products along with their product titles. 


Deep Dive Into Join Execution in Apache Spark

Join operations are often used in a typical data analytics flow in order to correlate two data sets. Apache Spark, being a unified analytics engine, has also provided a solid foundation to execute a wide variety of Join scenarios.

At a very high level, a Join operates on two input data sets: each data record belonging to one input data set is matched against the data records belonging to the other. On finding a match or a non-match (according to a given condition), the Join operation outputs either an individual matched record from one of the two data sets or a joined record, which combines the matched records from both data sets.
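In PySpark terms, the description above corresponds to something like the following small, self-contained example with made-up data:

```python
# A small PySpark join example: match orders to customers on customer_id.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("join-example").getOrCreate()

customers = spark.createDataFrame(
    [(1, "Alice"), (2, "Bob")], ["customer_id", "name"])
orders = spark.createDataFrame(
    [(101, 1, 250.0), (102, 3, 80.0)], ["order_id", "customer_id", "amount"])

# Inner join keeps only matched records; a left join would also keep
# unmatched orders, with nulls for the customer columns.
joined = orders.join(customers, on="customer_id", how="inner")
joined.show()
```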

Computational Needs for Computer Vision (CV) in AI and ML Systems

Common Challenges Associated With CV Systems Employing ML Algorithms

Computer vision (CV) is a major task for modern Artificial Intelligence (AI) and Machine Learning (ML) systems. It’s accelerating nearly every domain in the tech industry, enabling organizations to revolutionize the way machines and business systems work.

Academically, it is a well-established area of computer science, and many decades’ worth of research has gone into the field. However, the recent use of deep neural networks has revolutionized CV and given it new oxygen.

Making the Transition from Software Engineer to Artificial Intelligence Engineer

Artificial intelligence (AI) technology has been around for decades. However, we didn’t really grasp its potential until about a decade ago. Since then, demand for AI engineers has grown exponentially. 

As the ongoing tech talent shortage shows no signs of improving, it has provided software engineers (who are also in high demand) an opportunity to make the transition and fill the talent gap. However, learning AI, Machine Learning (ML), and Natural Language Processing (NLP) isn’t a walk in the park.