Getting Started With Large Language Models

Large language models (LLMs) have emerged as transformative tools, unraveling the complexities of natural language understanding and paving the way for modern applications. The primary purpose of this Refcard is to provide an end-to-end understanding of LLM architecture, training methodologies, and the applications of these advanced artificial intelligence models in natural language processing. Offering an introduction and practical insights on how to navigate the intricacies of harnessing LLMs, this Refcard serves as a comprehensive guide for both novices and seasoned practitioners seeking to unlock the capabilities of these powerful language models.

How to Use Hugging Face Models for NLP, Audio Classification, and Computer Vision

Those who have spent any time studying models and frameworks for audio classification, NLP, or computer vision are likely wondering how to use Hugging Face for some of these models. Hugging Face is a platform that serves both as a community for those working with data models and as a hub for data science models and information.

When using Hugging Face for NLP, audio classification, or computer vision, users will need to know what Hugging Face has to offer for each project type compared to other options. Users will also need a deeper understanding of what a Hugging Face model is and how to use Hugging Face for their own data science projects.
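As a taste of how little code the platform requires, here is a minimal, hedged sketch that uses the transformers library's pipeline API for all three project types; it assumes transformers (plus its audio and vision extras) is installed, downloads default models on first use, and the file paths are purely illustrative placeholders.

```python
# A minimal sketch of Hugging Face pipelines for text, audio, and images.
# Assumes `pip install transformers` plus the audio/vision extras; the
# file paths below are hypothetical placeholders.
from transformers import pipeline

# NLP: the default "sentiment-analysis" pipeline loads a text classifier.
text_classifier = pipeline("sentiment-analysis")
print(text_classifier("Hugging Face makes sharing models remarkably easy."))

# Audio classification: pass the path to a local audio file.
audio_classifier = pipeline("audio-classification")
print(audio_classifier("speech_sample.wav"))

# Computer vision: image classification from a local file or URL.
image_classifier = pipeline("image-classification")
print(image_classifier("cat_photo.jpg"))
```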

The 5 Healthcare AI Trends Technologists Need to Know

Healthcare has been at the epicenter of everything we do for two years. While the pandemic has been a significant driver of the conversation, healthcare technology—artificial intelligence (AI) specifically—has been experiencing explosive growth. One only needs to look at the funding landscape: more than 40 startups have raised at least $20 million in funding specifically to build AI solutions for healthcare applications.

But what’s driving this growth? The venture capital trail alone won’t help us understand the trends contributing to AI adoption in healthcare. But the “2022 AI in Healthcare Survey” will. For the second year, Gradient Flow and John Snow Labs asked 300 global respondents what they’re experiencing in their AI programs—from the individuals using them to the challenges and the criteria used to build solutions and validate models. These are the top five trends that emerged from the research. 

NLP Chatbot Resiliency: A Chat With Botpress

In the race to design great conversational experiences, adaptable NLU models will play a key role in the creation of truly intelligent chatbots. In this article, learn how Botpress stemmed from frustrations with poorly designed bots, which led to the launch of an open-source managed NLU platform. Also, see why the future of chatbot design will shift from an intents-based approach toward knowledge-based models that offer greater adaptability and resiliency.

Developer Accessibility Key to NLP Chatbot Advancement

Chatbots have come a long way over the years, evolving from simple command-response models to the more nuanced NLP conversational models of today. 

Interpretable and Explainable NER With LIME

While a lot of progress has been made to develop the latest, greatest, state-of-the-art deep learning models with a gazillion parameters, very little effort has been devoted to explaining the output of these models.

During a workshop in December 2020, Abubakar Abid, CEO of Gradio, examined the way GPT-3 generates text about religions by using the prompt, "Two _ walk into a." Upon observing the first 10 responses for various religions, he found that GPT-3 mentioned violence once each for Jews, Buddhists, and Sikhs, twice for Christians, but nine out of ten times for Muslims.
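To make the explainability idea concrete, the sketch below shows the general LIME pattern on a toy scikit-learn text classifier rather than a full NER model; the training sentences, labels, and feature counts are invented for illustration and are not taken from the article.

```python
# A rough sketch of the LIME pattern on a toy text classifier.
# Requires `pip install lime scikit-learn`; data and labels are invented.
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["the service was excellent", "terrible support and slow replies",
         "great product, friendly staff", "awful experience, never again"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

explainer = LimeTextExplainer(class_names=["negative", "positive"])
explanation = explainer.explain_instance(
    "the staff was friendly but the replies were slow",
    model.predict_proba,   # LIME perturbs the text and queries this function
    num_features=4,
)
print(explanation.as_list())  # per-word contributions to the prediction
```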

A Quick Word on Hybrid AI in Natural Language Processing

Any solution aimed at processing unstructured data (i.e., language, specifically text in most cases) is today based on one of two main approaches: Machine Learning and Symbolic. Both can be delivered in multiple ways (different algorithms in the case of ML; anything from shallow linguistics to semantic technology in the case of Symbolic), but not much has been done so far in the realm of hybrid approaches. While choosing one over the other will always be a compromise between advantages and drawbacks (higher accuracy from Symbolic, more flexibility from ML), Hybrid AI — or Hybrid NL — is a revolutionary path to solving linguistic challenges that leverages the best of both worlds and, ultimately, lets your NLP practices graduate to NLU (Natural Language Understanding). I won't spend time explaining how ML or Symbolic work, since there's a ton of literature about that already; I'll focus this page on Hybrid instead.

What is Going Hybrid?

To frame this conversation in a practical fashion, we must look at two aspects: development and workflow. At the development stage, going hybrid means that a Symbolic solution supports the creation of a Machine Learning model in order to either reduce the effort or enhance its quality. At the production stage, on the other hand, the workflow can be supported by both ML and Symbolic to deliver a more precise outcome. In a project that considers the Machine Learning piece the pivot of the solution, the first type of integration places Symbolic at the top (before the Machine Learning model is even created), and the second places it at the bottom (curating or enhancing the final output). Naturally, both of these hybrid approaches can be present at the same time in a linguistic project, as the sketch below illustrates.
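The following is a hypothetical sketch of those two integration points in miniature: hand-written rules bootstrap labels for a Machine Learning model at development time, and the same rules curate the model's predictions in production. The rules, categories, and texts are invented for illustration only.

```python
# A hypothetical hybrid pipeline: symbolic rules + an ML classifier.
# Requires scikit-learn; rules and sample texts are invented.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

RULES = {
    "invoice": re.compile(r"\b(invoice|billing)\b", re.I),
    "support": re.compile(r"\b(error|crash|bug)\b", re.I),
}

def symbolic_label(text):
    """Development-stage integration: rules bootstrap training labels."""
    for label, pattern in RULES.items():
        if pattern.search(text):
            return label
    return None

raw_texts = ["Please resend the invoice for March",
             "The app shows an error on login",
             "Billing amount looks wrong",
             "Found a bug in the export feature"]
labeled = [(t, symbolic_label(t)) for t in raw_texts if symbolic_label(t)]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit([t for t, _ in labeled], [y for _, y in labeled])

def hybrid_predict(text):
    """Production-stage integration: a rule match curates the ML output."""
    rule_hit = symbolic_label(text)
    return rule_hit if rule_hit else model.predict([text])[0]

print(hybrid_predict("My invoice total seems incorrect"))
```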

Text Preprocessing Methods for Deep Learning

Deep Learning, particularly for Natural Language Processing (NLP), has been attracting huge interest lately. Some time ago, there was an NLP competition on Kaggle called the Quora Question Insincerity challenge. The competition is a text classification problem, and it becomes much easier to understand after working through the competition, as well as by going through the invaluable kernels put up by the Kaggle experts.

First, let’s start by explaining a little more about the text classification problem in the competition. 
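Before diving into the competition itself, here is a hedged sketch of the kind of cleaning steps that typically appear in such kernels: lowercasing, expanding contractions, isolating punctuation, and normalizing numbers. The specific mappings below are small illustrative subsets, not the competition kernels' full lists.

```python
# A minimal sketch of common text-cleaning steps for deep learning models.
# The contraction map and punctuation set are small illustrative subsets.
import re

CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is"}
PUNCT = set("!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~")

def preprocess(text: str) -> str:
    text = text.lower()
    for short, full in CONTRACTIONS.items():   # expand contractions
        text = text.replace(short, full)
    # Surround punctuation with spaces so the tokenizer sees it separately.
    text = "".join(f" {c} " if c in PUNCT else c for c in text)
    text = re.sub(r"\d+", " # ", text)         # replace digits with a placeholder
    return re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace

print(preprocess("Won't this question, with 2 typos, get flagged?!"))
```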

Getting Started With Robotic Process Automation

Technologies such as artificial intelligence (AI), machine learning (ML), and natural language processing (NLP) have led the way to software robots that reduce the manual, time-consuming, and repetitive actions performed on digital platforms. The concept of automating tasks on digital platforms is called robotic process automation (RPA). An RPA robot is a software robot that interacts with computer-centric processes, aiming to introduce a digital workforce that performs repetitive tasks previously completed by humans. This Refcard introduces RPA technology, how it works, its key components, and how to set up your environment.

Build a Plagiarism Checker Using Machine Learning

Plagiarism is rampant on the internet and in the classroom. With so much content out there, it’s sometimes hard to know when something has been plagiarized. Authors writing blog posts may want to check if someone has stolen their work and posted it elsewhere. Teachers may want to check students’ papers against other scholarly articles for copied work. News outlets may want to check if a content farm has stolen their news articles and claimed the content as its own.

So, how do we guard against plagiarism? Wouldn’t it be nice if we could have software do the heavy lifting for us? Using machine learning, we can build our own plagiarism checker that searches a vast database for stolen content. In this article, we’ll do exactly that.
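Before getting into the article's approach, here is one simplified way to think about the core comparison step: represent each document as a TF-IDF vector and flag submissions whose cosine similarity to any known source exceeds a threshold. The corpus, submission, and threshold below are invented, and the article's own implementation may use a different technique (such as learned embeddings).

```python
# A simplified plagiarism-style comparison using TF-IDF cosine similarity.
# Requires scikit-learn; corpus, submission, and threshold are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "Machine learning lets computers learn patterns from data.",
    "Plagiarism detection compares new text against existing sources.",
]
submission = "Plagiarism detection works by comparing new text with existing sources."

vectorizer = TfidfVectorizer().fit(corpus + [submission])
corpus_vectors = vectorizer.transform(corpus)
submission_vector = vectorizer.transform([submission])

scores = cosine_similarity(submission_vector, corpus_vectors)[0]
for document, score in zip(corpus, scores):
    flag = "POSSIBLE MATCH" if score > 0.5 else "ok"
    print(f"{score:.2f}  {flag}  {document}")
```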

Exploring BERT Language Framework for NLP Tasks

NLP is one of the most crucial components for structuring a language-focused AI program, for example, the chatbots that readily assist visitors on websites and AI-based voice assistants (VAs). NLP, as a subset of AI, enables machines to understand written language and interpret the intent behind it by various means. A host of other tasks is being added via NLP, such as sentiment analysis, text classification, text extraction, text summarization, speech recognition, auto-correction, etc.

However, NLP is being explored for many more tasks. There have been many advancements lately in the fields of NLP and NLU (natural language understanding), which are being applied to many analytics and modern BI platforms. Advanced applications are using ML algorithms with NLP to perform complex tasks by analyzing and interpreting a variety of content.

Boosted Embeddings With CatBoost

Introduction

When working with a large amount of data, it becomes necessary to compress the feature space into vectors. An example is text embeddings, which are an integral part of almost any NLP model creation process. Unfortunately, it is not always possible to use neural networks to work with this type of data — the reason may be, for example, low fitting or inference speed.

I want to suggest an interesting way to use gradient boosting that few people know about.
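As a rough illustration of the idea (not necessarily the exact recipe the article uses), the sketch below first compresses TF-IDF text features into a small dense embedding and then feeds those vectors to a CatBoost classifier as ordinary numeric features. The texts, labels, and dimensionality are invented.

```python
# A rough sketch: dense text embeddings as inputs to gradient boosting.
# Requires `pip install catboost scikit-learn`; data and labels are invented.
from catboost import CatBoostClassifier
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

texts = ["refund my order please", "loved the fast delivery",
         "item arrived broken", "great quality, will buy again"]
labels = [0, 1, 0, 1]  # 0 = complaint, 1 = praise (invented)

# Compress sparse TF-IDF vectors into a small dense embedding.
tfidf = TfidfVectorizer().fit_transform(texts)
embeddings = TruncatedSVD(n_components=3).fit_transform(tfidf)

# The boosted trees consume the embedding dimensions as plain numeric features.
model = CatBoostClassifier(iterations=50, verbose=False)
model.fit(embeddings, labels)
print(model.predict(embeddings))
```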

Top 3 NLP Use Cases for ITSM

What is NLP?

Natural Language Processing is a specialized subdomain of Machine Learning that is generally concerned with the interactions between humans and machines using human verbal or written language.

NLP helps in processing huge volumes of text that would otherwise take a human a significant amount of time to comprehend and process. Hence, a lot of organizations take advantage of NLP to gain useful insights out of their text and free-form data.

How to Fine-Tune BERT Transformer With spaCy v3.0

Since the seminal paper “Attention Is All You Need” by Vaswani et al., transformer models have become by far the state of the art in NLP technology. With use cases ranging from NER and text classification to question answering and text generation, the applications of this amazing technology are limitless.

More specifically, BERT — which stands for Bidirectional Encoder Representations from Transformers — leverages the transformer architecture in a novel way. For example, BERT analyzes both sides of a sentence around a randomly masked word in order to predict it. In addition to predicting the masked token, BERT predicts the order of sentences: a classification token [CLS] is added at the beginning of the first sentence, a separation token [SEP] is placed between the two sentences, and the model tries to predict whether the second sentence follows the first one.
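To see the masked-word objective in action, here is a small, hedged sketch using the transformers fill-mask pipeline with a standard BERT checkpoint; it assumes transformers is installed, downloads bert-base-uncased on first use, and illustrates the idea rather than the spaCy fine-tuning workflow the article covers.

```python
# A small sketch of BERT's masked-word prediction with Hugging Face transformers.
# Assumes `pip install transformers`; the model downloads on first use.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT scores candidates for [MASK] using the words on both sides of it.
for candidate in fill_mask("The capital of France is [MASK]."):
    print(f"{candidate['token_str']:>10}  {candidate['score']:.3f}")
```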

How to Spellcheck Words and Sentences in Java

In our technology-driven world, electronic communication is increasingly overshadowing verbal communication. Whether we are filling out an online form, sending a text on our phone, or writing an email, it is a fact that many of our business and personal interactions require efficiently written (typed) language. Due to this heavy reliance on electronic communication, it is critical to ensure your online platform has a built-in support system to account for human error; if your application or website allows input or search queries from users, you run the risk of your systems not understanding the text due to spelling errors — and this is where spellcheck comes in.

When you stop and consider how many times you encounter spellcheck in your electronic interactions, it should become clear that it has created a huge failsafe for our often rushed and impatient natures. Spellcheck has come a long way since its beginnings; the first spellcheckers simply verified words instead of suggesting corrections. Fast forward to our current era and spellcheckers have improved in both functionality and efficiency; they operate in the background of our applications and let us know with a red line that we have made a potential error. This is often accomplished with Natural Language Processing (NLP) which, as we have discussed previously, enables computers to process and interpret human language in the form of text or audio data. 

5 Great Ways To Achieve Complete Automation With AI and ML

Introduction

Automation in the testing domain has evolved a lot, particularly when it comes to Artificial Intelligence and Machine Learning. Self-driving cars, bots, and the famous Amazon-owned product, Alexa, are some basic examples of how AI and ML have influenced our lives and day-to-day activities. With updated application software and devices making users' lives easier than ever, the demand for product quality has increased. Customers are becoming intolerant of product defects, given the number of alternatives available for them to switch to in the market. The statistic mentioned below speaks to the loyalty a customer can show toward a particular product or service from a company.

"91% of non-complainers just leave and 13% of them tell 15 more people about their bad experience for a product." 

How to Extract Sentences and Entities From a String in Java

In this article, we will be discussing more great ways to utilize Natural Language Processing. As we have discussed in previous articles, natural language processing combines linguistics and artificial intelligence to perform large amounts of natural language data analysis. Essentially, this technology can simplify the scanning of content by categorizing and organizing it through machine learning. While these rules were formerly coded by hand, automatic learning has improved the process by leveraging statistical inference algorithms to produce models that can process unfamiliar or inaccurate information.

The two tasks that we will be covering today are how to extract sentences and how to extract entities from a string in Java. Extracting sentences from a string can be an incredibly time-consuming operation if you’re trying to parse chunks of text, but with the help of an NLP API, it becomes a quick and easy step. The API will scan the input string and return the separated sentences as individual strings, instantly making the text more readable for you or your customers.