Filiberto Emanuele | The Blog Pros

Any solution aimed at processing unstructured data (i.e., language, and in most cases specifically text) is today based on one of two main approaches: Machine Learning and Symbolic. Both can be delivered in multiple ways (different algorithms in the case of ML, anything from shallow linguistics to semantic technology in the case of Symbolic), but not much has been done so far in the realm of hybrid approaches. Choosing one over the other always means a compromise between advantages and drawbacks (higher accuracy from Symbolic, more flexibility from ML). Hybrid AI, or Hybrid NL, is a revolutionary path to solving linguistic challenges that leverages the best of both worlds and, ultimately, lets your NLP practices graduate to NLU (Natural Language Understanding). I won’t spend time explaining how ML or Symbolic work, since there’s a ton of literature about that already; I’ll focus this page on Hybrid instead.

What is Going Hybrid?

To frame this conversation in a practical fashion, we must look at two aspects: development and workflow. At the development stage, going hybrid means that a Symbolic solution supports the creation of a Machine Learning model, either to reduce the effort involved or to enhance the model's quality. At the production stage, on the other hand, the workflow can be supported by both ML and Symbolic to deliver a more precise outcome. In a project where the Machine Learning piece is the pivot of the solution, the first type of integration places Symbolic at the top (before a Machine Learning model is even created), and the second places it at the bottom (curating or enhancing the final output). Naturally, both of these hybrid approaches can be present at the same time in a linguistic project.
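To make the two integration points concrete, here is a minimal Python sketch, not a reference implementation. It assumes a toy ticket-routing task: the keyword rules in RULES, the labels "billing" and "technical", and the function names are all hypothetical, and a small hand-written Naive Bayes classifier stands in for whatever ML model a real project would use. At the top, the rules label raw text to build a training set; at the bottom, they curate the model's output.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical high-precision symbolic rules: keyword -> label.
RULES = {"refund": "billing", "invoice": "billing",
         "crash": "technical", "error": "technical"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def symbolic_label(text):
    """Return a label only when the rules agree on exactly one; else None."""
    labels = {RULES[t] for t in tokenize(text) if t in RULES}
    return labels.pop() if len(labels) == 1 else None


def build_training_set(unlabeled):
    """Development stage: Symbolic labels raw text so ML has data to learn from."""
    pairs = [(t, symbolic_label(t)) for t in unlabeled]
    texts = [t for t, label in pairs if label is not None]
    labels = [label for _, label in pairs if label is not None]
    return texts, labels


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (the ML stand-in)."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total = sum(self.label_counts.values())
        scores = {}
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            scores[label] = math.log(n / total) + sum(
                math.log((counts[t] + 1) / denom) for t in tokens)
        return max(scores, key=scores.get)


def hybrid_predict(model, text):
    """Production stage: a rule that fires overrides the ML prediction."""
    rule_label = symbolic_label(text)
    return rule_label if rule_label is not None else model.predict(text)


if __name__ == "__main__":
    raw = ["I want a refund for my order",
           "please send the invoice for my payment",
           "the app shows an error on startup",
           "crash after the latest update",
           "hello there"]
    texts, labels = build_training_set(raw)
    model = NaiveBayes().fit(texts, labels)
    print(hybrid_predict(model, "my payment did not go through"))
```

In this sketch the rules always win when they fire, which reflects the accuracy-versus-flexibility trade-off discussed above. A real project might instead combine the two signals, for example by letting the rules override only low-confidence ML predictions.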

When reading about Natural Language Processing (NLP) applications, we inevitably encounter two main features in action, Categorization and Extraction, and learn how they can be combined in many different ways to address use cases that involve free-form text and the retrieval of information from it. We also hear a lot about Sentiment (which technically is not a separate feature but rather a specialization of the previous two). Finally, there is POS-tagging, which is only occasionally mentioned outside of deeply technical articles for linguists and NLP professionals.

We don’t hear much about other NLP capabilities, mainly because, depending on how an NLP engine is designed, features beyond Categorization and Extraction are often not present at all. Specifically, many NLP solutions today are based on Machine Learning algorithms, and ML rarely delivers great accuracy on problems that require both high precision and very fine-grained identification. Then again, some of these capabilities are incredibly useful, and sometimes they are the only way to address a particular challenge. Below, I make the case for one of these not-so-popular NLP tools: SAO (Subject-Action-Object).

Author: Filiberto Emanuele

A Quick Word on Hybrid AI in Natural Language Processing

NLP Features That Are Criminally Overlooked: The Case for SAO