How to Evaluate MLOps Platforms

Companies that have pioneered the application of AI at scale did so using their own in-house ML platforms (uber, LinkedIn, Facebook, Airbnb). Many vendors are now making these capabilities available to purchase as off-the-shelf products. There's also a range of open-source tools addressing MLOps. The rush to the space has created a new problem — too much choice. There are now hundreds of tools and at least 40 platforms available:

(Timeline image from Thoughtworks Guide to Evaluating MLOps Platforms.)

Movie Recommendations With Spark Collaborative Filtering

Collaborative filtering (CF)[1] based on the alternating least squares (ALS) technique[2] is another algorithm used to generate recommendations. It produces automatic predictions (filtering) about the interests of a user by collecting preferences from many other users (collaborating). The underlying assumption of the collaborative filtering approach is that if a person A has the same opinion as a person B on an issue, A is more likely to have B's opinion on a different issue than a randomly chosen person. This algorithm gained a lot of traction in the data science community after it was used by the team winner of the Netflix Prize.

The CF algorithm has also been implemented in Spark MLlib[3] with the aim to address fast execution on very large datasets. KNIME Analytics Platform with its Big Data Extensions offers it in the Spark Collaborative Filtering node. We will use it here to recommend movies to a new user within a KNIME implementation of the collaborative filtering solution provided in the Infofarm blog post[4].