Ever trained a new model and just wanted to use it through an API straight away? Sometimes you don't want to bother writing Flask code or containerizing your model and running it in Docker. If that sounds like you, you definitely want to check out MLServer. It's a Python-based inference server that recently went GA, and what's really neat about it is that it's a highly performant server designed for production environments. That means that, by serving models locally, you run them in the exact same environment they will be in when they get to production.
This blog post walks you through how to use MLServer, using a couple of image models as examples.