Python, NoSQL and FastAPI Tutorial: Web Scraping on a Schedule

Can there be other use cases for Cassandra beyond messaging and chat? In this tutorial, we show you how to web scrape on a schedule by integrating the Python framework called FastAPI with Astra DB, a serverless, managed Database-as-a-Service built on Cassandra.

Recently, I caught up with the Pythonic YouTuber Justin Mitchell from CodingEntrepreneurs and we discussed how today’s apps are tackling global markets and issues. He pointed out that Discord was handling 120 million messages a day with only four backend engineers, and that was back in 2017.

How To Perform FastAPI Path Parameter Validation Using Path Function

In this post, we will learn how to validate path parameters in FastAPI. Specifically, we will use the Path() function to declare the validations rather than writing the checks by hand.

Using the Path function is quite similar to using the Query function: both let us declare validations and metadata for a parameter. Please refer to the post on FastAPI Query Parameter validation to learn more about Query.

Why Choose FastAPI over Flask?

To help you quickly get started with Milvus, the open-source vector database, we released another affiliated open-source project, Milvus Bootcamp on GitHub. The Milvus Bootcamp not only provides scripts and data for benchmark tests, but also includes projects that use Milvus to build some MVPs (minimum viable products), such as a reverse image search system, a video analysis system, a QA chatbot, or a recommender system. You can learn how to apply vector similarity search in a world full of unstructured data and get some hands-on experience in Milvus Bootcamp.

We provide both front-end and back-end services for the projects in Milvus Bootcamp. However, we recently decided to switch the web framework used in Milvus Bootcamp from Flask to FastAPI. This article explains the motivation behind that switch by clarifying why we chose FastAPI over Flask.

Web Frameworks for Python

A web framework is a collection of packages or modules that provides a software architecture for web development. It lets you write web applications or services without handling low-level details such as protocols, sockets, or process and thread management. Using a web framework can significantly reduce the workload of developing web applications, because you can simply plug your code into the framework and give little extra attention to concerns such as data caching, database access, and data security verification. For more information about what a web framework for Python is, see Web Frameworks.

There are various types of Python web frameworks. The mainstream ones include Django, Flask, Tornado, and FastAPI.
  • Flask

Flask is a lightweight micro-framework for Python, with a simple and easy-to-use core on which you can develop your own web applications. The Flask core is also extensible, so you can add functionality on demand to meet the specific needs of your web application. That is to say, with the wide range of extensions available for Flask, you can develop powerful websites.

Flask has the following characteristics:


  1. Flask is a microframework that does not depend on particular tools or third-party libraries to provide shared functionality. It has no built-in database abstraction layer or form validation. However, Flask is highly extensible, and extensions can add application functionality as if it were implemented in Flask itself. Relevant extensions include object-relational mappers, form validation, upload handling, open authentication technologies, and some common tools designed for web frameworks.
  2. Flask is a web application framework based on WSGI (Web Server Gateway Interface). WSGI is a simple interface, defined for the Python language, that connects a web server with a web application or framework.
  3. Flask builds on two core libraries, Werkzeug and Jinja2. Werkzeug is a WSGI toolkit that implements request and response objects along with utility functions, which allows you to build web frameworks on top of it. Jinja2 is a popular full-featured templating engine for Python. It has full support for Unicode and an optional integrated sandboxed execution environment, and it is widely used.
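To make the WSGI interface mentioned in the list above concrete, here is a minimal sketch of a bare WSGI application using only the Python standard library. It is not Flask code. The names `application` and `serve`, the greeting text, and the port are assumptions for illustration.

```python
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """A bare WSGI application: a callable taking environ and start_response."""
    # environ is a dict describing the request; PATH_INFO holds the URL path.
    body = f"Hello from {environ.get('PATH_INFO', '/')}".encode("utf-8")
    # start_response receives the status line and a list of header tuples.
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    # The return value is an iterable of byte strings forming the body.
    return [body]


def serve(host="127.0.0.1", port=8000):
    """Hypothetical helper: run the app on the standard library's reference server."""
    with make_server(host, port, application) as server:
        server.serve_forever()
```

Calling `serve()` starts a development server. A framework such as Flask exposes a callable of this shape, which is what allows any WSGI-compliant server to host it.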
  • FastAPI

FastAPI is a modern Python web application framework whose performance is described in its documentation as on par with NodeJS and Go. The core of FastAPI is based on Starlette and Pydantic. Starlette is a lightweight ASGI (Asynchronous Server Gateway Interface) framework and toolkit for building high-performance asyncio services. Pydantic is a library for data validation, serialization, and documentation based on Python type hints.
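As a rough illustration of the ASGI interface that Starlette, and therefore FastAPI, builds on, the sketch below shows a bare ASGI application written with the standard library only. It is not FastAPI or Starlette code. The names `app` and `call_app` and the greeting text are assumptions, and `call_app` is a hypothetical helper that drives the application in-process instead of through a real server.

```python
import asyncio


async def app(scope, receive, send):
    """A bare ASGI application: an async callable taking scope, receive, send."""
    if scope["type"] != "http":
        return  # ignore non-HTTP scopes such as lifespan in this sketch
    body = f"Hello from {scope['path']}".encode("utf-8")
    # An HTTP response is sent as two messages: the start, then the body.
    await send(
        {
            "type": "http.response.start",
            "status": 200,
            "headers": [
                (b"content-type", b"text/plain; charset=utf-8"),
                (b"content-length", str(len(body)).encode("ascii")),
            ],
        }
    )
    await send({"type": "http.response.body", "body": body})


async def call_app(path):
    """Hypothetical helper: call the app without a server and collect its messages."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent
```

Running `asyncio.run(call_app("/ping"))` returns the two response messages. Because the application is a coroutine, an ASGI server can interleave many requests on one event loop while each waits on I/O.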

FastAPI has the following characteristics:


Interview With FastAPI Creator – Sebastián Ramírez

Here is a quick recap of the knowledge shared by Sebastián Ramírez, an open-source enthusiast and the creator of FastAPI, Typer, and SQLModel. He has been building products and custom solutions for data and machine learning systems and has led teams of developers around the world. 

Hope you enjoy the interview! Let's go.

The Interview

Question: You have an impressive array of interests: frontend development, backend development, DevOps. What do you think about the "full-stack developer" concept, chased after by most tech companies? Is it a reasonable goal for most developers to pursue, or does it have any downsides?

Answer: Thanks! Yes, I’ve had a lot of interests while solving different problems, and I ended up learning several things from different sub-fields (e.g. backend, frontend, machine learning).

RabbitMQ RPC With FastAPI [Video]

Below, I explain a sample app with a FastAPI endpoint. RabbitMQ is used to deliver and return messages between the API endpoint and the backend. The backend code could run on a different microservice, and multiple backends can be started for scalability.
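The request and reply flow described above can be sketched without a broker. The example below is not RabbitMQ code: it swaps in `asyncio.Queue` objects as stand-ins for the request and reply queues so that it runs anywhere, and it only illustrates the RPC pattern of tagging each request with a correlation id and a reply queue. All names (`backend_worker`, `rpc_call`, `demo`) and the uppercase "work" are assumptions for illustration.

```python
import asyncio
import uuid


async def backend_worker(request_queue):
    """Stand-in for a backend service consuming requests from a broker queue."""
    while True:
        message = await request_queue.get()
        if message is None:  # sentinel used in this sketch to stop the worker
            break
        result = message["body"].upper()  # placeholder for real backend work
        # Reply on the queue named by the caller, echoing the correlation id.
        await message["reply_to"].put(
            {"correlation_id": message["correlation_id"], "body": result}
        )


async def rpc_call(request_queue, body):
    """What an API endpoint would do: publish a request, then await the reply."""
    reply_queue = asyncio.Queue()
    correlation_id = str(uuid.uuid4())
    await request_queue.put(
        {"correlation_id": correlation_id, "reply_to": reply_queue, "body": body}
    )
    reply = await asyncio.wait_for(reply_queue.get(), timeout=5)
    if reply["correlation_id"] != correlation_id:
        raise RuntimeError("reply does not match the request")
    return reply["body"]


async def demo():
    """Start one worker, make one call, then shut the worker down."""
    request_queue = asyncio.Queue()
    worker = asyncio.create_task(backend_worker(request_queue))
    result = await rpc_call(request_queue, "ping")
    await request_queue.put(None)
    await worker
    return result
```

With a real broker, the worker would live in a separate process or service, and several workers could consume from the same request queue to share the load.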