How Does the Milvus Vector Database Ensure Data Security?

In full consideration of your data security, user authentication and transport layer security (TLS) connection are now officially available in Milvus 2.1. Without user authentication, anyone can access all data in your vector database with SDK. However, starting from Milvus 2.1, only those with a valid username and password can access the Milvus vector database. In addition, in Milvus 2.1 data security is further protected by TLS, which ensures secure communications in a computer network.

This article aims to analyze how Milvus, the vector database ensures data security with user authentication and TLS connection and explain how you can utilize these two features as a user who wants to ensure data security when using the vector database.

What Is Database Security and Why Is It Important?

Database security refers to the measures taken to ensure that all data in the database are safe and kept confidential. Recent data breach and data leak cases at Twitter, Marriott, and Texas Department of Insurance, etc, makes us all the more vigilant to the issue of data security. All these cases constantly remind us that companies and businesses can suffer from severe loss if the data are not well protected and the databases they use are secure.

How Does the Milvus Vector Database Ensure Data Security?

In the current release of 2.1, the Milvus vector database attempts to ensure database security via authentication and encryption. More specifically, on the access level, Milvus supports basic user authentication to control who can access the database. Meanwhile, on the database level, Milvus adopts the transport layer security (TLS) encryption protocol to protect data communication.

User Authentication

The basic user authentication feature in the Milvus vector database supports accessing the vector database using a username and password for the sake of data security. This means clients can only access the Milvus instance upon providing an authenticated username and password.

The Authentication Workflow in the Milvus Vector Database

All gRPC requests are handled by the Milvus proxy; hence authentication is completed by the proxy. The workflow of logging in with the credentials to connect to the Milvus instance is as follows.

Create credentials for each Milvus instance, and the encrypted passwords are stored in etcd. Milvus uses bcrypt for encryption as it implements Provos and Mazières's adaptive hashing algorithm.
On the client side, SDK sends ciphertext when connecting to the Milvus service. The base64 ciphertext (<username>:<password>) is attached to the metadata with the key authorization.
The Milvus proxy intercepts the request and verifies the credentials.
Credentials are cached locally in the proxy.

Authentication Workflow

When the credentials are updated, the system workflow in the Milvus vector database is as follows

..Root coord is in charge of the credentials when insert, query, and delete APIs are called.
When you update the credentials because you forget the password, for instance, the new password is persisted in, etcd. Then all the old credentials in the proxy's local cache are invalidated.
The authentication interceptor looks for the records from local cache first. If the credentials in the cache is not correct, the RPC call to fetch the most updated record from root coord will be triggered. And the credentials in the local cache are updated accordingly.