Databases For Front-End Developers: The Concepts Under The Hood (Part 2)

In Part 1, The Rise Of Serverless Databases, of the “Databases For Front-End Developers” series, we talked about the hurdles and traps of scaling and maintaining your databases. We went from simpler and specialized alternatives like Content Management Systems and spreadsheets to self-hosted databases and, finally, to Serverless Databases.

Today, we go deeper into the rabbit hole. We will explore concepts to equip you to have your own opinions about which kinds of databases suit your specific needs. And this is important to stress up front: there is no right answer. Each database carries its own set of tradeoffs and advantages. If something looks like a “one-size-fits-all” solution, be careful: you might be missing something!

Anatomy Of A Database

Before we begin, it’s important to highlight that what we loosely refer to as “databases” are actually “Database Management Systems (DBMS).” A DBMS is a piece of software that enables the user to more ergonomically write, read, delete, or update information in a given set of data. For this series, we will focus mainly on Relational and Non-Relational DBMSs. There are many other types, all categorized by their data structures, but relational and non-relational are the most common for web development by any measure.

Both Relational (R) and Non-Relational (NR) DBMS have different terms for the parts that compose them. Such components are almost interchangeable in definition, and that’s why you can commonly hear a developer referring to a Document (NR term) as “Table,” which is its Relational equivalent structure. Don’t be afraid of confusing them; they appear often enough for this cognitive overload to disappear quickly with usage. Additionally, once you get more familiar with the differences of each data structure, you will realize they probably shouldn’t be used interchangeably because there are differences among them. But for now, and for the sake of simplicity, let’s focus on the similarities:

  • Schema
    It’s the description of the database (like a blueprint), describing the landscape of the database in a supported language. This is required by relational databases, and though not required by non-relational DBMS, many interfaces offer a possibility to define it too.
  • Tables (R/NR), Collections (NR)
    It’s the logical data structure, unsorted. A relatable example of this would be a spreadsheet table or a group (collection) of JSON objects.
  • Databases (R/NR)
    It’s the logical grouping of data. You group your tables in databases (user database, invoices databases) and your documents in indexes based on how you intend to query them.

Keys And Columns

To use a more familiar example, let’s take a JSON object as an example:

{
  "name": "Atila"
  "role": "DX Engineer at Xata"
}

Given that JSON, a column would be represented by the key-value pair (column name has value “Atila”), and the key follows the same meaning as the JSON spec, key name will access the value “Atila.”

A table has columns, and each column’s name determines the key with which you can access a value in a record.

In addition to the above definition, there are special kinds of keys. Such keys play a special role in the schema of your database and how you will interact with your data. Define these wisely: any minimal change to them can be considered a breaking change to your data layer.

  • Unique Keys (UK)
    The keys that are unique between records on a table. They also accept null.
  • Primary Keys (PK)
    It’s a special kind of Unique Key. There can only be one Primary Key per table, and it can never be null (Primary Keys are always considered required). Primary Keys are also indexed by default, helping with queries that want to filter on the value.
  • Foreign Keys (FK)
    When there is a relation between tables, the keys are represented as foreign keys at the extraneous table.

Example of Foreign Key: if each author in an authors table has posts (and posts is another table), post.title will be a FK at authors. And the junction between authors and posts is called a relation.

Mapping Your DBMS

Once you look at the different data structures and choose your DBMS, you are ready to draw the first connection from your data layer to your application layer. And suddenly, you noticed it isn’t quite straightforward to bring data from the database to your client-side (or even to your server-side API in some cases).

Here comes ORM (Object Relational Mapping) and ODM (Object Document Mapping) to assist your developer experience. Prisma is probably the most widely used ORM at the moment, while Mongoose has the largest ODM user base. It’s important to note they are not a requirement for connecting to databases. Still, if you keep an eye on how they build your queries (some specific cases can present performance issues on the account of the abstraction), they tend to make your life easier and fetching or writing data much more ergonomic.

When it comes to serverless databases, the need for them becomes a bit more questionable. And this is because many of these databases provide the users with an officially supported Software Development Kit (SDK). Your mileage will vary depending on the SDK, but they tend to have a big feature overlap with ORMs and ODMs, especially on databases that will keep the data layer behind an API (Xata, for example). This way, you won’t need to worry about translating your queries, and you can demand equivalent ergonomics between the SDK and an ORM.

Common Concepts

In this part of our series, we are learning what to look for when choosing the stack for our data layers. It is essential to understand common concepts around maintaining and choosing a DBMS (from here on out, we’re back into referring to them as “databases” to remain consistent with the rest of the world).

The next sections will not go so deep that you jump out and find a job as Database Administrator (DBA), but hopefully, it will offer you enough ammo to engage in conversation with experts and identify the best solution for your use cases. These concepts are common for every kind of data layer, from a spreadsheet to a self-hosted database and even to a serverless database. What will vary is how each solution will balance the variables intertwined in these paradigms.

As Thanos (from Marvel: Infinity Wars) would say: “perfectly balanced, like all things should be.”

The most important concepts, to begin with, are Consistency, Availability, and Partition Tolerance. They’re better understood if presented together because the balance between them will guide how predictable your data is in different contexts.

CAP Theorem

This theorem describes the relationship between 3 components in a distributed system: Consistency, Availability, and Partition tolerance (C.A.P). The overall conclusion is easy to summarize: any system is only able to contemplate 2 of these components at the same time. Though just a simple sentence, this idea requires a bit of unpacking.

Consistency (C)

Within the CAP Theorem constraints, “consistency” refers directly to the data. When different clients make the same request, they will get the same response. When a written request is accepted and confirmed, all users will have access to this updated information at the same time.

Availability (A)

Every request will receive a response with data. No errors. However, this comes with no promises on whether the data is up to date or not. Depending on whether this is paired with consistency (C) or partition tolerance (P), you will get different behavior.

Partition Tolerance (P)

A “partition” is when the connection between two nodes in a system is broken. The CAP Theorem essentially describes how a system will tolerate the partition: enforcing availability (AP system) or enforcing consistency (CP system). Regardless of how small a system is, partitions can always happen. Therefore there is no such thing as a CA system.

To provide a more graphical example of each system, we can consider:

  • AP System
    Another node has a copy of the data. The user will get information back but without promises of it being 100% correct (up-to-date). It’s commonly implemented with DynamoDB, Cassandra, and so on.
  • CP System
    To return an error and accept the transaction is impossible. It’s commonly implemented with traditional sharded DBs, Citus, for example.

Though popular and very often referred to, the CAP Theorem can be considered incomplete because it doesn’t consider situations beyond the network partition. Especially today, with high-availability content delivery networks (CDN), and extreme connectivity, it’s crucial to take into consideration latency and deeper aspects of consistency (linearizability and serializability).

Consistency Confusion

It’s funny how “consistency” is a term that switches meanings based on the context. So, in the CAP Theorem, as we just saw, it’s all about data and how reliably a user will get the same response to the same request regardless of which partition they reach. Once we take a step away from our own system, a new perspective makes us ask: “how consistently will we handle concurrent operations?”. And just like that, the CAP Theorem does not sufficiently describe the intricacies of operating at scale.

There is a third meaning of “consistency,” which we will save for later; it’s part of yet another acronym (ACID). If the last paragraph left you scratching your head, I can’t recommend enough “Inconsistent Thoughts on Database Consistency” by Alex DeBrie.

PACELC Theorem

The first three letters are the same as CAP, just reordered. The “consistency” in the PACELC Theorem goes deeper than in the CAP Theorem. It follows the Consistency Model, which determines the contract between a data store and a system. To make things less complicated, consider the PACELC an extension of the CAP Theorem.

Beyond asking the developer to strategize for the event of a network partition, the PACELC also considers what happens when the system has no partitions (healthy network).

ELC stands for Else: Latency or Consistency?

  • Latency: Can you accept an occasional stale response for the sake of performance?
  • Consistency:
    • Linearizability: Will you accept a higher latency to sync data in all nodes of a network before considering the transaction done?
    • Serializability: In the event of concurrent transactions on the same data, will you handle them in parallel or in a queue?

Because of this, I consider the PACELC carries a better mental model for this new era of Serverless Databases. When healthy, it’s not a scalar classification “AP” or “CP”; it accepts a spectrum because latency can be high or not, and consistency can have different levels as well.

Once we start talking about the types of databases and their data structures and which guarantees each can give, we will also talk about a bit of system architecture and how your architecture can reduce the tradeoffs when even by enforcing consistency, you can strive for a low latency scenario.

See You Next Time

With that, I believe we are ready to start narrowing down our discussions on the differences between each database type. Moreover, it’s possible to discuss and level expectations on each solution taken and analyze architectures individually. From now on, we will focus on Relational and Non-Relational databases.

In a few days, on our season finale, we will cover the differences between NoSQL and SQL: what guarantees to expect, what that means to your data, and how that affects development workflow. Then we will be ready to jump right into Serverless Databases and what to expect going forward. I can’t wait!

As usual, feel free to reach out to me with feedback, questions, and/or requests for the next part.

About the “Warning! WordPress Encrypts User Cookies” Error

Upgrading from older versions of WordPress is designed to go without a hitch, but depending on the setup and the two versions involved, you may encounter some hangups along the way. For example, if you are upgrading from a version of WordPress older than 3.0, eventually you may encounter the dreaded "Warning! WordPress Encrypts User Cookies" error. This quick DigWP tutorial explains what it is, why it happens, and how to fix the problem asap.

The error message

For those who are experiencing this "encrypted cookie" issue, the error message that's displayed looks something like this:

Warning! WordPress encrypts user cookies ...

This error happens when trying to log in or when you try to do things in the Admin Area. Basically you keep getting logged out for no apparent reason.

Why it happens

Fortunately there is an easy solution for the "WordPress Encrypts User Cookies" error. Open your site's wp-config.php file. Scroll down the file to just after the database credentials. Depending on your version of WordPress, you should find something like this:

/**#@+
 * Authentication Unique Keys and Salts.
 *

..followed by a set of 3, 4, or 8 (depending on WP version) constant definitions. For example, in the latest version of WordPress (5.0), there are EIGHT Unique Keys and Salts, waiting to be filled with random characters:

define('AUTH_KEY',         'put your unique phrase here');
define('SECURE_AUTH_KEY',  'put your unique phrase here');
define('LOGGED_IN_KEY',    'put your unique phrase here');
define('NONCE_KEY',        'put your unique phrase here');
define('AUTH_SALT',        'put your unique phrase here');
define('SECURE_AUTH_SALT', 'put your unique phrase here');
define('LOGGED_IN_SALT',   'put your unique phrase here');
define('NONCE_SALT',       'put your unique phrase here');

The problem is that the number of these keys has changed along with WordPress. For those with better things to do, here is a brief history:

WordPress < 2.6

WP 2.6 has no secret keys:

[ none ]

WordPress 2.6

WP 2.6 has three secret keys:

define('AUTH_KEY',        'put your unique phrase here');
define('SECURE_AUTH_KEY', 'put your unique phrase here');
define('LOGGED_IN_KEY',   'put your unique phrase here');

WordPress 2.7 — 2.9

WP 2.7 thru 2.9 have four secret keys:

define('AUTH_KEY',        'put your unique phrase here');
define('SECURE_AUTH_KEY', 'put your unique phrase here');
define('LOGGED_IN_KEY',   'put your unique phrase here');
define('NONCE_KEY',       'put your unique phrase here');

WordPress >= 3.0

Versions of WP greater than or equal to 3.0 have eight secret keys:

define('AUTH_KEY',         'put your unique phrase here');
define('SECURE_AUTH_KEY',  'put your unique phrase here');
define('LOGGED_IN_KEY',    'put your unique phrase here');
define('NONCE_KEY',        'put your unique phrase here');
define('AUTH_SALT',        'put your unique phrase here');
define('SECURE_AUTH_SALT', 'put your unique phrase here');
define('LOGGED_IN_SALT',   'put your unique phrase here');
define('NONCE_SALT',       'put your unique phrase here');

What does this mean? It means that when you upgrade from an older version of WordPress, the number of Unique Keys and Salts may not be the same. And so, if the latest version of WordPress is expecting eight secret-key constants, but your site's wp-config.php only contains four constants, you're gonna get the "WordPress Encrypts User Cookies" error.

The solution

To resolve the "encrypts cookie" error, you need to update your site's Unique Keys and Salts (secret keys), so as to provide the correct number of key constants. So if your old WP site only has three key constants, and you upgrade to WordPress 5.0, you will need to add the five missing constants (for a total of eight), so that WordPress can operate normally and without error.

Example: Upgrade from any version of WP, to the latest version of WP

If you are upgrading from any version of WP to the latest version, you can fix the error by simply replacing your existing secret keys with a brand new set. To do so, visit the WordPress Keys & Salts Generator, copy the results, and replace your existing keys with the freshly generated code. Then save changes, upload to your server and done. Once the new, complete set of keys is added, the encrypted-cookie error will disappear.

Other upgrade paths

As explained previously, your site's wp-config.php file should have the same number of constants that is expected by WordPress. Although ideally everyone everywhere always would update to the latest version of WordPress, we know that's just not a realistic expectation.

So for any other "non-latest" upgrade path that you may be taking, just make sure that your new version of WordPress has the correct number of secret keys defined. Check out the previous section for a list of WP versions and their respective number of Unique/Key salts.