How to Make GraphQL and DynamoDB Play Nicely Together

Serverless, GraphQL, and DynamoDB are a powerful combination for building websites. The first two are well-loved, but DynamoDB is often misunderstood or actively avoided. It’s often dismissed by folks who consider it only worth the effort “at scale.”

That was my assumption, too, and I tried to stick with a SQL database for my serverless apps. But after learning and using DynamoDB, I see the benefits of it for projects of any scale.

To show you what I mean, let’s build an API from start to finish — without any heavy Object Relational Mapper (ORM) or GraphQL framework to hide what is really going on. Maybe when we’re done you might consider giving DynamoDB a second look. I think it is worth the effort.

The main objections to DynamoDB and GraphQL

The main objection to DynamoDB is that it is hard to learn, but few people argue about its power. I agree the learning curve feels very steep. But SQL databases are not the best fit with serverless applications. Where do you stand up that SQL database? How do you manage connections to it? These things just don’t mesh with the serverless model very well. DynamoDB is serverless-friendly by design. You are trading the up-front pain of learning something hard to save yourself from future pain. Future pain that only grows if your application grows.

The case against using GraphQL with DynamoDB is a little more nuanced. GraphQL seems to fit well with relational databases partly because a lot of the documentation, tutorials, and examples assume one. Alex DeBrie is a DynamoDB expert who wrote The DynamoDB Book, which is a great resource for learning it in depth. Even he recommends against using the two together, mostly because GraphQL resolvers are often written as sequential, independent database calls that can result in excessive database reads.

Another potential problem is that DynamoDB works best when you know your access patterns beforehand. One of the strengths of GraphQL is that, by design, it handles arbitrary queries more easily than REST. This is more of a problem with a public API where users can write arbitrary queries. In reality, GraphQL is often used for private APIs where you control both the client and the server. In this case, you know and can control the queries you run. Keep in mind that with any GraphQL API, it is possible to write queries that clobber the database unless you take steps to prevent them.

A basic data model

For this example API, we will model an organization with teams, users, and certifications. The entity-relationship diagram is shown below. Each team has many users and each user can have many certifications.

Relational database model

Our end goal is to model this data in a DynamoDB table, but if we did model it in a SQL database, it would look like the following diagram:

To represent the many-to-many relationship of users to certifications, we add an intermediate table called “Credential.” The only unique attribute on this table is the expiration date. There would be other attributes for each of the tables, but we reduce it to just a name for each for simplicity.

Access patterns

The key to designing a data model for DynamoDB is to know your access patterns up front. In a relational database you start with normalized data and perform joins across the data to access it. DynamoDB does not have joins, so we build a data model that matches how we intend to access it. This is an iterative process. The goal is to identify the most frequent patterns to start. Most of these will directly map to a GraphQL query, but some may be only used internally to the back end to authenticate or check permissions, etc. An access pattern that is rarely used, like a check run once a week by an administrator, does not need to be designed. Something very inefficient (like a table scan) can handle these queries.

Most frequently accessed:

  • User by ID or name
  • Team by ID or name
  • Certification by ID or name

Frequently accessed:

  • All Users on a Team by Team ID
  • All Certifications for a given User
  • All Teams
  • All Certifications

Rarely accessed:

  • All Certifications of users on a Team
  • All Users who have a Certification
  • All Users who have a Certification on a Team

DynamoDB single table design

DynamoDB does not have joins and you can only query based on the primary key or predefined indexes. There is no set schema for items imposed by the database, so many different types of items can be stored in a single table. In fact, the recommended best practice for your data schema is to store all items in a single table so that you can access related items together with a single query. Below is a single table model representing our data. To design this schema, you take the access patterns above and choose attributes for the keys and indexes that match.

The primary key here is a composite of the partition/hash key (pk) and the sort key (sk). To retrieve an item in DynamoDB, you must specify the partition key exactly and either a single value or a range of values for the sort key. This allows you to retrieve more than one item if they share a partition key. The indexes here are shown as gsi1pk, gsi1sk, etc. These generic attribute names are used for the indexes (e.g. gsi1pk) so that the same index can be used to access different types of items with different access patterns. With a composite key, the sort key cannot be empty, so we use "#" as a placeholder when the sort key is not needed.

Access pattern                         Query conditions
Team, User, or Certification by ID     Primary Key, pk="T#"+ID, sk="#"
Team, User, or Certification by name   Index GSI 1, gsi1pk=type, gsi1sk=name
All Teams, Users, or Certifications    Index GSI 1, gsi1pk=type
All Users on a Team by ID              Index GSI 2, gsi2pk="T#"+teamID
All Certifications for a User by ID    Primary Key, pk="U#"+userID, sk="C#"+certID
All Users with a Certification by ID   Index GSI 1, gsi1pk="C#"+certID, gsi1sk="U#"+userID
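
To make the key conditions above more concrete, here is a minimal in-memory sketch of how a partition key plus a sort-key prefix selects related items. It does not call DynamoDB at all. The sample items, the attribute names un and exp, and the queryByKey helper are made up for illustration.

```javascript
// In-memory stand-in for the single table: every item carries pk and sk.
// All values below are invented sample data.
const items = [
  { pk: 'T#t_01', sk: '#', tn: 'North Team' },
  { pk: 'U#u_01', sk: '#', un: 'Ada' },
  { pk: 'U#u_01', sk: 'C#c_01', exp: '2025-01-01' },
  { pk: 'U#u_01', sk: 'C#c_02', exp: '2026-06-30' },
  { pk: 'U#u_02', sk: '#', un: 'Grace' },
];

// Mimics a key-condition query: the partition key must match exactly,
// and the sort key may be narrowed with a prefix. Results are ordered by sk.
function queryByKey(table, pk, skPrefix = '') {
  return table
    .filter(item => item.pk === pk && item.sk.startsWith(skPrefix))
    .sort((a, b) => (a.sk < b.sk ? -1 : a.sk > b.sk ? 1 : 0));
}

const user = queryByKey(items, 'U#u_01', '#');         // the user item only
const credentials = queryByKey(items, 'U#u_01', 'C#'); // the user's credential items
const everything = queryByKey(items, 'U#u_01');        // user and credentials together
```

Because the user and its credentials share a partition key, one query can return all of them, which is the main reason for keeping different item types in a single table.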

Database schema

We enforce the “database schema” in the application. The DynamoDB API is powerful, but also verbose and complicated. Many people jump directly to using an ORM to simplify it. Here, we will directly access the database using the helper functions below to create the schema for the Team item.

const DB_MAP = {
  TEAM: {
    get: ({ teamId }) => ({
      pk: 'T#'+teamId,
      sk: '#',
    }),
    put: ({ teamId, teamName }) => ({
      pk: 'T#'+teamId,
      sk: '#',
      gsi1pk: 'Team',
      gsi1sk: teamName,
      _tp: 'Team',
      tn: teamName,
    }),
    parse: ({ pk, tn, _tp }) => {
      if (_tp === 'Team') {
        return {
          id: pk.slice(2),
          name: tn,
        };
      } else return null;
    },
    queryByName: ({ teamName }) => ({
      IndexName: 'gsi1pk-gsi1sk-index',
      ExpressionAttributeNames: { '#p': 'gsi1pk', '#s': 'gsi1sk' },
      KeyConditionExpression: '#p = :p AND #s = :s',
      ExpressionAttributeValues: { ':p': 'Team', ':s': teamName },
      ScanIndexForward: true,
    }),
    queryAll: {
      IndexName: 'gsi1pk-gsi1sk-index',
      ExpressionAttributeNames: { '#p': 'gsi1pk' },
      KeyConditionExpression: '#p = :p ',
      ExpressionAttributeValues: { ':p': 'Team' },
      ScanIndexForward: true,
    },
  },
  parseList: (list, type) => {
    if (Array.isArray(list)) {
      return list.map(i => DB_MAP[type].parse(i));
    }
    if (Array.isArray(list.Items)) {
      return list.Items.map(i => DB_MAP[type].parse(i));
    }
  },
};

To put a new team item in the database you call:

DB_MAP.TEAM.put({teamId:"t_01",teamName:"North Team"})

This forms the index and key values that are passed to the database API. The parse method takes an item from the database and translates it back to the application model.
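
The resolvers later in this article also refer to USER and CREDENTIAL entries that are not shown. A rough sketch of what they might look like, following the access pattern table, is below. The object name DB_MAP_SKETCH, the attribute names un and exp, the gsi2sk value, the composite credential id, and the gsi2pk-gsi2sk-index name are all assumptions, not taken from the example repo.

```javascript
// Hypothetical companions to the TEAM entry. Attribute names (un, exp),
// the gsi2sk value, and the gsi2 index name are assumptions.
const DB_MAP_SKETCH = {
  USER: {
    get: ({ userId }) => ({ pk: 'U#' + userId, sk: '#' }),
    put: ({ userId, userName, teamId }) => ({
      pk: 'U#' + userId,
      sk: '#',
      gsi1pk: 'User',
      gsi1sk: userName,
      gsi2pk: 'T#' + teamId, // lets GSI 2 list all users on a team
      gsi2sk: 'U#' + userId,
      _tp: 'User',
      un: userName,
    }),
    parse: ({ pk, un, _tp }) =>
      _tp === 'User' ? { id: pk.slice(2), name: un } : null,
    queryByTeamId: ({ teamId }) => ({
      IndexName: 'gsi2pk-gsi2sk-index',
      ExpressionAttributeNames: { '#p': 'gsi2pk' },
      KeyConditionExpression: '#p = :p',
      ExpressionAttributeValues: { ':p': 'T#' + teamId },
      ScanIndexForward: true,
    }),
  },
  CREDENTIAL: {
    put: ({ userId, certId, expiration }) => ({
      pk: 'U#' + userId, // same partition as the user item
      sk: 'C#' + certId,
      gsi1pk: 'C#' + certId,
      gsi1sk: 'U#' + userId,
      _tp: 'Credential',
      exp: expiration,
    }),
    parse: ({ pk, sk, exp, _tp }) =>
      _tp === 'Credential'
        ? { id: pk.slice(2) + '_' + sk.slice(2), expiration: exp }
        : null,
    // All credentials for a user: exact pk, sort key starting with "C#"
    queryByUserId: ({ userId }) => ({
      ExpressionAttributeNames: { '#p': 'pk', '#s': 'sk' },
      KeyConditionExpression: '#p = :p AND begins_with(#s, :s)',
      ExpressionAttributeValues: { ':p': 'U#' + userId, ':s': 'C#' },
      ScanIndexForward: true,
    }),
  },
};
```

The credential item lives in the user's partition, so the begins_with condition on the sort key returns only that user's credentials and skips the user item itself.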

GraphQL schema

type Team {
  id: ID!
  name: String
  members: [User]
}
type User {
  id: ID!
  name: String
  team: Team
  credentials: [Credential]
}
type Certification {
  id: ID!
  name: String
}
type Credential {
  id: ID!
  user: User
  certification: Certification
  expiration: String
}
type Query {
  team(id: ID!): Team
  teamByName(name: String!): [Team]
  user(id: ID!): User
  userByName(name: String!): [User]
  certification(id: ID!): Certification
  certificationByName(name: String!): [Certification]
  allTeams: [Team]
  allCertifications: [Certification]
  allUsers: [User]
}

Bridging the gap between GraphQL and DynamoDB with resolvers

Resolvers are where a GraphQL query is executed. You can get a long way in GraphQL without ever writing a resolver. But to build our API, we’ll need to write some. For each query in the GraphQL schema above there is a root resolver below (only the team resolvers are shown here). This root resolver returns either a promise or an object with part of the query results.

If the query returns a Team type as the result, then execution is passed down to the Team type resolver. That resolver has a function for each of the values in a Team. If there is no resolver for a given value (e.g. id), it will look to see if the root resolver already passed it down.
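
The fallback described above can be pictured with a small sketch. This is a simplified stand-in for what a GraphQL server does, not the actual implementation of any library; resolveField, teamResolvers, and the sample team are hypothetical.

```javascript
// Simplified stand-in for how a server picks a field's value:
// use the type resolver if one exists, otherwise read the value
// that the parent resolver already passed down.
function resolveField(typeResolvers, fieldName, parent, args, ctx) {
  const resolver = typeResolvers[fieldName];
  if (typeof resolver === 'function') {
    return resolver(parent, args, ctx);
  }
  return parent[fieldName];
}

// Hypothetical Team type resolver with a function for members only.
const teamResolvers = {
  members: parent => ['u_01', 'u_02'].map(id => ({ id, teamId: parent.id })),
};

// What a root resolver might have returned for the team.
const team = { id: 't_01', name: 'North Team' };

const teamId = resolveField(teamResolvers, 'id', team, {}, {});        // read from parent
const members = resolveField(teamResolvers, 'members', team, {}, {});  // resolver is called
```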

A query takes four arguments. The first, called root or parent, is an object passed down from the resolver above with any partial results. The second, called args, contains the arguments passed to the query. The third, called context, can contain anything the application needs to resolve the query. In this case, we add a reference for the database to the context. The final argument, called info, is not used here. It contains more details about the query (like an abstract syntax tree).

In the resolvers below, ctx.db.singletable is the reference to the DynamoDB table that contains all the data. The get and query methods execute directly against the database, and the DB_MAP.TEAM helpers translate the schema to the database using the functions we wrote earlier. The parse method translates the data back to the form needed for the GraphQL schema.

const resolverMap = {
  Query: {
    team: (root, args, ctx, info) => {
      return ctx.db.singletable.get(DB_MAP.TEAM.get({ teamId: args.id }))
        .then(data => DB_MAP.TEAM.parse(data));
    },
    teamByName: (root, args, ctx, info) => {
      return ctx.db.singletable
        .query(DB_MAP.TEAM.queryByName({ teamName: args.name }))
        .then(data => DB_MAP.parseList(data, 'TEAM'));
    },
    allTeams: (root, args, ctx, info) => {
      return ctx.db.singletable.query(DB_MAP.TEAM.queryAll)
        .then(data => DB_MAP.parseList(data, 'TEAM'));
    },
  },
  Team: {
    name: (root, _, ctx) => {
      if (root.name) {
        return root.name;
      } else {
        return ctx.db.singletable.get(DB_MAP.TEAM.get({ teamId: root.id }))
          .then(data => DB_MAP.TEAM.parse(data).name);
      }
    },
    members: (root, _, ctx) => {
      return ctx.db.singletable
        .query(DB_MAP.USER.queryByTeamId({ teamId: root.id }))
        .then(data => DB_MAP.parseList(data, 'USER'));
    },
  },
  User: {
    name: (root, _, ctx) => {
      if (root.name) {
        return root.name;
      } else {
        return ctx.db.singletable.get(DB_MAP.USER.get({ userId: root.id }))
          .then(data => DB_MAP.USER.parse(data).name);
      }
    },
    credentials: (root, _, ctx) => {
      return ctx.db.singletable
        .query(DB_MAP.CREDENTIAL.queryByUserId({ userId: root.id }))
        .then(data => DB_MAP.parseList(data, 'CREDENTIAL'));
    },
  },
};

Now let’s follow the execution of the query below. First, the team root resolver reads the team by id and returns id and name. Then the Team type resolver reads all the members of that team. Then the User type resolver is called for each user to get all of their credentials and certifications. If there are five members on the team and each member has five credentials, that results in a total of seven reads for the database. You could argue that is too many. In a SQL database this might be reduced to four database calls. I would argue that the seven DynamoDB reads will be cheaper and faster than the four SQL reads in many cases. But this comes with a big dose of “it depends” on a lot of factors.

query { team( id:"t_01" ){
  id
  name
  members{
    id
    name
    credentials{
      id
      certification{
        id
        name
      }
    }
  }
}}
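
The count of seven reads for this query can be tallied with a short sketch. It assumes, as the count implies, that each credentials query returns everything needed for the nested certification fields, so no extra read is made per credential. The countReads function is hypothetical.

```javascript
// Tally of database reads for the query above, per resolver.
// Assumes nested certification fields need no extra reads.
function countReads(memberCount) {
  const teamRead = 1;                  // Query.team: get the team item
  const membersRead = 1;               // Team.members: one query for all members
  const credentialReads = memberCount; // User.credentials: one query per member
  return teamRead + membersRead + credentialReads;
}

const readsForFiveMembers = countReads(5); // 1 + 1 + 5 = 7
```

Note that the number of credentials per member does not change the total, because each member's credentials come back from a single query.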

Over-fetching and the N+1 problem

Optimizing a GraphQL API involves balancing a whole lot of tradeoffs that we won’t get into here. But two that weigh heavily in the decision of DynamoDB versus SQL are over-fetching and the N+1 problem. In many ways, these are opposite sides of the same coin. Over-fetching is when a resolver requests more data from the database than it needs to respond to the query. This often happens when you try to make one call to the database in the root resolver or a type resolver (e.g., members in the Team type resolver above) to get as much of the data as you can. If the query did not request the name attribute, it can be seen as wasted effort.

The N+1 problem is almost the opposite. If all the reads are pushed down to the lowest level resolver, then the team root resolver and the members resolver (for Team type) would make only a minimal or no request to the database. They would just pass the IDs down to the Team type and User type resolver. In this case, instead of members making one call to get all five members, it would push down to User to make five separate reads. This would result in potentially 36 or more separate reads for the query above. In practice, this does not happen because an optimized server would use something like the DataLoader library that acts as a middleware to intercept those 36 calls and batch them into probably only four calls to the database. These smaller atomic read requests are needed so that the DataLoader (or similar tool) can efficiently batch them into fewer reads.
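
The batching idea can be sketched in a few lines. This is not the DataLoader API: DataLoader schedules its batches automatically, while this hypothetical createBatcher needs an explicit flush() call to keep the example short. The batch function is assumed to return results in the same order as the keys it receives.

```javascript
// Hypothetical mini-batcher: collects individual load requests and
// answers all of them with a single call to batchFn.
function createBatcher(batchFn) {
  const pending = new Map(); // key -> resolve callbacks waiting on that key
  return {
    // Register interest in a key; duplicate keys share one lookup.
    load(key) {
      return new Promise(resolve => {
        if (!pending.has(key)) pending.set(key, []);
        pending.get(key).push(resolve);
      });
    },
    // Issue one batched read for every distinct key requested so far.
    flush() {
      const keys = [...pending.keys()];
      if (keys.length === 0) return 0;
      const results = batchFn(keys); // assumed: same order as keys
      keys.forEach((key, i) => {
        pending.get(key).forEach(resolve => resolve(results[i]));
      });
      pending.clear();
      return keys.length;
    },
  };
}

// Usage: three small reads become one call to the (fake) database.
let batchCalls = 0;
const userBatcher = createBatcher(ids => {
  batchCalls += 1;
  return ids.map(id => ({ id }));
});
userBatcher.load('u_01');
userBatcher.load('u_02');
userBatcher.load('u_01'); // duplicate, no extra key in the batch
const distinctKeys = userBatcher.flush();
```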

So, to optimize a GraphQL API with SQL, it is usually best to have small resolvers at the lowest levels and use something like DataLoader to optimize them. But for a DynamoDB API it is better to have "smarter" resolvers higher up that better match the access patterns your single table database is written for. The over-fetching that results in this case is usually the lesser of the two evils.

Deploy this example in 60 seconds

This is where you realize the full payoff of using DynamoDB together with serverless GraphQL. I built this example with Architect. It is an open-source tool to build serverless apps on AWS without most of the headaches of directly using AWS. Once you clone the repo and run npm install, you can launch the app for local development (including a built-in local version of the database) with a single command. Not only that, you can also deploy it straight to production infrastructure (including DynamoDB) on AWS with a single command when you are ready.


The post How to Make GraphQL and DynamoDB Play Nicely Together appeared first on CSS-Tricks.


A Complete State Machine Made With HTML Checkboxes and CSS

State machines are typically expressed on the web in JavaScript and often through the popular XState library. But the concept of a state machine is adaptable to just about any language, including, amazingly, HTML and CSS. In this article, we’re going to do exactly that. I recently built a website that included a “no client JavaScript” constraint and I needed one particular unique interactive feature.

The key to all this is using <form> and <input type="radio"> elements to hold a state. That state is toggled or reset with another radio <input> or reset <button> that can be anywhere on the page because it is connected to the same <form> tag. I call this combination a radio reset controller, and it is explained in more detail at the end of the article. You can add more complex state with additional form/input pairs.

It’s a little bit like the Checkbox Hack in that, ultimately, the :checked selector in CSS will be doing the UI work, but this is logically more advanced. I end up using a templating language (Nunjucks) in this article to keep it manageable and configurable.

Traffic light state machine

Any state machine explanation must include the obligatory traffic light example. Below is a working traffic light that uses a state machine in HTML and CSS. Clicking "Next" advances the state. The code in this Pen is post-processed from the state machine template to fit in a Pen. We'll get into the code in a more readable fashion later on.

Hiding/Showing table information

Traffic lights aren’t the most practical every-day UI. How about a <table> instead?

There are two states (A and B) that are changed from two different places in the design and that affect changes all over the UI. This is possible because the empty <form> elements and <input> elements that hold state are at the very top of the markup, so their state can be deduced with general sibling selectors and the rest of the UI can be reached with descendant selectors. There is a loose coupling of UI and markup here, meaning we can change the state of almost anything on the page from anywhere on the page.

General four-state component

Diagram of a generic four-state finite state machine

The goal is a general-purpose component to control the desired state of the page. "Page state" here refers to the desired state of the page and "machine state" refers to the internal state of the controller itself. The diagram above shows this generic state machine with four states (A, B, C and D). The full controller state machine for this is shown below. It is built using three of the radio reset controller bits. Adding three of these together forms a state machine that has eight internal machine states (three independent radio buttons that are either on or off).

Diagram of the controller’s internal states

The "machine states" are written as a combination of the three radio buttons (e.g. M001 or M101). To transition from the initial M111 to M011, the radio button for that bit is unset by clicking on another radio <input> in the same group. To transition back, the reset <button> for the <form> attached to that bit is clicked, which restores the default checked state. Although this machine has eight total states, only certain transitions are possible. For instance, there is no way to go directly from M111 to M100 because it requires two bits to be flipped. But if we fold these eight states into four states so that each page state shares two machine states (e.g. A shares states M111 and M000), then there is a single transition from any page state to any other page state.
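
The folding claim can be checked with a few lines of JavaScript. The sketch below assumes each page state is a machine state paired with its bitwise complement, as in the A example (M111 and M000). The pageOf and reachablePages helpers are hypothetical and exist only for this check.

```javascript
// A page state is a machine state paired with its complement
// (e.g. M111 and M000). Use the smaller number as the pair's label.
const pageOf = m => Math.min(m, m ^ 0b111);

// Page states reachable from machine state m by flipping exactly one bit.
function reachablePages(m) {
  return new Set([0b100, 0b010, 0b001].map(bit => pageOf(m ^ bit)));
}

// For every machine state, the three single-bit flips should land on
// the three other page states, with no repeats and no self-transition.
const foldWorks = [...Array(8).keys()].every(m => {
  const pages = reachablePages(m);
  return pages.size === 3 && !pages.has(pageOf(m));
});
```

Two single-bit flips can never land in the same pair, because that would require the two flipped bits to differ in all three positions. That is why one flip is always enough to reach any other page state.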

Reusable four-state component

For reusability, the component is built with Nunjucks template macros. This allows it to be dropped into any page to add a state machine with the desired valid states and transitions. There are four required sub-components:

  • Controller
  • CSS logic
  • Transition controls
  • State classes

Controller

The controller is built with three empty form tags and three radio buttons. Each radio button is checked by default. Each button is connected to one of the forms, and they are independent of each other, each with its own radio group name. These inputs are hidden with display: none because they are not directly changed or seen. The states of these three inputs comprise the machine state, and this controller is placed at the top of the page.

{% macro FSM4S_controller()%}
  <form id="rrc-form-Bx00"></form>
  <form id="rrc-form-B0x0"></form>
  <form id="rrc-form-B00x"></form>
  <input data-rrc="Bx00" form="rrc-form-Bx00" style="display:none" type="radio" name="rrc-Bx00" checked="checked" />
  <input data-rrc="B0x0" form="rrc-form-B0x0" style="display:none" type="radio" name="rrc-B0x0" checked="checked" />
  <input data-rrc="B00x" form="rrc-form-B00x" style="display:none" type="radio" name="rrc-B00x" checked="checked" />
{% endmacro %}

CSS logic

The logic that connects the controller above to the state of the page is written in CSS. The Checkbox Hack uses a similar technique to control sibling or descendant elements with a checkbox. The difference here is that the button controlling the state is not tightly coupled to the element it is selecting. The logic below selects based on the “checked” state of each of the three controller radio buttons and any descendant element with class .M000. This state machine hides any element with the .M000 class by setting display: none !important. The !important isn’t a vital part of the logic here and could be removed; it just prioritizes the hiding from being overridden by other CSS.

{%macro FSM4S_css()%}
<style>
  /* Hide M000 (A1) */
  input[data-rrc="Bx00"]:not(:checked)~input[data-rrc="B0x0"]:not(:checked)~input[data-rrc="B00x"]:not(:checked)~* .M000  {
    display: none !important;
  }

  /* one section for each of 8 Machine States */

</style>
{%endmacro%}
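
The remaining seven rules follow the same shape as the M000 rule. As a rough sketch, they could be generated with a small script like the one below. It assumes the digits in a machine state name map, left to right, to the Bx00, B0x0, and B00x inputs, with 1 meaning checked. The hideRule function is hypothetical and not part of the original template.

```javascript
// Controller inputs, in the order the digits of a machine state are read.
const BITS = ['Bx00', 'B0x0', 'B00x'];

// Build the rule that hides .M<state> elements while the controller
// is in that machine state. state is a string such as '101'.
function hideRule(state) {
  const chain = BITS.map((bit, i) => {
    const checked = state[i] === '1' ? ':checked' : ':not(:checked)';
    return `input[data-rrc="${bit}"]${checked}`;
  }).join('~');
  return `${chain}~* .M${state} { display: none !important; }`;
}

// One rule for each of the eight machine states, M000 through M111.
const allRules = [...Array(8).keys()]
  .map(n => n.toString(2).padStart(3, '0'))
  .map(hideRule)
  .join('\n');
```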

Transition control

Changing the state of the page requires a click or keystroke from the user. To change a single bit of the machine state, the user clicks on a radio button that is connected to the same form and radio group of one of the bits in the controller. To reset it, the user clicks on a reset button for the form connected to that same radio button. The radio button or the reset button is only shown depending on which state they are in. A transition macro for any valid transition is added to the HTML. There can be multiple transitions placed anywhere on the page. All transitions for states currently inactive will be hidden.

{%macro AtoB(text="B",class="", classBtn="",classLbl="",classInp="")%}
  <label class=" {{class}} {{classLbl}} {{showM111_A()}} "><input class=" {{classInp}} " form="rrc-form-Bx00" type="radio" name="rrc-Bx00" />{{text}}</label>
  <button class=" {{class}} {{classBtn}} {{showM000_A1()}} " type="reset" form="rrc-form-Bx00">{{text}}</button>
{%endmacro%}

State class

The three components above are sufficient on their own, but any element that depends on state must carry the classes that hide it during every other state, and that gets messy. The following macros are used to simplify that process. If a given element should be shown only in state A, the {{showA()}} macro adds the states to hide.

{%macro showA() %}
  M001 M010 M100 M101 M110 M011
{%endmacro%}
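
The class list in each show macro is simply every machine state that does not belong to that page state. A small sketch of that rule is below. The A, B, and C pairs are read from the showA macro and the traffic light markup in this article, the D pair is inferred as the two remaining states, and the showClasses function is hypothetical.

```javascript
// Machine states that make up each page state. D is inferred as the
// two states left over once A, B, and C are assigned.
const PAGE_STATES = {
  A: ['M111', 'M000'],
  B: ['M011', 'M100'],
  C: ['M101', 'M010'],
  D: ['M110', 'M001'],
};

// Classes for an element shown only in the given page state:
// every machine state outside that page state hides the element.
function showClasses(page) {
  const active = new Set(PAGE_STATES[page]);
  return Object.values(PAGE_STATES)
    .flat()
    .filter(state => !active.has(state))
    .join(' ');
}

const classesForA = showClasses('A'); // the six states outside A
```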

Putting it all together

The markup for the traffic light example is shown below. The template macros are imported in the first line of the file. The CSS logic is added to the head and the controller is at the top of the body. The state classes are on each of the lights of the .traffic-light element. The lit signal has a {{showA()}} macro while the "off" version of the signal has the .M000 and .M111 classes to hide it in the A state. The state transition button is at the bottom of the page.

{% import "rrc.njk" as rrc %}
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <title>Traffic Light State Machine Example</title>
  <link rel="stylesheet" href="styles/index.processed.css">
  {{rrc.FSM4S_css()}}
</head>
<body>
  {{rrc.FSM4S_controller()}}
  <div>
    <div class="traffic-light">
      <div class="{{rrc.showA()}} light red-light on"></div>
      <div class="M111 M000 light red-light off"></div>
      <div class="{{rrc.showB()}} light yellow-light on"></div>
      <div class="M100 M011 light yellow-light off"></div>
      <div class="{{rrc.showC()}} light green-light on"></div>
      <div class="M010 M101 light green-light off"></div>
    </div>
    <div>
      <div class="next-state">
        {{rrc.AtoC(text="NEXT", classInp="control-input",
          classLbl="control-label",classBtn="control-button")}}
        {{rrc.CtoB(text="NEXT", classInp="control-input",
          classLbl="control-label",classBtn="control-button")}}
        {{rrc.BtoA(text="NEXT", classInp="control-input",
          classLbl="control-label",classBtn="control-button")}}
      </div>
    </div>
  </div>
</body>
</html>

Extending to more states

The state machine component here includes up to four states which is sufficient for many use cases, especially since it’s possible to use multiple independent state machines on one page.

That said, this technique can be used to build a state machine with more than four states. The table below shows how many page states can be built by adding additional bits. Notice that an even number of bits does not collapse efficiently, which is why three and four bits are both limited to four page states.

Bits (rrcs)   Machine states   Page states
1             2                2
2             4                2
3             8                4
4             16               4
5             32               6

Radio reset controller details

The trick to being able to show, hide, or control an HTML element anywhere on the page without JavaScript is what I call a radio reset controller. With three tags and one line of CSS, the controlling button and controlled element can be placed anywhere after this controller. The controlled side uses a hidden radio button that is checked by default. That radio button is connected to an empty <form> element by an ID. That form has a type="reset" button and another radio input that together make up the controller.

<!-- RRC Controller -->
<form id="rrc-form"></form>
<label>
  Show
  <input form="rrc-form" type="radio" name="rrc-group" />
</label>
<button type="reset" form="rrc-form">Hide</button>

<!-- Controlled by RRC -->
<input form="rrc-form" class="hidden" type="radio" name="rrc-group" checked />
<div class="controlled-rrc">Controlled from anywhere</div>

This shows a minimal implementation. The hidden radio button and the div it controls need to be siblings, but that input is hidden and never needs to be directly interacted with by the user. It is set by a default checked value, cleared by the other radio button, and reset by the form reset button.

input[name='rrc-group']:checked + .controlled-rrc {
  display: none;
}
.hidden {
  display: none;
}

Only two CSS rules are required to make this work. The :checked pseudo-class connects the hidden input to the sibling it is controlling. The radio input and reset button on the controller side can be styled as a single toggle, which is shown in the following Pen:

Accessibility… should you do this?

This pattern works, but I am not suggesting it should be used everywhere for everything. In most cases, JavaScript is the right way to add interactivity to the web. I realize that posting this might get some heat from accessibility and semantic markup experts. I am not an accessibility expert, and implementing this pattern may create problems. Or it may not. A properly labelled button that does something to the page controlled by otherwise-hidden inputs might work out fine. Like anything else in accessibility land: testing is required.

Also, I have not seen anyone else write about how to do this and I think the knowledge is useful — even if it is only appropriate in rare or edge-case situations.


The post A Complete State Machine Made With HTML Checkboxes and CSS appeared first on CSS-Tricks.
