Migrate a SQL Server Database to CockroachDB

The fastest way to get started with SQL Server is via the available Docker containers. I’m using the following tutorial to deploy SQL Server on Ubuntu from my Mac. My SQL Server-Fu is a bit rusty, so I opted to follow this tutorial to restore the WideWorldImporters sample database into my Docker container. You may also need the SQL Server tools installed on your host; you can find directions for macOS and Linux at the following site, and Windows users will already be familiar with the download location for their OS.

I also used the following directions to install the SQL Server tools on my Mac but ran into compatibility issues with the drivers in my Docker container. That will be a debugging session for another day.

Run SQL Server in Docker

Pull the SQL Server image:

docker pull mcr.microsoft.com/mssql/server:2019-GA-ubuntu-16.04
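
After the pull, the container can be started along these lines. The container name and the SA password below are my own placeholders (SQL Server insists on a strong SA password), so adjust to taste:

```shell
docker run -d --name sql1 \
  -e "ACCEPT_EULA=Y" \
  -e "SA_PASSWORD=StrongPassw0rd!" \
  -p 1433:1433 \
  mcr.microsoft.com/mssql/server:2019-GA-ubuntu-16.04
```

With the container up, `sqlcmd` from the host (or inside the container) can connect on `localhost,1433` to restore the sample database.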


CockroachDB TIL: Volume 4

Topics

Topic 1: Reset Network and Execution Statistics in CLI

This topic covers the unforeseen consequences that a trivial misconfiguration, such as mismatched client and server versions, can lead to. This is one of those situations where a user reported missing information in their CLI output. In this case, they were missing the execution and network latency, which we conveniently output after every query.

 
artem@free-tier.gcp-us-central1.cockroachlabs.cloud:26257/defaultdb> select * from results limit 1;
                   id                  |   city    |  type   |               owner_id               |    creation_time    | status |   current_location    |       ext
---------------------------------------+-----------+---------+--------------------------------------+---------------------+--------+-----------------------+-------------------
  aaaaaaaa-aaaa-4800-8000-00000000000a | amsterdam | scooter | c28f5c28-f5c2-4000-8000-000000000026 | 2019-01-02 03:04:05 | in_use | 62609 Stephanie Route | {"color": "red"}
(1 row)

Time: 49ms total (execution 2ms / network 48ms)

Specifically, the missing piece was the last line of the output above: the timing summary.
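
As an aside, the timing line is governed by a client-side option in the SQL shell, so it is worth checking whether it was simply switched off. A quick sketch (the option name is taken from the CLI docs and may vary by version, so verify with `\set` on your client):

```sql
-- inside the cockroach sql shell: list client-side options
\set
-- re-enable the per-query timing line if it was turned off
\set show_times
```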

CockroachDB TIL: Volume 3


Topic 1: Force Lookup Join Instead of Merge or Hash Join

To make the optimizer prefer lookup joins over merge or hash joins when performing foreign key checks, set the prefer_lookup_joins_for_fks session variable to on (merge joins are the default for single-row inserts, and hash joins are likely to appear with more rows). There are times when a lookup join will be more efficient than the plan chosen by the optimizer. The lesson here is to test all available options and not just accept the defaults!

set prefer_lookup_joins_for_fks=on;
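
To see the effect, here is a quick sketch with hypothetical parent/child tables (the exact plan output shape varies by version):

```sql
-- hypothetical tables linked by a foreign key
CREATE TABLE customers (id INT PRIMARY KEY);
CREATE TABLE orders (id INT PRIMARY KEY, customer_id INT REFERENCES customers (id));

SET prefer_lookup_joins_for_fks = on;

-- the FK check in this plan should now show a lookup join
EXPLAIN INSERT INTO orders VALUES (1, 42);
```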

How To Use CockroachDB With Your Django Application

This tutorial is intended to be a quick ramp-up on CockroachDB with Django. In case you're searching for a proper Django tutorial, this is not it. At the time of writing, the django-cockroachdb library is available in two versions (2 and 3). This tutorial will cover version 3, and is inspired by the Digital Ocean tutorial on using Django with PostgreSQL. I am going to highlight the steps where this tutorial differs from the original. For everything else, we will assume the original tutorial is followed as is.

I originally wrote this post two years ago and have since updated it, as CockroachDB RBAC is no longer an enterprise-only feature, so we can skip that step. I'm also including steps to launch a Docker instance to make this more portable and comprehensive.
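
The key Django-side difference is the database backend. A minimal sketch of the `DATABASES` entry in `settings.py`, assuming a local cluster (the database name, user, and password below are placeholders; 26257 is CockroachDB's default SQL port):

```python
# settings.py sketch: point Django at CockroachDB via the
# django-cockroachdb backend instead of the stock PostgreSQL one.
DATABASES = {
    "default": {
        "ENGINE": "django_cockroachdb",  # the backend this tutorial covers
        "NAME": "myproject",             # placeholder database name
        "USER": "myprojectuser",         # placeholder user
        "PASSWORD": "password",          # placeholder password
        "HOST": "localhost",
        "PORT": "26257",                 # CockroachDB's default SQL port
    }
}
```

Everything else in the Digital Ocean tutorial (migrations, admin, runserver) proceeds unchanged against this backend.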

CockroachDB TIL: Volume 1

This is a new series of articles covering short "Today I Learned" topics as I peel back the layers of CockroachDB. It is meant to resemble release notes. I decided to change up the format for these posts, as each topic may not justify an entire blog post. Link here for past articles.

Topics

Topic 1: Unit Testing With CockroachDB

I get this question a lot: "What is the best way to unit test CockroachDB as part of application development?" Depending on the use case, there are several approaches we can take. There is the venerable [cockroach demo](https://www.cockroachlabs.com/docs/stable/cockroach-demo.html), which has been my trusty go-to tool when I needed something in a hurry. It comes with a one-hour license for enterprise features, which you can also unit test, but it also comes with certain limitations. For example, there is reported latency at table creation, the ports until recently were random and not configurable, and by default it loads sample data, which you can overcome with a flag. The goal of cockroach demo is to quickly spin up a working environment for demo purposes, not necessarily something designed for unit testing.
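
A minimal sketch of taming cockroach demo for test runs, using the flags alluded to above (flag names are current as of recent releases; verify against your version with `cockroach demo --help`):

```shell
# skip loading the movr sample dataset and pin the ports,
# so test harnesses can rely on a stable connection string
cockroach demo --no-example-database --sql-port=26257 --http-port=8080
```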

Three-Headed Dog Meets Cockroach: CockroachDB With MIT Kerberos

CockroachDB is a cloud-native distributed database that works across various cloud, hybrid, and on-premise environments. This flexibility of deployment demands varying degrees of security protocols. Most of the time, on-premise customers won't accept anything less than Kerberos as the authentication mechanism for their systems of record. In my Hadoop days, that was the bare minimum requirement to play. CockroachDB today supports Kerberos authentication via GSSAPI.

In this post, I'm going to walk you through setting up Kerberos for CockroachDB and provide a sort of cheat sheet to make this process more seamless. I'm using a single CentOS VM provisioned with Vagrant. It serves as my KDC as well as my CockroachDB instance. On to the setup. The following documents may assist in the entire process: CockroachDB GSSAPI, how to install CockroachDB, and configuring CockroachDB for secure access. I recorded the entire process with Asciinema and split the screencast into two parts.
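
The heart of the KDC side is creating a service principal for the database and exporting it to a keytab. A sketch of those steps (the hostname, realm, and keytab path are placeholders for my VM; CockroachDB's GSSAPI support expects the `postgres` service name, per the docs linked above):

```shell
# on the KDC: create a service principal for the database host
kadmin.local -q "addprinc -randkey postgres/crdb.example.com@EXAMPLE.COM"

# export it to a keytab the cockroach process can read
kadmin.local -q "ktadd -k /var/lib/cockroach/crdb.keytab postgres/crdb.example.com@EXAMPLE.COM"
```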

Connect JetBrains DataGrip to a Secure CockroachDB Cluster Running in Docker

CockroachDB is a cloud-native SQL database for building global, scalable cloud services that survive disasters. CockroachDB is built with containers as first-class citizens. It works well in bare-metal, VM, and container environments like Kubernetes, Docker Swarm, and HashiCorp Nomad. One of my customers has a requirement to connect JetBrains DataGrip to a secure cluster. This short tutorial will step through the process of wiring this together using a multi-node CockroachDB cluster running in Docker, a load balancer container based on HAProxy, and an early access version of DataGrip.

High-Level Steps

  • Start a three-node CockroachDB cluster in Docker
  • Connect DataGrip to the secure cluster
  • Verify Setup
  • Clean up

Step-by-Step Instructions

Start CockroachDB in Docker. You can follow the directions in our docs for an insecure cluster, or use my repo to deploy a secure cluster with Docker Compose.
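
Since CockroachDB speaks the PostgreSQL wire protocol, DataGrip connects with its stock PostgreSQL driver. A sketch of what the JDBC URL ends up looking like for the secure cluster (the certificate paths and database name are placeholders for my setup; note the JDBC driver wants the client key in PKCS#8 format):

```
jdbc:postgresql://localhost:26257/defaultdb?sslmode=verify-full&sslrootcert=/certs/ca.crt&sslcert=/certs/client.root.crt&sslkey=/certs/client.root.pk8
```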

Working With Multi-Level JSON in CockroachDB

Motivation

I had a customer inquiring whether CockroachDB is capable of working with multi-level JSON. Considering their JSON would have up to three levels of hierarchy, they wanted to know whether CockroachDB can use its native capabilities to access data multiple levels down. This prompted my interest and led to this tutorial. As it turns out, CockroachDB does not impose any limit on the number of levels in the hierarchy, and performance can be improved using various optimizations, also discussed below.
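
A quick sketch of what three levels of nesting look like in practice, using a hypothetical table and CockroachDB's standard JSONB operators (`->` descends a level, `->>` extracts text):

```sql
-- hypothetical table with a JSONB column
CREATE TABLE events (id UUID PRIMARY KEY DEFAULT gen_random_uuid(), payload JSONB);

INSERT INTO events (payload) VALUES ('{"car": {"engine": {"cylinders": 6}}}');

-- walk three levels down the hierarchy
SELECT payload->'car'->'engine'->>'cylinders' FROM events;

-- an inverted index on the column can speed up containment queries
CREATE INVERTED INDEX ON events (payload);
```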

Start a Single Node Instance With Max SQL Memory Flag and Connect to It

Shell
 
cockroach start-single-node --max-sql-memory=.25 --insecure --background
cockroach sql --insecure --host=localhost


Recover From a Disaster Using a userfile Backup

CockroachDB supports enterprise-grade backups and restores using object storage targets. For local development, the userfile scheme was introduced to allow fast prototyping without the heavy burden of spinning up costly infrastructure. A customer requested the ability to quickly validate whether they could recover from a complete teardown of a cluster without using object storage. This is my experience and my current workaround to get this to work.

Motivation

This tutorial takes a short detour compared to my other articles due to a specific requirement: we need to recover from a local disaster into a new cluster. userfile is very helpful, but it assumes the cluster is up and the restore can proceed as intended. However, when you intend to shut down the source cluster and remove all of the underlying storage, along with the user space where userfile stores its backups, userfile stops being useful. In that case, we have to download the backups out of the user space prior to shutting down the cluster and removing the data volumes.
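
A sketch of the workaround described above (the backup path and connection URL are placeholders, and the BACKUP syntax varies somewhat by version): take the backup into userfile space, then pull the files down to the host before tearing the cluster down.

```shell
# back up into the cluster's userfile space
cockroach sql --url "$CRDB_URL" \
  -e "BACKUP INTO 'userfile://defaultdb.public.userfiles/mybackup';"

# download the backup files to the local filesystem
# before shutting the cluster down and removing its volumes
cockroach userfile get 'userfile://defaultdb.public.userfiles/mybackup' --url "$CRDB_URL"
```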

Recover From an Oops With CockroachDB

This tutorial demonstrates the ability to recover from accidental truncate or table drop.

Motivation

This scenario came out of a real-world situation where I got paged in the early hours of the day by a contractor evaluating our product on behalf of a customer. The contractor was in a panic, concerned that she'd just lost four days' worth of work by executing an older version of her application that included a hardcoded TRUNCATE statement. There was no backup available either. We were faced with a situation where we were going to miss the customer's timeline and lose precious time reloading the data; we are talking about a multi-TB dataset. Once I had some coffee and we huddled to discuss next steps, we agreed the best option was to leverage a CockroachDB capability where dropped data does not leave the filesystem until garbage collection is complete, which by default is set to 25 hours. Luckily for us, that property had not been changed in the cluster. Needless to say, we were able to recover the lost data following the procedure below.
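
The idea hinges on time-travel queries: within the GC window, the pre-truncate data is still readable with AS OF SYSTEM TIME. A minimal sketch with a hypothetical table name (the timestamp must fall before the truncate but inside the GC window):

```sql
-- read the table as it was an hour ago, before the accidental
-- TRUNCATE, and materialize that state into a new table
CREATE TABLE results_recovered AS
  SELECT * FROM results AS OF SYSTEM TIME '-1h';
```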

Using CockroachDB Storage Attributes for Heterogeneous Data

Motivation

CockroachDB is a cloud-native SQL database for building global, scalable cloud services that survive disasters. It makes building geo-distributed databases easy. The ability to anchor data to geographic localities is a unique capability of CockroachDB. Today, I will demonstrate how this ability extends beyond regions, availability zones, nodes, data centers, and racks: the architecture allows us to domicile data on specific storage devices using storage attributes.

High-Level Steps

  • Start a CockroachDB cluster with multiple disks per node using storage attributes
  • Create a table pinned to a specific disk type
  • Create another table and verify its location on disk
  • Evict the second table to demonstrate granularity
  • Assign proper storage constraints deterministically
  • Verify Setup

Step-by-Step Instructions

Shell
 
export storepath="${PWD}/workdir"
mkdir -p "${storepath}"
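
The high-level steps above can be sketched roughly as follows (store paths and attribute names are my own placeholders): start a node with multiple stores tagged with attributes, then pin a table to one of them with a zone config constraint.

```shell
# two stores under the workdir, each tagged with a storage attribute
cockroach start-single-node --insecure --background \
  --store=path="${storepath}/ssd",attrs=ssd \
  --store=path="${storepath}/hdd",attrs=hdd

# pin a table's replicas to the ssd-tagged store via a zone config
cockroach sql --insecure -e \
  "CREATE TABLE IF NOT EXISTS fast (id INT PRIMARY KEY);
   ALTER TABLE fast CONFIGURE ZONE USING constraints='[+ssd]';"
```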


Using PGBouncer With CockroachDB

PGBouncer is a lightweight connection pooler for PostgreSQL. CockroachDB is a cloud-native SQL database for building global, scalable cloud services that survive disasters.

CockroachDB is a PostgreSQL wire-compatible database, which means it aims for tight compatibility with the PG ecosystem. Today, we're going to wire PGBouncer up to work with CockroachDB. This article is meant to scratch the surface of the possibilities unlocked by PGBouncer with CockroachDB, not to be an in-depth overview. We're currently researching this topic and will follow up with official docs on the proper architecture and sizing of PGBouncer with CockroachDB.
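
Because CockroachDB speaks the PG wire protocol on port 26257, a minimal pgbouncer.ini can simply treat it as a PostgreSQL backend. A sketch (database name, ports, and the permissive auth_type are placeholders for local experimentation only):

```ini
; minimal pgbouncer.ini sketch: front a local CockroachDB node
[databases]
defaultdb = host=localhost port=26257 dbname=defaultdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = trust       ; for local experiments only; use auth in production
pool_mode = session
```

Clients then connect to PGBouncer on 6432 with their usual PostgreSQL drivers, and PGBouncer multiplexes those connections onto the cluster.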