CockroachDB TIL: Volume 5

CockroachDB TIL: Volume 5


This is my series of articles covering short "Today I learned" topics as I work with CockroachDB.


Previous articles


Topics

Topic 1 Multi-row updates

There's a multi-row insert, upsert and delete. I've not heard of multi-row update. To my surprise, Postgresql support multi-row update with the following syntax: UPDATE FROM. It sends several rows in a single batch, thereby improving performance and reducing network hops.

CREATE TABLE t1 (key INT PRIMARY KEY, val1 STRING, val2 STRING);

INSERT INTO t1 
    (key, val1, val2) 
VALUES 
    (1, 'a', 'b'),
    (2, 'c', 'd'),
    (3, 'e', 'f');

SELECT * FROM t1;
  key | val1 | val2
------+------+-------
    1 | a    | b
    2 | c    | d
    3 | e    | f
UPDATE t1 AS t SET
  val1 = t2.val1,
  val2 = t2.val2
FROM (VALUES
  (1, 'aa', 'bb'),
  (2, 'cc', 'dd'),
  (3, 'ee', 'ff')
) as t2(key, val1, val2)
WHERE t2.key = t.key;
  key | val1 | val2
------+------+-------
    1 | aa   | bb
    2 | cc   | dd
    3 | ee   | ff

It took a few tries to get it right, the following Stack Overflow post had helped to get it right.

The following syntax also works, in other words, aliasing is optional and referecing the table by the actual name

UPDATE t1 
SET
  val1 = t2.val1,
  val2 = t2.val2
FROM (VALUES
  (1, 'aaa', 'bba'),
  (2, 'ccc', 'ddd'),
  (3, 'eee', 'fff')
) as t2(key, val1, val2)
WHERE t2.key = t1.key;
  key | val1 | val2
------+------+-------
    1 | aaa  | bba
    2 | ccc  | ddd
    3 | eee  | fff

Topic 2 Batching statements

Taking example from the previous topic, if the goal is to reduce the number of hops, sending entire batch of updates on a single line also reduces network hops. Single line statement is sent to the gateway as a single batch and CRDB will implicitly batch the entire code block.

  UPDATE t1 SET val1 = 'aaaa', val2 = 'bbbb'
  WHERE key = 1; UPDATE t1 SET val1 = 'cccc', val2 = 'dddd'
  WHERE key = 2; UPDATE t1 SET val1 = 'eeef', val2 = 'ffff'
  WHERE key = 3;
artem@free-tier.gcp-us-central1.cockroachlabs.cloud:26257/defaultdb> UPDATE t1 SET val1 = 'aaaa', val2 = 'bbbb'
  WHERE key = 1; UPDATE t1 SET val1 = 'cccc', val2 = 'dddd'
  WHERE key = 2; UPDATE t1 SET val1 = 'eeef', val2 = 'ffff'
  WHERE key = 3;
UPDATE 1

Note: timings for multiple statements on a single line are not supported. See https://go.crdb.dev/issue-v/48180/v21.1.

As opposed to sending each at a time:

  UPDATE t1 SET val1 = 'aaaaa', val2 = 'bbbbb'
  WHERE key = 1;
  UPDATE t1 SET val1 = 'ccccc', val2 = 'ddddd'
  WHERE key = 2;
  UPDATE t1 SET val1 = 'eeeee', val2 = 'fffff'
  WHERE key = 3;
artem@free-tier.gcp-us-central1.cockroachlabs.cloud:26257/defaultdb> UPDATE t1 SET val1 = 'aaaaa', val2 = 'bbbbb'
  WHERE key = 1;
  UPDATE t1 SET val1 = 'ccccc', val2 = 'ddddd'
  WHERE key = 2;
  UPDATE t1 SET val1 = 'eeeee', val2 = 'fffff'
  WHERE key = 3;

UPDATE 1

Time: 47ms total (execution 5ms / network 41ms)

UPDATE 1

Time: 44ms total (execution 6ms / network 38ms)

UPDATE 1

Time: 43ms total (execution 6ms / network 38ms)

Topic 3 Inspect the binary SST files in CockroachDB

CockroachDB is backed by Pebble storage engine. We recently made a switch from RocksDB to our own RocksDB implementation written in Go. I leave it to you to read the blog above to learn why. The main point of this section is the underlying data in RocksDB and Pebble is stored in files with .sst extension. If you try to browse the file with any standard utility, you won't see much as the file is binary. There are various sst viewers available and we have own our! It is built into the binary, remember CockroachDB comes with all the batteries included! The best part is you can read these files with cluster offline.

I have a CockroachDB directory with some data, let's take a look

cd ./cockroach-data/
tree | grep *.sst
├── 000021.sst

The following command will help us introspect the file

cockroach debug pebble sstable scan 000021.sst --count 10
000021.sst
/Local/RangeID/1/r/RangeGCThreshold/0,0#0,SET : EMPTY

/Local/RangeID/1/r/RangeAppliedState/0,0#0,SET raft_applied_index:123 lease_applied_index:32 range_stats:<last_update_nanos:1642197117283621000 gc_bytes_age:42086 live_bytes:1944 live_count:46 key_bytes:899 key_count:46 val_bytes:1784 val_count:61 sys_bytes:291 sys_count:6 > raft_closed_timestamp:<wall_time:1642197114400256000 > 
/Local/RangeID/1/r/RangeLease/0,0#0,SET repl=(n1,s1):1 seq=1 start=0,0 exp=1642197146.792966000,0 pro=1642197137.792966000,0
/Local/RangeID/1/r/RangePriorReadSummary/0,0#0,SET {Txn:<nil> Timestamp:0,0 Deleted:false KeyBytes:0 ValBytes:0 RawBytes:[65 103 5 187 3 10 2 10 0 18 2 10 0] IntentHistory:[] MergeTimestamp:<nil> TxnDidNotUpdateMeta:<nil>}
/Local/RangeID/1/r/RangeVersion/0,0#0,SET 21.2
/Local/RangeID/1/u/RaftHardState/0,0#0,SET term:6 vote:1 commit:123 
/Local/RangeID/1/u/RaftLog/logIndex:11/0,0#0,SET Term:6 Index:11 : EMPTY

/Local/RangeID/1/u/RaftLog/logIndex:12/0,0#0,SET Term:6 Index:12  by lease #0
replicated_eval_result:<write_timestamp:<> delta:<> > 
write batch:
<nil>

/Local/RangeID/1/u/RaftLog/logIndex:13/0,0#0,SET Term:6 Index:13  by lease #0
closed_timestamp:<> replicated_eval_result:<state:<lease:<start:<> expiration:<wall_time:1642196984800220000 > replica:<node_id:1 store_id:1 replica_id:1 > deprecated_start_stasis:<wall_time:1642196984800220000 > proposed_ts:<wall_time:1642196975800220000 > sequence:1 acquisition_type:Request > raft_closed_timestamp:<> > is_lease_request:true write_timestamp:<wall_time:1642196975801033000 > delta:<sys_bytes:76 sys_count:1 > prior_read_summary:<local:<low_water:<> > global:<low_water:<> > > > logical_op_log:<> 
write batch:
Put: 0,0 /Local/RangeID/1/r/RangeLease (0x01698972726c6c2d00): repl=(n1,s1):1 seq=1 start=0,0 exp=1642196984.800220000,0 pro=1642196975.800220000,0
Put: 0,0 /Local/RangeID/1/r/RangePriorReadSummary (0x016989727270727300): {Txn:<nil> Timestamp:0,0 Deleted:false KeyBytes:0 ValBytes:0 RawBytes:[65 103 5 187 3 10 2 10 0 18 2 10 0] IntentHistory:[] MergeTimestamp:<nil> TxnDidNotUpdateMeta:<nil>}

/Local/RangeID/1/u/RaftLog/logIndex:14/0,0#0,SET Term:6 Index:14  by lease #1
proposer_lease_sequence:1 max_lease_index:1 closed_timestamp:<wall_time:1642196974801675000 > replicated_eval_result:<write_timestamp:<wall_time:1642196977801415000 > delta:<sys_bytes:57 sys_count:1 > > logical_op_log:<> 
write batch:
Put: 0,0 /Local/Range/Min/QueueLastProcessed/"consistencyChecker" (0x016b120001716c7074636f6e73697374656e6379436865636b657200): {Txn:<nil> Timestamp:0,0 Deleted:false KeyBytes:0 ValBytes:0 RawBytes:[112 221 93 101 3 8 240 156 197 200 181 174 144 229 22] IntentHistory:[] MergeTimestamp:<nil> TxnDidNotUpdateMeta:<nil>}

The explanation of the entire contents of the file is beyond the scope of this article but one helpful application of this capability is to confirm native at rest encryption is functioning properly.

I have some data loaded into this cluster, let's look for any data containing "New York"

cockroach debug pebble sstable scan 000021.sst | grep "New York" | tail -n 10
/Table/52/1/"\xfeD\xe7\x1f\x90;H\x00\xbf\"s\x8f\xc8\x1d\xea\xc4"/0/1642197103.541702000,0#0,SET "\xa8\xf5&\xb7\n&\x02NY\x16\rNew York City\x13\xbc\xe9\x81\b"
/Table/52/1/"\xfe{\x90,\xdb\xebH\x00\xbf=\xc8\x16m\xf5\xc4\x00"/0/1642196996.701019000,0#0,SET "\xe5\xa0O\xa7\n&\x02NY\x16\rNew York City\x13\xa4\xa6\xb6\x02"
/Table/52/1/"\xfe\xe9\xec\xdc\xc7KH\x00\xbft\xf6nc\xa5\xeaR"/0/1642196996.701019000,0#0,SET "\xf5\xb2\xce\xf3\n&\x02NY\x16\rNew York City\x13\xa2\xfe\x8a\x04"
/Table/52/1/"\xff\x00\xb7\xdd~\x01H\x00\xbf\x80[\xee\xbf\x00\xfc\xf5"/0/1642197080.834041000,0#0,SET "\xa7\xf9/W\n&\x02NY\x16\rNew York City\x13\x92\xb3\xc2\x04"
/Table/52/1/"\xff\x1a\xae\xfb\xe9\xbdH\x00\xbf\x8dW}\xf4\u07ba$"/0/1642197066.200311000,0#0,SET "\xab\xe6\xdaV\n&\x02NY\x16\rNew York City\x13\xee\xb2\xda\x01"
/Table/52/1/"\xffB\x1d8\x01\xef@\x00\xbf\xa1\x0e\x9c\x00\xf7\xb6\xe9"/0/1642197003.063291000,0#0,SET "\x01\x91\x15\xcc\n&\x02NY\x16\rNew York City\x13\xfa\xa6\xe2\x01"
/Table/52/1/"\xffX\xbaّb@\x00\xbf\xac]lȱ>\x10"/0/1642196996.701019000,0#0,SET "\xbb\xfb\f\xd2\n&\x02NY\x16\rNew York City\x13\x86\x92\xab\x04"
/Table/52/1/"\xff\xc60\xef+AH\x00\xbf\xe3\x18w\x95\xa0\xd4C"/0/1642196996.701019000,0#0,SET "\xfc!\xf2\xd6\n&\x02NY\x16\rNew York City\x13\xa2\xb9\xe8\a"
/Table/52/1/"\xff\xe8>\x1c2\xde@\x00\xbf\xf4\x1f\x0e\x19o8\xa6"/0/1642197052.306718000,0#0,SET "\x9f\x80l\xf6\n&\x02NY\x16\rNew York City\x13Æ£\xcb\x02"
/Table/52/1/"\xff\xf7n\x8b\xb5\xda@\x00\xbf\xfb\xb7E\xda\xed)4"/0/1642197109.857052000,0#0,SET "-\xd1:\xec\n&\x02NY\x16\rNew York City\x13\xaa\x91\x93\x03"

Topic 4 Save Docker containers as files

This topic is not particularly about CockroachDB as much as the ecosystem. Once in a while you get customer inquiries about strict compliance rules preventing these organizations from executing typical workflows we take for granted. Case in point, a financial services organization has rigid guidelines for container image lifecycle. In a nutshell, they have to go through a process of fetching images from public registries. Instead they maintain their own regulated registry which they populate with images their internal team deems safe. This particular customer has no means of pulling our Kubernetes Operator. There are several ways you can fetch our Kubernetes artifacts, either through the Docker Hub or using our github repo. For the sake of accelerating path to success for this customer as well as my genunine curiosity, I decided to see what are the possibilities. Little did I know that this capability is built-in and fairly straightfoward. The magic is in docker save command. I was able to quickly stitch together a runbook for this particular customer to follow and at the same time learn a new trick with Docker! Without further ado:

docker pull cockroachdb/cockroach-operator:v2.5.0
docker save cockroachdb/cockroach-operator:v2.5.0 | gzip > cockroach-operator.tar.gz
docker pull cockroachdb/cockroach:v21.2.4
docker save cockroachdb/cockroach:v21.2.4 | gzip > cockroachdb.tar.gz

The first command pulls the tagged version of our operator. The second command saves the container and pipes it as stdout to gzip utility which then compresses the image to a file. Since we're going to need CockroachDB for this to work as well, I repeat the steps for that container.

So now that you exported the images to a file, the way to load them back is with docker load command.

docker image load -i cockroachdb.tar.gz
e8228e50fe18: Loading layer    107MB/107MB
4716779a2c02: Loading layer  20.48kB/20.48kB
2da1a510186d: Loading layer  16.61MB/16.61MB
1a93f3a5fad3: Loading layer  5.632kB/5.632kB
0dc09568edcd: Loading layer  234.8MB/234.8MB
bbcfddf195ea: Loading layer  153.1kB/153.1kB
884cf86fb787: Loading layer   12.6MB/12.6MB
Loaded image: cockroachdb/cockroach:v21.2.4
docker image load -i cockroach-operator.tar.gz
d77bd8c80675: Loading layer  85.11MB/85.11MB
0237df4acf88: Loading layer  20.48kB/20.48kB
c82cef4821d2: Loading layer  211.5MB/211.5MB
d74de946f1f0: Loading layer  51.91MB/51.91MB
Loaded image: cockroachdb/cockroach-operator:v2.5.0

Topic 5 Evaluate healthcheck options for CockroachDB

Remember my TIL 3 post on pg_isready? Well, as part of my investigation of Unleash and CockroachDB I stumbled on the Unleash docker compose file where they use a healthcheck for Postgresql.

  db:
    expose:
      - "5432"
    image: postgres:13
    environment:
      POSTGRES_DB: "db"
      POSTGRES_HOST_AUTH_METHOD: "trust"
    healthcheck:
      test: ["CMD", "pg_isready", "--username=postgres", "--host=127.0.0.1", "--port=5432"]
      interval: 2s
      timeout: 1m
      retries: 5
      start_period: 10s

Since we know pg_isready works with CockroachDB, wouldn't it be nice to use it for healthcheck with containers as well? The problem stems from our CockroachDB image not having pg_isready installed. We can build a new image that includes cockroach binary as well as pg_isready but that goes against the mantra having slim images. We do however expose a health check endpoint we can use in our healthcheck. I created a quick docker compose file to illustrate the scenario

version: '3.9'

services:

  roach-0:
    container_name: roach-0
    hostname: roach-0
    image: cockroachdb/cockroach:v21.2.4
    command: start-single-node --insecure
    ports:
      - 26257:26257
      - 8080:8080
    healthcheck:
      test: ["CMD", "curl", "http://roach-0:8080/health?ready=1"]
      interval: 2s
      timeout: 1m
      retries: 5
      start_period: 10s

you can now monitor your containers and leverage healthchecks with them.

Comments

Popular posts from this blog

Running CockroachDB with Docker Compose and Minio, Part 2

VirtualBox options to start VM in Normal, Detached and Headless Modes

Digsby is bringing out a Linux and Mac client very soon