CockroachDB TIL: Volume 6

This is my series of articles covering short "Today I learned" topics as I work with CockroachDB.
Previous articles
Topics
- Topic 1: Cockroach Init container
- Topic 2: Chain column's schema changes into a single statement
- Topic 3: Turn off fsync (UNSAFE)
- Topic 4: Use
pgpasswith Javapgjdbcdriver - Topic 5: Avoid cruft with
cockroach demo
Topic 1: Cockroach Init container
Use init container to initialize Cockroach and exit, you no longer need to explicitly run init when you bring up CockroachDB.
version: '3.9'
services:
roach-0:
container_name: roach-0
hostname: roach-0
image: cockroachdb/cockroach:v21.2.4
command: start --logtostderr=WARNING --log-file-verbosity=WARNING --insecure --join=roach-0,roach-1,roach-2 --listen-addr=roach-0:26257 --advertise-addr=roach-0:26257 --max-sql-memory=.25 --cache=.25
ports:
- 26257:26257
- 8080:8080
healthcheck:
test: ["CMD", "curl", "http://roach-0:8080/health?ready=1"]
interval: 2s
timeout: 1m
retries: 5
start_period: 10s
roach-1:
container_name: roach-1
hostname: roach-1
image: cockroachdb/cockroach:v21.2.4
command: start --logtostderr=WARNING --log-file-verbosity=WARNING --insecure --join=roach-0,roach-1,roach-2 --listen-addr=roach-1:26257 --advertise-addr=roach-1:26257 --max-sql-memory=.25 --cache=.25
ports:
- 26258:26257
- 8081:8080
healthcheck:
test: ["CMD", "curl", "http://roach-1:8080/health?ready=1"]
interval: 2s
timeout: 1m
retries: 5
start_period: 10s
roach-2:
container_name: roach-2
hostname: roach-2
image: cockroachdb/cockroach:v21.2.4
command: start --logtostderr=WARNING --log-file-verbosity=WARNING --insecure --join=roach-0,roach-1,roach-2 --listen-addr=roach-2:26257 --advertise-addr=roach-2:26257 --max-sql-memory=.25 --cache=.25
ports:
- 26259:26257
- 8082:8080
healthcheck:
test: ["CMD", "curl", "http://roach-2:8080/health?ready=1"]
interval: 2s
timeout: 1m
retries: 5
start_period: 10s
init:
image: cockroachdb/cockroach:v21.2.4
command: init --insecure --host=roach-0
So when docker compose comes up, you can observe the following
oach-1 | *
roach-1 | * INFO: initial startup completed.
roach-1 | * Node will now attempt to join a running cluster, or wait for `cockroach init`.
roach-1 | * Client connections will be accepted after this completes successfully.
roach-1 | * Check the log file(s) for progress.
roach-1 | *
5-init-1 | Cluster successfully initialized
roach-0 | W220121 20:03:55.536091 72 2@gossip/gossip.go:1486 ⋮ [n?] 4 no incoming or outgoing connections
5-init-1 exited with code 0
roach-2 | W220121 20:03:55.781539 48 2@gossip/gossip.go:1486 ⋮ [n?] 4 no incoming or outgoing connections
roach-0 | CockroachDB node starting at 2022-01-21 20:03:55.7840506 +0000 UTC (took 1.1s)
roach-0 | build: CCL v21.2.4 @ 2022/01/10 18:50:15 (go1.16.6)
roach-0 | webui: http://roach-0:8080
roach-0 | sql: postgresql://root@roach-0:26257/defaultdb?sslmode=disable
roach-0 | sql (JDBC): jdbc:postgresql://roach-0:26257/defaultdb?sslmode=disable&user=root
roach-0 | RPC client flags: /cockroach/cockroach <client cmd> --host=roach-0:26257 --insecure
specifically
5-init-1 | Cluster successfully initialized
5-init-1 exited with code 0
Topic 2: Chain column's schema changes into a single statement
Considering the following two schema:
CREATE TABLE tbl (
key UUID DEFAULT gen_random_uuid() PRIMARY KEY,
col1 STRING,
col2 STRING
);
change statements, it is more efficient to execute the two statements on a single line. In that way, CockroachDB will execute the statements in a single hop.
ALTER TABLE tbl ALTER COLUMN col1 SET NOT NULL;
ALTER TABLE tbl ALTER COLUMN col1 SET DEFAULT '';
You may wrap the two statements in an explicit transaction and gain some efficiency in terms of executing the batch as a single block.
BEGIN;
ALTER TABLE tbl ALTER COLUMN col1 SET NOT NULL;
ALTER TABLE tbl ALTER COLUMN col1 SET DEFAULT '';
COMMIT;
Perhaps there's a better way, we can rewrite the schema change statement as a single statement and add the subsequent column changes including varying columns separated by comma.
ALTER TABLE tbl ALTER COLUMN col1 SET NOT NULL, ALTER COLUMN col1 SET DEFAULT '', ALTER COLUMN col2 SET NOT NULL;
Topic 3: Turn off fsync (UNSAFE)
I was on the fence to include the next tip as there are severe implications in production environments but for local development where data is not relevant, it can yield some efficiency and performance. CockroachDB relies on fsync to flush data on disk, no surprise it is also slow. Considering the example from topic 2:
ALTER TABLE tbl ALTER COLUMN col1 SET NOT NULL, ALTER COLUMN col1 SET DEFAULT '', ALTER COLUMN col2 SET NOT NULL;
output of the command:
root@:26257/test> ALTER TABLE tbl ALTER COLUMN col1 SET NOT NULL, ALTER COLUMN col1 SET DEFAULT '', ALTER COLUMN col2 SET NOT NULL;
ALTER TABLE
Time: 382ms total (execution 382ms / network 0ms)
The execution stats for this statement do not indicate any slowness. Considering I'm using a Macbook, our engineering suggested to turn off fsync, and also considering the implications!
SET CLUSTER SETTING kv.raft_log.disable_synchronization_unsafe = true;
root@:26257/test> SET CLUSTER SETTING kv.raft_log.disable_synchronization_unsafe = true;
SET CLUSTER SETTING
Time: 111ms total (execution 111ms / network 0ms)
root@:26257/test> ALTER TABLE tbl ALTER COLUMN col1 SET NOT NULL, ALTER COLUMN col1 SET DEFAULT '', ALTER COLUMN col2 SET NOT NULL;
ALTER TABLE
Time: 29ms total (execution 29ms / network 0ms)
And things are instantly much faster. Again, please use this carefully! You can read more about it here
Topic 4: Use pgpass with Java pgjdbc driver
Pgjdbc release 42.3.0 added an option to read the contents of pgpass file and or environment variable PGPASSFILE. That makes me very happy as any chance I can secure a password from prying eyes is cool in my book.
What we will need is to set up pgpass file, you can find instructions in my 2nd TIL article, topic 2. At this point, we need a Java program. Let's use the sample Java app in our docs. Specifically, we're going to use the JDBC/Serverless steps. These steps will also walk you through setting up a serverless cluster we are going to use for this topic.
git clone https://github.com/cockroachlabs/example-app-java-jdbc/
cd example-app-java-jdbc
git checkout cockroachcloud
Go to the initialize the database step and run the command pointing to your cluster
cat app/src/main/resources/dbinit.sql | cockroach sql --url "<connection-string>"
I saved the connection string in an environment variable $COCKROACH_URL for simplicity, as described in the TIL volume 2, topic 1.
cat app/src/main/resources/dbinit.sql | cockroach sql --url $COCKROACH_URL
Before we run the code, we need to update the connection parameters. Unlike the tutorial in the docs, we're going to intentionally leave out the password property.
ds.setServerNames(new String[]{"free-tier14.aws-us-east-1.cockroachlabs.cloud"});
ds.setPortNumbers(new int[]{26257});
ds.setDatabaseName("artem-serverless-480.bank");
ds.setSsl(true);
ds.setUser("artem");
//ds.setPassword("{password}");
Finally, we need to update the dependency file to use the latest pgjdbc driver, which at the time of writing is 42.3.3.
you can find the file in ./example-app-java-jdbc/app/build.gradle.
dependencies {
// This dependency is used by the application.
implementation 'com.google.guava:guava:29.0-jre'
- implementation 'org.postgresql:postgresql:42.2.18'
+ implementation 'org.postgresql:postgresql:42.3.3'
}
At this point, we're ready to run the code. The driver will either read the contents of the pgpass file, property -Dorg.postgresql.pgpassfile=filename or PGPASSFILE= environment variable. Let me show you each way:
This is a gradle project, passing an environment variable to the project looks like so:
PGPASSFILE=/tmp/.pgpass ./gradlew run
I'm using a temporary directory to illustrate the point, by default, the driver will read the location of $HOME/.pgpass.
> Task :app:run
com.cockroachlabs.BasicExampleDAO.updateAccounts:
'INSERT INTO accounts (id, balance) VALUES ('fc4da00e-1240-4578-9bdb-aaf873c61ba8', 250)'
com.cockroachlabs.BasicExampleDAO.updateAccounts:
'INSERT INTO accounts (id, balance) VALUES ('e469ce3a-2352-47a7-b79e-a8f7fa3eb6ab', 1000)'
BasicExampleDAO.updateAccounts:
=> 2 total updated accounts
main:
=> Account balances at time '17:32:20.237880':
ID 1 => $1000
ID 2 => $250
com.cockroachlabs.BasicExampleDAO.transferFunds:
'UPSERT INTO accounts (id, balance) VALUES('2d89869e-f274-4d21-b538-b2440c4f94fd', ((SELECT balance FROM accounts WHERE id = '2d89869e-f274-4d21-b538-b2440c4f94fd') - 100)),('75473622-9889-4cfb-9c32-30dba19ae116', ((SELECT balance FROM accounts WHERE id = '75473622-9889-4cfb-9c32-30dba19ae116') + 100))'
So far so good, let's run this with the Java argument, for that we're going to edit that same build.gradle file and add a new line
application {
// Define the main class for the application.
mainClass = 'com.cockroachlabs.BasicExample'
applicationDefaultJvmArgs = ['-Djdk.tls.client.protocols=TLSv1.2']
+ applicationDefaultJvmArgs = ['-Dorg.postgresql.pgpassfile=/tmp/.pgpass']
}
Run again to verify
./gradlew run
Finally, remove the argument and place the pgpass file back in $HOME. Once done, run the code again without passing any arguments. It should complete successfully.
For posterity, this is what it would look like if we were unable to read the password from the file
> Task :app:run
BasicExampleDAO.runSQL ERROR: { state => 08004, cause => null, message => The server requested password-based authentication, but no password was provided by plugin null
Topic 5: Avoid cruft with cockroach demo
cockroach demo is an invaluable tool assisting in demo of CockroachDB capabilities. Another method of spinning up an in-memory instance of CockroachDB without persisting any data is using --store=type=mem command. My gripe with both options is that when you spin up either, they leave residual files in the directory. With demo, you will find inflight_trace_dump and in case of --store=type=mem you will additionally find heap_profiler and goroutine_dump. The way to avoid creation of those directories is to set your directory read only and both methods will still work.
mkdir readwrite && cd readwrite
cockroach start-single-node --insecure --store=type=mem,size=0.25 --advertise-addr=localhost --background
in another terminal, issue ls -ltra readwrite
drwx------@ 464 artem staff 14848 Mar 7 15:28 ..
drwxr-x--- 2 artem staff 64 Mar 7 16:43 inflight_trace_dump
drwxr-x--- 2 artem staff 64 Mar 7 16:43 goroutine_dump
drwxr-xr-x 5 artem staff 160 Mar 7 16:43 .
drwxr-x--- 6 artem staff 192 Mar 7 16:44 heap_profiler
You can try the same with demo and you will notice inflight_trace_dump directory.
Stop the process, pkill cockroach and let's now try with a read only directory
mkdir readonly
chmod -w readonly
cd readonly
cockroach start-single-node --insecure --store=type=mem,size=0.25 --advertise-addr=localhost --background
In another terminal, issue ls -ltra readonly
➜ readonly ls -ltra
total 0
dr-xr-xr-x 2 artem staff 64 Mar 7 15:26 .
drwx------@ 464 artem staff 14848 Mar 7 15:28 ..
Comments