Showing posts from 2020

Scaling CockroachDB to 200k writes per second

Performance characteristics working with JSONB in CockroachDB

Working with multi-level JSON in CockroachDB

CockroachDB with MIT Kerberos using a native client

Introducing: CockroachDB Kubernetes Operator on the Red Hat OpenShift Platform

CockroachDB performance characteristics with YCSB(A) benchmark

This month I was on a hiatus from blogging and had to focus on other parts of my life. Nevertheless, this post was a long time coming, and I was eagerly waiting for a quiet moment to get it out the door. Today, I'm talking about CockroachDB and the ever-popular YCSB benchmark suite. Specifically, I'm going to step through workload A in the YCSB suite against CockroachDB.

CockroachDB with SQLAlchemy and MIT Kerberos

Today, I'm going to demonstrate how easily we can integrate a SQLAlchemy application with a kerberized CockroachDB cluster. The experience of building this application was almost too easy.

CockroachDB with Django and MIT Kerberos

Today, I'm going to talk about using Django with a kerberized CockroachDB and what that entails. This setup is not uncommon in production, and enterprise-grade access from development frameworks is table stakes for some of our customers.

Three-headed dog meet cockroach, part 6: CockroachDB, MIT Kerberos, HAProxy and Docker Compose

Today, I'm going to try to simplify our architecture, or at least the management of Kerberos artifacts as they relate to CockroachDB, by introducing a load balancer. Given the presence of a load balancer, we can hide the CockroachDB cluster architecture from Kerberos and ease the management of Kerberos keytabs as well as service principal names.

Three-headed dog meet cockroach, part 5: Executing CockroachDB table import via GSSAPI

Today, we're going to look at an actual question I was asked about Kerberos.

Three-headed dog meet cockroach, part 4: CockroachDB with Kerberos and a custom Service Principal Name

This is the fourth installment of my CockroachDB and Kerberos coverage. Today, I'm going to demonstrate a way to override the default service principal name.

Three-headed dog meet cockroach, part 3: CockroachDB, MIT Kerberos in a container!

This is the third in the series of articles on CockroachDB and GSSAPI. Today, we're going to automate everything with Docker, and docker-compose specifically, to stand up a quick and repeatable environment for troubleshooting CockroachDB and Kerberos. Most of the articles I write are based on specific corner cases my customers stumble on. Today, I'm going to talk about a basic environment I would use to debug a CockroachDB and Kerberos integration. In the following posts, I will cover actual corner cases I've seen to date.

Exploring CockroachDB with Flyway Schema Migration tool

Back in May, we announced support for Flyway, a popular schema migration tool. Prior to joining Cockroach Labs, I was unfamiliar with the schema migration concept, and this was a good opportunity to dip my toes in. Today, I am going to quickly introduce you to Flyway and some of the new capabilities in CockroachDB 20.1 that leverage schema migrations. This is by no means a deep dive on Flyway. For that, I highly recommend you get familiar with Flyway's documentation. It's also a good opportunity to review our 20.1 release announcement for all the goodness in our current release. With that, let's dive in.

Three-headed dog meet cockroach, part 2: CockroachDB with Active Directory

Today, I am going to discuss CockroachDB integration with Active Directory. AD is Microsoft's commercial directory service, built on Kerberos for authentication. AD is a de facto authentication standard across large enterprises, and our customers expect products calling themselves enterprise to work seamlessly with Active Directory. Hence, here's my write-up with end-to-end steps to deploy a lab environment to try on your own.

Three-headed dog meet cockroach: CockroachDB with MIT Kerberos

CockroachDB is a cloud-native distributed database that works across various cloud, hybrid and on-premise environments. The flexibility of deployments demands varying degrees of security protocols. Most of the time, on-premise customers won't accept anything less than Kerberos for their system of record authentication mechanisms. In my Hadoop time, that was the bare minimum requirement to play. CockroachDB today supports Kerberos via GSSAPI for authentication. In this post, I'm going to walk you through setting up Kerberos for CockroachDB and provide a sort of cheat sheet to make this process more seamless. I'm using a single CentOS VM provisioned with Vagrant. It serves as my KDC as well as my CockroachDB instance. On to the setup. The following documents may assist in the entire process: CockroachDB GSSAPI, how to install CockroachDB and configuring CockroachDB for secure access. I recorded the entire process with Asciinema and split the screencast into two parts. P

Secure CockroachDB with Custom Common Name

Out of the box, CockroachDB can generate certificates with the cockroach cert command. This command provisions certs for clients and nodes. One common gap our customers raise is the explicit reliance on CN=node and CN=root. In our latest development release, we're introducing the ability to map root and node principals to custom CNs. The process bypasses the cockroach cert command in favor of the openssl utility. It is very well documented and I recorded a live walk-through of the entire process. I am including my openssl configuration files for convenience:
ca.cnf
# OpenSSL CA configuration file
[ ca ]
default_ca = CA_default
[ CA_default ]
default_days = 365
database = index.txt
serial = serial.txt
default_md = sha256
copy_extensions = copy
unique_subject = no
# Used to create the CA certificate.
[ req ]
prompt=no
distinguished_name = distinguished_name
x509_extensions = extensions
[ distinguished_name ]
organizationName = Example Inc
commonName = Exampl

What is insecure may never break: CockroachDB insecure cluster take over

I came across an interesting scenario last week. A customer had asked whether it is possible to secure a previously insecure cluster. The short answer is yes. Now, Cockroach Labs does not recommend running an insecure cluster in production. Only a few additional steps are necessary to secure an instance, so why run insecure at all? Convenience, you say. It can hurt you down the line, but fret not, this article will demonstrate how to fix it. We are going to follow the standard insecure cluster startup procedure. Once complete, we're going to flip to the documentation for a secure cluster and bring each node back up with security enabled. Here's a handy video of the procedure in action: I also included the step-by-step instructions below:

CockroachDB statement redirection from an external file

Exploring CockroachDB with ipython-sql aka sqlmagic and Jupyter Notebook

Today, I will demonstrate how ipython-sql can be leveraged to query CockroachDB. This will require a secure instance of CockroachDB, for reasons I will explain below. Running a secure docker-compose instance of CRDB is beyond the scope of this tutorial. Instead, I will publish everything you need to get through the tutorial in my repo, including the Jupyter Notebook. You may also use the CRDB docs to stand up a secure instance and change the URL in the notebook to follow along.

Exploring CockroachDB with Jupyter Notebook and Microsoft PowerShell

Today, we're going to venture out into the world of .NET through a scripting language out of Microsoft called PowerShell. My familiarity with .NET is quite minimal, but I do have an extensive background in PowerShell scripting, albeit going years back. Pardon me for being a bit rusty. I always loved PowerShell when I was working on the Microsoft platform. It allows for an interactive and object-oriented approach to working with databases. Scripting admin tasks for DBAs on Windows was always a challenge for me until PowerShell came into the picture. I had to maintain many database servers and PowerShell became my best friend. Today, I will show you how PowerShell can become your best friend when working with CockroachDB! Note: The title is a bit misleading, as you will see this tutorial is more about exploring PowerShell from the console rather than Jupyter Notebook, but I do make my best effort to emphasize what does and does not work today in Jupyter when it comes

Exploring CockroachDB with Jupyter Notebook and R

Today, we're going to explore CockroachDB from the Data Science perspective again. We will continue to use Jupyter Notebook, but instead of Python, we're going to use the R language. I was inspired to write this post by an article written by my colleague. I will build on that article by introducing Jupyter Notebook into the mix.

Exploring CockroachDB with Jupyter Notebook and Python

Today, we're going to explore CockroachDB from the Data Science perspective, using a popular exploratory web tool called Jupyter Notebook. I was inspired to write this post by this article. The article goes over using Jupyter with Oracle, MySQL and PostgreSQL, and we're going to do the same with Cockroach! One caveat is that article's heavy reliance on the ipython-sql library. We're going to use the Pandas library instead, as the ipython-sql magic functions are not compatible with Cockroach today. Hopefully you will find it useful.

Import Hadoop HDFS data into CockroachDB

Today we're going to take a slight detour from docker compose and evaluate ingestion of data from Hadoop into Cockroach. One word of caution: this is being tested on an unsecured cluster with a very small volume of data. Always test your own setup before taking public articles at face value! CockroachDB can natively import data from HTTP endpoints, object storage with respective APIs and local/NFS mounts. The full list of supported schemes can be found here. It does not support the HDFS file scheme, and we're left to our wild imagination to find alternatives. As previously discussed, the Hadoop community is working on Hadoop Ozone, a native scalable object store with S3 API compatibility. For reference, here's my article demonstrating CockroachDB and Ozone integration. The limitation here is that you need to run Hadoop 3 to get access to it. What if you're on Hadoop 2? There are several choices I can think of off the top of my head. One approach is to

CockroachDB CDC using Hadoop Ozone S3 Gateway as cloud storage sink, Part 4

Today, we're going to evaluate the Hadoop Ozone object store as a viable cloud storage sink for CockroachDB. A bit of caution: this article only explores the art of the possible, so please use the ideas in it at your own risk! Firstly, Hadoop Ozone is a new object store the Hadoop community is working on. It exposes an S3 API backed by HDFS and can scale to billions of files on-prem! This article only scratches the surface. For everything there is to learn about Hadoop and Ozone, navigate to their respective websites.

CockroachDB CDC using Minio as cloud storage sink, Part 3

Today, we’re going to explore the CDC capability in CockroachDB Enterprise Edition using the Minio object store as a sink. To achieve this, we’re going to reuse the compose file from the first two tutorials and finally bring this to a close. Without further ado