Import a table from SQL Server into CockroachDB
a learning exercise only and not meant for production deployment. I welcome any feedback to
improve the process further. The fastest way to get started with SQL Server is via available
Docker containers. I’m using the following tutorial to deploy SQL Server on Ubuntu from my Mac.
sample database into my Docker container. You may also need SQL Server tools installed on your
host and you may find direction for Mac OS and Linux at the following site, users of Windows are quite
familiar with download location for their OS. I also used the following directions to install SQL Server
tools on my Mac but ran into compatibility issues with the drivers in my Docker container. This will be a
debug session for another day.
pull the SQL Server image
docker pull mcr.microsoft.com/mssql/server:2019-GA-ubuntu-16.04
execute the container
docker run -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=Cockroach!1' \
-p 1433:1433 --name sql1 \
-d mcr.microsoft.com/mssql/server:2019-GA-ubuntu-16.04
change password
docker exec -it sql1 /opt/mssql-tools/bin/sqlcmd \
-S localhost -U SA -P 'Cockroach!1' \
-Q 'ALTER LOGIN SA WITH PASSWORD="CockroachDB1!"'
create a backup directory
docker exec -it sql1 mkdir /var/opt/mssql/backup
download WideWorldImporters database
curl -L -o wwi.bak 'https://github.com/Microsoft/sql-server-samples/releases/download/wide-world-importers-v1.0/WideWorldImporters-Full.bak'
copy the backup file into container
docker cp wwi.bak sql1:/var/opt/mssql/backup
list the logical names and paths for the backup
docker exec -it sql1 /opt/mssql-tools/bin/sqlcmd -S localhost \
-U SA -P 'CockroachDB1!' \
-Q 'RESTORE FILELISTONLY FROM DISK = "/var/opt/mssql/backup/wwi.bak"' \
| tr -s ' ' | cut -d ' ' -f 1-2
execute the restore command
docker exec -it sql1 /opt/mssql-tools/bin/sqlcmd \
-S localhost -U SA -P 'CockroachDB1!' \
-Q 'RESTORE DATABASE WideWorldImporters FROM DISK = "/var/opt/mssql/backup/wwi.bak" WITH MOVE "WWI_Primary" TO "/var/opt/mssql/data/WideWorldImporters.mdf", MOVE "WWI_UserData" TO "/var/opt/mssql/data/WideWorldImporters_userdata.ndf", MOVE "WWI_Log" TO "/var/opt/mssql/data/WideWorldImporters.ldf", MOVE "WWI_InMemory_Data_1" TO "/var/opt/mssql/data/WideWorldImporters_InMemory_Data_1"'
verify the restored database
docker exec -it sql1 /opt/mssql-tools/bin/sqlcmd \
-S localhost -U SA -P 'CockroachDB1!' \
-Q 'SELECT Name FROM sys.Databases'
mssql-cli
is a CLI utility used to connect to SQL Server. It is currently under heavy development
and offers a more pleasant experience over sqlcmd
, in my opinion. In the typical
new Microsoft fashion, the github page for the project can be found here.
It is also easier to install than sqlcmd
on my Mac.
install mssql-cli tool on your host
sudo pip install mssql-cli
in my case the install fails due to package six
being available on my system,
so the workaround is
sudo pip install mssql-cli --ignore-installed six
connect to your SQL Server instance directly
mssql-cli -U SA -P 'CockroachDB1!' -S localhost,1433 -d WideWorldImporters
run a sample query
SELECT StockItemID, StockItemName FROM WideWorldImporters.Warehouse.StockItems WHERE StockItemID>=1
redirect output of a query to a file we're going to use for import into CockroachDB
mssql-cli -U SA -P 'CockroachDB1!' -S localhost,1433 -d WideWorldImporters -Q "SELECT * FROM WideWorldImporters.Warehouse.StockItemTransactions" -o StockItemTransactions.csv
bcp
is a bulk copy utility that is better served for exporting data out of SQL Server
You can try installing sqlcmd
and mssqlodbc
locally or connect to the container
and use bcp
that comes bundled with SQL Server.
In my experience, setting up bcp
with brew
on OSX was a challenge, and
I opted for executing it inside a container instead.
docker exec -it sql1 bash
/opt/mssql-tools/bin/bcp WideWorldImporters.Warehouse.StockItemTransactions out /var/opt/mssql/backup/StockItemTransactionsBCP.csv -S localhost,1433 -U sa -P 'CockroachDB1!' -c -t',' -r'\n'
copy the file down to your host
docker cp sql1:/var/opt/mssql/backup/StockItemTransactionsBCP.csv .
describe StockItemTransactions
table
an equivalent of describe tablename
in SQL Server is a stored procedure below
exec sp_columns StockItemTransactions
The output is lengthy so it's best to output it to a file
mssql-cli -U SA -P 'CockroachDB1!' -S localhost,1433 -d WideWorldImporters -Q "exec sp_columns StockItemTransactions" -o schema.sql
+--------------------+---------------+-----------------------+-------------------------+-------------+-------------+-------------+----------+---------+---------+------------+-----------+----------------------------------------------+-----------------+--------------------+---------------------+--------------------+---------------+----------------+
| TABLE_QUALIFIER | TABLE_OWNER | TABLE_NAME | COLUMN_NAME | DATA_TYPE | TYPE_NAME | PRECISION | LENGTH | SCALE | RADIX | NULLABLE | REMARKS | COLUMN_DEF | SQL_DATA_TYPE | SQL_DATETIME_SUB | CHAR_OCTET_LENGTH | ORDINAL_POSITION | IS_NULLABLE | SS_DATA_TYPE |
|--------------------+---------------+-----------------------+-------------------------+-------------+-------------+-------------+----------+---------+---------+------------+-----------+----------------------------------------------+-----------------+--------------------+---------------------+--------------------+---------------+----------------|
| WideWorldImporters | Warehouse | StockItemTransactions | StockItemTransactionID | 4 | int | 10 | 4 | 0 | 10 | 0 | NULL | (NEXT VALUE FOR [Sequences].[TransactionID]) | 4 | NULL | NULL | 1 | NO | 56 |
| WideWorldImporters | Warehouse | StockItemTransactions | StockItemID | 4 | int | 10 | 4 | 0 | 10 | 0 | NULL | NULL | 4 | NULL | NULL | 2 | NO | 56 |
| WideWorldImporters | Warehouse | StockItemTransactions | TransactionTypeID | 4 | int | 10 | 4 | 0 | 10 | 0 | NULL | NULL | 4 | NULL | NULL | 3 | NO | 56 |
| WideWorldImporters | Warehouse | StockItemTransactions | CustomerID | 4 | int | 10 | 4 | 0 | 10 | 1 | NULL | NULL | 4 | NULL | NULL | 4 | YES | 38 |
| WideWorldImporters | Warehouse | StockItemTransactions | InvoiceID | 4 | int | 10 | 4 | 0 | 10 | 1 | NULL | NULL | 4 | NULL | NULL | 5 | YES | 38 |
| WideWorldImporters | Warehouse | StockItemTransactions | SupplierID | 4 | int | 10 | 4 | 0 | 10 | 1 | NULL | NULL | 4 | NULL | NULL | 6 | YES | 38 |
| WideWorldImporters | Warehouse | StockItemTransactions | PurchaseOrderID | 4 | int | 10 | 4 | 0 | 10 | 1 | NULL | NULL | 4 | NULL | NULL | 7 | YES | 38 |
| WideWorldImporters | Warehouse | StockItemTransactions | TransactionOccurredWhen | -9 | datetime2 | 27 | 54 | NULL | NULL | 0 | NULL | NULL | -9 | NULL | NULL | 8 | NO | 0 |
| WideWorldImporters | Warehouse | StockItemTransactions | Quantity | 3 | decimal | 18 | 20 | 3 | 10 | 0 | NULL | NULL | 3 | NULL | NULL | 9 | NO | 55 |
| WideWorldImporters | Warehouse | StockItemTransactions | LastEditedBy | 4 | int | 10 | 4 | 0 | 10 | 0 | NULL | NULL | 4 | NULL | NULL | 10 | NO | 56 |
| WideWorldImporters | Warehouse | StockItemTransactions | LastEditedWhen | -9 | datetime2 | 27 | 54 | NULL | NULL | 0 | NULL | (sysdatetime()) | -9 | NULL | NULL | 11 | NO | 0 |
+--------------------+---------------+-----------------------+-------------------------+-------------+-------------+-------------+----------+---------+---------+------------+-----------+----------------------------------------------+-----------------+--------------------+---------------------+--------------------+---------------+----------------+
(11 rows affected)
start a single node instance of Cockroach or use your own environment
cockroach start-single-node --insecure --host=localhost --port=26257 --background --external-io-dir $PWD
Note: Cockroach doesn't support schemas currently so we're going to place the table in default schema.
select * from stockitemtransactions where supplierid is not null;
connect to your instance of CockroachDB SQL Shell
cockroach sql --insecure
import the dataset
root@:26257/defaultdb> CREATE DATABASE IF NOT EXISTS WideWorldImporters;
USE WideWorldImporters;
DROP TABLE IF EXISTS StockItemTransactions;
IMPORT TABLE StockItemTransactions (
StockItemTransactionID INT8 NOT NULL,
StockItemID INT8 NOT NULL,
TransactionTypeID INT8 NOT NULL,
CustomerID STRING NULL, --INT8
InvoiceID STRING NULL, --INT8
SupplierID STRING NULL, --INT8
PurchaseOrderID STRING NULL, --INT8
TransactionOccurredWhen TIMESTAMP NOT NULL,
Quantity DECIMAL NOT NULL,
LastEditedBy INT8 NOT NULL,
LastEditedWhen TIMESTAMP NOT NULL
)
CSV DATA ('nodelocal:///StockItemTransactionsBCP.csv');
CREATE DATABASE
Time: 572µs
SET
Time: 196µs
DROP TABLE
Time: 76.272ms
job_id | status | fraction_completed | rows | index_entries | system_records | bytes
+--------------------+-----------+--------------------+--------+---------------+----------------+----------+
509222924675055617 | succeeded | 1 | 236667 | 0 | 0 | 15048500
(1 row)
Time: 443.022ms
One thing you should know is that some features are still maturing in Cockroach and
as I highlighted in my tutorial, schemas are not available and data type conversions need more work.
I opted using STRING
data type for CustomerID
, InvoiceID
, SupplierID
and PurchaseOrderID
to complete the tutorial but there should be a better way.
Comments