Which distributed SQL databases put data from different tables with the same tenant in the same node?

619 Views Asked by At

I’m designing a SQL data schema with many tables that have a compound primary key of (customer_id, id). The application will frequently need to run JOIN queries to assemble data. However, it should never run cross customer joins. (Strictly disallowing cross customer joins could be a useful security feature, but business requirements might change.)

I’m looking at distributed SQL databases. Ideally I’d like to know that data across tables for the same customer_id lives on the same shard so joins don’t need to be executed cross-shard.

I’ve read through the documentation of Vitess, YugabyteDB, and CockroachDB. Of the three I only feel confident that Vitess will store data from the same customer on the same node.

Am I missing a feature of YugabyteDB/CockroachDB?

2

There are 2 best solutions below

0
FranckPachot On

there are two aspects in that: security and performance. For security (tenant isolation) you don't need to dedicate nodes if you have Row Level Security. Here is an example on YugabyteDB (distributed SQL database): https://dev.to/yugabyte/hyper-scale-multi-tenant-for-saas-an-example-with-pgbench-33a3 For performance (avoid cross-node transactions) you can use declarative partitioning on top of it. The partitions, like created in the blog post, can be tagged to specific nodes through cloud/region/zone topology

0
Rima On

CockroachDB does not support collocated tables as requested here. This was previously supported in the form of interleaved tables but was deprecated recently as the implementation was hard to maintain and did not yield significant performance wins as expected. The performance aspects are discussed at length on this GH thread https://github.com/cockroachdb/cockroach/issues/52009 if you are interested in learning more.