I’m designing a SQL data schema with many tables that have a compound primary key of (customer_id, id). The application will frequently need to run JOIN queries to assemble data. However, it should never run cross customer joins. (Strictly disallowing cross customer joins could be a useful security feature, but business requirements might change.)
I’m looking at distributed SQL databases. Ideally I’d like to know that data across tables for the same customer_id lives on the same shard so joins don’t need to be executed cross-shard.
I’ve read through the documentation of Vitess, YugabyteDB, and CockroachDB. Of the three I only feel confident that Vitess will store data from the same customer on the same node.
Am I missing a feature of YugabyteDB/CockroachDB?
there are two aspects in that: security and performance. For security (tenant isolation) you don't need to dedicate nodes if you have Row Level Security. Here is an example on YugabyteDB (distributed SQL database): https://dev.to/yugabyte/hyper-scale-multi-tenant-for-saas-an-example-with-pgbench-33a3 For performance (avoid cross-node transactions) you can use declarative partitioning on top of it. The partitions, like created in the blog post, can be tagged to specific nodes through cloud/region/zone topology