I have table categories with columns category_id and name. I want to generate new table products with random data. Table products has columns product_id, category_id, name, price. I want to give to each product random category_id.
My generate script for table products:
INSERT INTO products (category_id, name, price)
SELECT
(SELECT category_id FROM categories OFFSET floor(random()*50) LIMIT 1),
('Product ' || i),
round(cast(random() * 999 + 1 as numeric), 2)
FROM generate_series(1,100) as seq(i)
This script generate 100 rows of products, but each product has the same category_id. What I am doing wrong?
The "problem" is that Postgres is too aggressive about optimization. It is able to recognize that the subquery could be called just once -- somehow ignoring the fact that
random()is not a deterministic function.One method to fix this involves a correlated subquery: