Foreign keys may be the best approach for this problem. However, I'm trying to learn about table locking/transactions, and so I'm hoping that we can ignore them for the moment.
Let's pretend that I have two tables in an InnoDB database: categories and jokes; and that I'm using PHP/MySQLi to do the work. The tables look like so:
CATEGORIES
id (int, primary, auto_inc) | category_name (varchar[64])
============================================================
1 knock, knock
JOKES
id (int, primary, auto_inc) | category_id (int) | joke_text (varchar[255])
=============================================================================
empty
Here are two functions, each of which is being called by a different connection, at the same time. The calls are: delete_category(1) and add_joke(1,"Interrupting cow. Interrup-MOOOOOOOO!")
function delete_category($category_id) {
// only delete the category if there are no jokes in it
$query = "SELECT id FROM jokes WHERE category_id = '$category_id'";
$result = $conn->query($query);
if ( !$result->num_rows ) {
$query = "DELETE FROM categories WHERE id = '$category_id'";
$result = $conn->query($query);
if ( $conn->affected_rows ) {
return true;
}
}
return false;
}
function add_joke($category_id,$joke_text) {
$new_id = -1;
// only add the joke if the category exists
$query = "SELECT id FROM categories WHERE id = '$category_id'";
$result = $conn->query($query);
if ( $result->num_rows ) {
$query = "INSERT INTO jokes (joke_text) VALUES ('$joke_text')";
$result = $conn->query($query);
if ( $conn->affected_rows ) {
$new_id = $conn->insert_id;
return $new_id;
}
}
return $new_id;
}
Now, if the SELECT statements from both functions execute at the same time, and proceed from there, delete_category will think it's okay to delete the category, and add_joke will think it's okay to add the joke to the existing category, so I'll get an empty categories table and an entry in the joke table that references a non-existent category_id.
Without using foreign keys, how would you solve this problem?
My best thought so far would be to do the following:
1) "LOCK TABLES categories WRITE, jokes WRITE" at the start of delete_category. However, since I'm using InnoDB, I'm quite keen to avoid locking entire tables (especially main ones that will be used often).
2) Making add_joke a transaction and then doing "SELECT id FROM categories WHERE id = '$category_id'" after inserting the record as well. If it doesn't exist at that point, rollback the transaction. However, since the two SELECT statements in add_joke might return different results, I believe I need to look into transaction isolation levels, which I'm not familiar with.
It seems to me that if I did both of those things, it should work as expected. Nevertheless, I'm keen to hear more informed opinions. Thanks.
You can DELETE a category only if is no matching joke:
If there are any jokes for the category, the join will find them, and therefore the outer join will return a non-null result. The condition in the WHERE clause eliminates non-null results, so the overall delete will match zero rows.
Likewise, you can INSERT a joke to a category only if the category exists:
If there is no such category, the SELECT returns zero rows, and the INSERT is a no-op.
Both of these cases create a shared lock (S-lock) on the categories table.
Demonstration of an S-lock:
In one session I run:
In second session I run:
You can see that this creates an IS-lock on the table foo, and an S-lock on one row of foo, the table I'm reading from.
The same thing happens for any hybrid read/write operations such as
SELECT...FOR UPDATE,INSERT...SELECT,CREATE TABLE...SELECT, to block the rows being read from being modified while they are needed as a source for the write operation.The IS-lock is a table-level lock that prevents DDL operations on the table, so no one issues
DROP TABLEorALTER TABLEwhile this transaction is depending on some content in the table.