The design of SQL and relational database systems makes it easy to accidentally introduce serious concurrency bugs. Below is a textbook money-transfer procedure in TSQL; Alice wants to send ten dollars to Bob, and to keep Alice from overdrafting her account, we first check that she has enough money. The code looks completely reasonable, but it has several critical bugs. Can you spot them?
DECLARE @balance INT;
SET @balance = (
SELECT balance
FROM accounts
WHERE owner = 'alice'
);
IF @balance >= 10
BEGIN
UPDATE accounts
SET balance = balance - 10
WHERE owner = 'alice';
UPDATE accounts
SET balance = balance + 10
WHERE owner = 'bob';
END
Atomicity
First, if this procedure aborts midway through, we might transfer money from Alice’s account without transferring any to Bob. Alice won’t be happy about that, and we’ve destroyed money in the process. We want all of the transfers to succeed, or none of them; the fix is to wrap the procedure in a transaction:
BEGIN TRANSACTION;
DECLARE @balance INT;
SET @balance = (
SELECT balance
FROM accounts
WHERE owner = 'alice'
);
IF @balance >= 10
BEGIN
UPDATE accounts
SET balance = balance - 10
WHERE owner = 'alice';
UPDATE accounts
SET balance = balance + 10
WHERE owner = 'bob';
END
COMMIT TRANSACTION;
TOCTOU
Are we done yet? Not quite. Suppose Alice fires off two transfers to Bob in parallel, T1 and T2. Let’s map out what happens:
- T1: Check Alice’s account balance
- T2: Check Alice’s account balance
- T1: Withdraw 10 from Alice’s account
- T2: Withdraw 10 from Alice’s account
- T1: Deposit 10 in Bob’s account
- T2: Deposit 10 in Bob’s account
Note how T2 checks the balance before T1 has withdrawn any money from Alice’s account—so when T2 finally withdraws, the account might become overdrafted. This is a Time-of-check to time-of-use (TOCTOU) bug: The precondition changes between when we check it and when we act on it.
The fix is to lock Alice’s account until the transaction completes. We can change the isolation level so locks are acquired automatically, or lock the account row by hand:
BEGIN TRANSACTION;
DECLARE @balance INT;
SET @balance = (
SELECT balance
-- This is roughly equivalent
-- to SELECT FOR UPDATE
FROM accounts WITH (UPDLOCK)
WHERE owner = 'alice'
);
IF @balance >= 10
BEGIN
UPDATE accounts
SET balance = balance - 10
WHERE owner = 'alice';
UPDATE accounts
SET balance = balance + 10
WHERE owner = 'bob';
END
COMMIT TRANSACTION;
The UPDLOCK hint takes a row-level lock on Alice’s account when the SELECT
runs; other transactions that want to modify Alice’s account will block until
the lock is released.
Deadlocks
What if Alice and Bob both try to transfer money to each other at the same time? Let’s map out the transactions again:
- T1: Acquire a lock on Alice’s account
- T2: Acquire a lock on Bob’s account
- T1: Check Alice’s account balance
- T2: Check Bob’s account balance
- T1: Withdraw 10 from Alice’s account
- T2: Withdraw 10 from Bob’s account
- T1: Can’t update Bob’s account because it’s locked by T2
- T2: Can’t update Alice’s account because it’s locked by T1
T1 waits for T2’s lock on Bob; T2 waits for T1’s lock on Alice—we’re stuck in a deadlock. The fix is to acquire all locks upfront:
BEGIN TRANSACTION;
DECLARE @balance INT;
SELECT owner
FROM accounts WITH (UPDLOCK)
WHERE owner IN ('alice', 'bob');
SET @balance = (
SELECT balance
FROM accounts
WHERE owner = 'alice'
);
IF @balance >= 10
BEGIN
UPDATE accounts
SET balance = balance - 10
WHERE owner = 'alice';
UPDATE accounts
SET balance = balance + 10
WHERE owner = 'bob';
END
COMMIT TRANSACTION;
Conclusion
We’ve fixed the concurrency bugs in the original code, but in the process it grew about 50%, and became harder to read. Sure, you could argue that there are other, more idiomatic ways to fix this code, but the point still stands: A SQL program that looks completely reasonable can be riddled with serious bugs.
If you’re building a social media site, it might not be the end of the world if a user likes a post twice, but if a system fails to record that a patient received a dose of medicine, it might have fatal consequences. For systems where correctness matters, we need better tools.
Proposed solution
I want an alternative to SQL that adopts Rust’s approach of fearless concurrency—that is, make the correct behavior the default, and provide “unsafe” escape hatches if necessary. Some concrete suggestions:
- Make transactions atomic by default; if the user wants to save an intermediate “checkpoint” state they would have to say so explicitly.
- Let the user manage locks themselves, and make sure the correct locks are acquired before mutating a database object.
- Use static analysis to detect potential deadlocks; this is a tricky problem and a subject of ongoing research. Deterministic database systems could be one possible solution.
This system will come with other trade-offs; for example, it might end up with lower throughput than modern SQL systems. But that’s fine—we still have SQL for use cases where correctness is less important.