Lock
and Mutex
are very basic ideas in every language and OS.
Recently, I think for a while about Lock
/ Mutex
/ SpinLock
/ Reentrant Lock
/ Distributed Lock
, and find something interesting.
1. Mutex
Mutex
(Lock
) is easy, it locks() a critical code block and then will mutual exclusively do something, and after that will unlock() the critical code block, and everything goes on.
It keeps the critical code block
running in the order as your wrote, but not the using the re-ordered sequence by Memory Ordering.
So what Memory_ordering
will break then? Let’s call whatever it could break as invariant here right now.
1 | l := sync.Mutex{} |
By holding the lock, means I need some invariant
(s) hold, and probably I will break
these invariant
(s) in the next few lines of critical code block
.
But don’t worry, after Unlock()
invocation, that means:
- I longer need these
invariant
(s) anymore, and - If I broke them in the critical code block, I have already restored these
invariant
(s)
So, Mutex
is a tool to help you to confidently
keep the invariant
as what it should be after critical code block, by preventing the memory re-order
, either by Compiler or by OS at the runtime.
2. SpinLock
2.1. Compare and Swap
Compare and Swap, or CAS
, is an easy way to implement synchronization mechanism. Here is the logic:1
2
3
4
5
6
7
8// needs atomic implementation
function cas(p : pointer to int, old : int, new : int) returns bool {
if *p ≠ old {
return false
}
*p ← new
return true
}
So it will compare if the current value is the old value:
- if not, means,
replaced by others
, i.e.,the lock is held by others
, will just return false - if yes, means,
not replaced by others yet
, i.e.,hold the lock
, will just replace using the new value
Of course, the most import is the above cas
function needs to be executed atomically
, either using lock or implemented by hardware/CPU CompareAndSwap itself.
2.1.1. CAS and Lock-Free Data Structure
With CAS feature, we can implement some lock-free thread-safe data structure. Here is the Enqueue()
method for a lock-free queue.1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27Enqueue(x)
q ← new record
q^.value ← x
q^.next ← NULL
p ← tail
oldp ← p
repeat
while p^.next ≠ NULL
p ← p^.next
// iterate until p^.next == NULL, which mens,
// p is the very last one element, and swap, p.next = q
until Compare&Swap(p^.next, NULL, q)
Compare&Swap(tail, oldp, q)
end
Dequeue()
repeat
p ← head
if p^.next = NULL
error queue empty
// iterate until head == p, which means,
// p is the very head of this queue, and swap, head point to p.next
until Compare&Swap(head, p, p^.next)
return p^.next^.value
end
- No lock used above, but the Queue data structure is thread safe
Enqueue
keeps checking the invariantp.next == NULL
, ortail
not changedDequeue
keeps checking the invarianthead == p
, orhead
not changed
So, again, essentially point out that, concurrency model needs to protect invariant
keep as it should be.
In a summary, it use a retry logic in loop until the invariant
is satisfied, and once the invariant
is satisfied, it immediately change the value to a new one, under the condition that, the CAS
step is atomic!
So, if you want to implement a lock free data structure, put the check-and-set
procedure using a Loop-CAS implementation.
2.1.2. CAS and Optimistic Lock Update
With CAS feature, we can also implement optimistic lock.
For example, using SQL1
Update colume_A = value, version = previous_value+1 where version = previous_value;
to implement a Compare and Swap
semantics. Plus a loop, we are good to implement the Loop-CAS
, i.e, lock-free Update.
Here is an example of Laravel
, updated_at
will be automatically updated each time of update request to the row.1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39function transfer($fromAccountId, $toAccountId, $balance)
{
$fromQuery = Account::whereId($fromAccountId);
if (! $fromQuery->exists()) {
throw new InvalidAccountException();
}
$toQuery = Account::whereId($toAccountId);
if (! $toQuery->exists()) {
throw new InvalidAccountException();
}
// step 1, decrease money from $fromAccount
do {
// get the value of `updated_at`
$fromAccount = $fromQuery->first();
if ($fromAccount->balance < $amount) {
throw new InsufficientBalanceException();
}
// if updated_at is not changed in the middle
// will change the Account in DB
$updated = Account::whereId($fromAccountId)
->where('updated_at', '=', $fromAccount->updated_at)
->update(['balance' => $fromAccount->balance - $amount]);
} while (! $updated);
// step 2, add money into $toAccount
do {
$toAccount = $toQuery->first();
$updated = Account::whereId($toAccountId)
->where('updated_at', '=', $toAccount->updated_at)
->update(['balance' => $toAccount->balance + $amount]);
} while (! $updated);
// step 3, save transaction record
$transaction = new Transaction();
$transaction->from_account_id = $fromAccountId;
$transaction->to_account_id = $toAccountId;
$transaction->amount = $amount;
$transaction->save();
}
The good thing is: lock free, with higher performance.
But the bad thing is, or not that bad, just not as good as Pessimistic Lock, it’s kind of Eventual Consistency.
If another thread read data in the middle time of step 2, it will read an inconsistent data state between the fromAccount and the toAccount.
Because during each step1, step2, step3, it ensure the invariant only inside these steps (update_at is not changed), but not across these steps (toAccount.balance + fromAccount.balance == totalMoney
).
2.2. SpinLock Implementation using CAS
If we put more eye into the Optimistic Lock example above, we can know step 2 is blocked by step 1, which is using a Loop-CAS
model.
So what if we think step 1 as a Lock()
method, which means, only step 1 is executed, step 2 could start. Here we got the idea of SpinLock() to use CAS.1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24package spinlock
import (
"runtime"
"sync"
"sync/atomic"
)
type SpinLock uint32
func NewSpinLock() sync.Locker {
var lock SpinLock
return &lock
}
func (sl *SpinLock) Lock() {
// keep checking, until sl == 0, i.e, nobody locks it,
// and then change the sl = 1, i.e, lock it
for !atomic.CompareAndSwapUint32((*uint32)(sl), 0, 1) {
runtime.Gosched()
}
}
func (sl *SpinLock) Unlock() {
atomic.StoreUint32((*uint32)(sl), 0)
}
3. When to SpinLock? When to Mutex?
Differences?
SpinLock
: As you can see, SpinLock idea comes from Loop-CAS, which islock-free
, but kindabusy checking
.Mutex
: WhenLock()
invocation happens, however can’t get the lock, the thread/goroutine will be added to wait list of scheduler and thread/goroutine switch happens at this time.
So SpinLock
thinks that I should very probably
get the lock in my scheduling CPU slot, and in a aggressive way.
What about Mutex
scheduling? In Golang implementation, it’s:
- If Lock() happens and can’t get the lock, will be added to wait list,
- For new young goroutine who already get the lock, will just let them run, since they already get the lock,
- If the old goroutine has been wait for very long time, will give the lock to him and let him run, so it won’t wait for too long. By default,
starvationThresholdNs
is1e6
nanoseconds (1ms
). Normally it’s FIFO order, but ifstarvationThresholdNs
meet, will enter starvation mode.
In a summary:
- Under
multi-core
CPU, if you think that you will quickly get the lock, then please useSpinLock
, so that it doesn’t need context switch to get the lock and execute the code.- the time to get the lock is short (
short task or low concurrency
) - the critical code block running time cost won’t be too much, i.e., can quickly do the shared variables computation and release the lock
- the concurrency is not too much, or it will needs lots of context switches to release the lock
- the time to get the lock is short (
- Under
single-core
CPU, never useSpinLock
, since it will try to use up the CPU slot, until context switch happens and then back to get the lock.Context Switch
always happens in 1 Core CPU. But the aboveSpinLock
code implementation is good, since it always yield the CPU to the other goroutines by usingruntime.Gosched()
. So theSpinLock
implementation here is a bit like aMutex
butlock-free
. - Under
multi-core
CPU, if you think your critical code block can’t complete in the assigned CPU slot, please try to useMutex
, so that it won’t waste up much time on getting the lock.- the time to get the lock is long (
long task or high concurrency
) - the critical code block running time cost is little large, i.e, keeps computation on top of shared variables for a long time. So it’s important not to put
I/O
into critical code block.
- the time to get the lock is long (
To sum up:
Single-core
: not useSpinLock
, or at least yield the processor every time failedCAS
.Multi-core
,short task or low concurrency
: useSpinLock
.Multi-core
,long task or high concurrency
: useMutex
.Don't use I/O in critical code block
, this should be a common knowledge.
If high concurrency
with heavy I/O
, then better to use goroutine
+ channel
solution.
4. Reentrant Lock or Recursive Lock, why not?
TBD
5. Distributed Lock
TBD
Ref: