Saturday, July 21, 2012

Thread.Yield() and Locking mechanisms

I started writing this article to explain what Thread.Yield() does. First, I wrote two spinlocks, one using an empty loop and one using Thread.Yield; it showed that Thread.Yield was actually faster, because it releases the rest of the time slice allotted by the scheduler, allowing other threads to run until the lock can be acquired.

However, I started wondering how different locking mechanisms affect our daily multi-threading work, and decided to turn it into a comparison project for future reference.

There's a lot of confusion around Thread.Yield: is it the same as Thread.Sleep(0)? Is it like Thread.SpinWait? Is it a NOP?

I wrote six spinlocks: one with a NOP, one with Thread.Yield, one with Thread.Sleep(0), one with Thread.Sleep(1), one with SpinWait, and one complex spinlock (combining the other types and imitating Microsoft's implementation). I also used Microsoft's implementation of SpinLock; they did a great job there by including Yield, SpinWait, Sleep(0) and Sleep(1): the longer the wait takes, the less frequently it tries to acquire the lock. Good job!
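To illustrate the idea of the "complex" spinlock, here is a rough sketch of an escalating waiting strategy. The class name, thresholds and spin counts are invented for the example; they are not Microsoft's actual values.

```csharp
using System;
using System.Threading;

// Hypothetical escalating spinlock; thresholds are arbitrary, for illustration only.
class BackoffSpinLock
{
    private int _lock; // 0 = free, 1 = taken

    public void Enter()
    {
        int attempts = 0;
        while (Interlocked.CompareExchange(ref _lock, 1, 0) != 0)
        {
            attempts++;
            if (attempts < 10)
                Thread.SpinWait(4 * attempts);   // cheapest wait: burn a few cycles
            else if (attempts < 20)
                Thread.Yield();                  // give up the rest of the time slice
            else if (attempts % 5 != 0)
                Thread.Sleep(0);                 // yield to ready threads of equal priority
            else
                Thread.Sleep(1);                 // back off for at least a full timeslice
        }
    }

    public void Exit()
    {
        Interlocked.Exchange(ref _lock, 0);
    }
}
```

The point is simply that the waits get cheaper to more expensive as the acquisition drags on, so a heavily contended lock stops hammering the CPU.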

For the sake of comparison, I've also implemented locks based on Monitor (Enter/Exit), Mutex (WaitOne/ReleaseMutex), Semaphore (WaitOne/Release) and SemaphoreSlim (Wait/Release).

Take a guess which one is faster; we'll see the results later.

So what are the waiting methods, and how do they work?

SpinWait is the simplest one. Basically, it's a loop (hence the iterations parameter) that occupies the CPU (or, to be exact, the core it's running on); if you do a SpinWait in a loop, you'll see it using the maximum CPU it can. I don't think there are any uses for it other than locking mechanisms, and Microsoft's documentation says the same.


// _lock is an int field: 0 = free, 1 = taken
int i = 1;
while (Interlocked.CompareExchange(ref _lock, 1, 0) != 0)
{
    Thread.SpinWait(++i); // spin for a growing number of iterations
}

Yield tells the scheduler that the current thread is done with its allotted CPU slice and that it may execute something else on the same core.

while (Interlocked.CompareExchange(ref _lock, 1, 0) != 0)
{
    Thread.Yield();
}

Sleep(0), according to the documentation, also tells the scheduler that the current thread is done with its allotted slice, BUT when you benchmark it, it runs longer than Thread.Yield. One documented difference is that Sleep(0) only relinquishes the slice to ready threads of equal priority, while Yield switches to any thread that is ready to run on the current processor; beyond that, I'm afraid I don't have a good answer for you.

while (Interlocked.CompareExchange(ref _lock, 1, 0) != 0)
{
    Thread.Sleep(0);
}

Sleep(1) tells the scheduler to stop executing the current thread until the timeout is reached and only then mark it ready for execution, which adds a bit more time until the next scheduling cycle.

while (Interlocked.CompareExchange(ref _lock, 1, 0) != 0)
{
    Thread.Sleep(1);
}


What are the locking methods?

Monitor is a fine-grained locking mechanism that provides a lot of options if you want to wait on a lock or wake up waiting threads; it should be used if you have threads waiting and want to control how many of them to wake up (Pulse/PulseAll).


//Create
object _lock = new object();

//Enter
Monitor.Enter(_lock);

//Exit
Monitor.Exit(_lock);

Mutex is a system-wide locking mechanism, used to control flow in inter-process communication or shared memory implementations. It's a wrapper for system calls, which is slower than pure CLR implementations; if you've gotten to the point where you need Mutexes, this article is not for you. :-)


//Create
Mutex _lock = new Mutex();

//Enter
_lock.WaitOne();

//Exit
_lock.ReleaseMutex();

Semaphore is more of a resource-limiting lock. It uses the kernel for locking, so it's similarly slow to Mutex. You initialize the semaphore with how many slots you want, then ask for one and release it when you're done; when the limit is reached, it will wait until a slot is released.


//Create (1 slot initially available, 2 maximum)
Semaphore _lock = new Semaphore(1, 2);

//Enter
_lock.WaitOne();

//Exit
_lock.Release();

SemaphoreSlim is like Semaphore but it doesn't use the kernel, so it's relatively fast, though not as fast as one might want; if you need very fast semaphores, maybe you should experiment with your own implementation using SpinLock.


//Create (1 slot initially available, 2 maximum)
SemaphoreSlim _lock = new SemaphoreSlim(1, 2);

//Enter
_lock.Wait();

//Exit
_lock.Release();
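For completeness, Microsoft's SpinLock (mentioned earlier) is entered with the lockTaken pattern, which guarantees you only Exit a lock you actually hold even if an exception is thrown mid-acquire:

```csharp
using System.Threading;

SpinLock _spin = new SpinLock();

//Enter/Exit
bool lockTaken = false;
try
{
    _spin.Enter(ref lockTaken); // sets lockTaken to true only if the lock was acquired
    // ... critical section ...
}
finally
{
    if (lockTaken)
        _spin.Exit();
}
```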


I wanted to see how safe these locks are, so I added an Interlocked.Increment and Interlocked.Decrement and checked that the value never exceeds 1 inside the locked code.
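The safety check was along these lines; the counter and wrapper names here are mine, not the project's. If two threads are ever inside the locked section at once, the counter exceeds 1 and the lock is broken:

```csharp
using System;
using System.Threading;

// Hypothetical sanity-check wrapper around a critical section.
static class LockValidator
{
    private static int _insideLock;

    public static void CriticalSection(Action work)
    {
        // With a correct lock around this call, the counter can never exceed 1.
        if (Interlocked.Increment(ref _insideLock) > 1)
            throw new InvalidOperationException("Lock is broken: more than one thread inside!");
        try
        {
            work();
        }
        finally
        {
            Interlocked.Decrement(ref _insideLock);
        }
    }
}
```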

Microsoft added Thread.BeginCriticalRegion() and Thread.EndCriticalRegion() to their implementation to avoid leaving the lock in an inconsistent state, but that's beyond the scope of this test.

I think the results speak for themselves, and next time we have to select a lock for a faster application, we can make a more educated decision.

Graphs - http://uhurumkate.blogspot.com/p/locking-benchmarks.html

Project - https://github.com/drorgl/ForBlog/tree/master/LockingTests
