If you have any doubts about anything below, contact us by dropping a mail to the Kung Fu Panda.
We will get back to you very soon.
We have a variable, and multiple threads are trying to read and write it simultaneously.
We want to prevent one thread from modifying the state of the variable while another is using it.
We also want to ensure that when a thread modifies the state of an object, other threads can see the change.
How do we manage the state of such a variable?
Ways to fix synchronization issues
don't share the state variable between threads.
make the state variable immutable (once written, it cannot be changed).
apply appropriate synchronization whenever accessing the state variable (particularly in a transaction).
A Thread safe class behaves correctly when accessed from multiple threads, regardless of the scheduling and execution of the thread.
A Thread safe class does not require any additional synchronization code by the calling class.
Stateless objects are always thread-safe.
++count may look atomic, but it's not: it does not execute as a single indivisible operation, but as 'fetch, add, write'.
Use AtomicLong instead, and its method incrementAndGet.
The java.util.concurrent.atomic package contains such atomic classes, e.g. AtomicReference.
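A minimal sketch of the idea (class and field names are illustrative): replacing a racy ++count with AtomicLong.incrementAndGet(), which is a single atomic operation.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SafeCounter {
    private final AtomicLong count = new AtomicLong(0);

    public long increment() {
        return count.incrementAndGet(); // atomic read-modify-write, no lock needed
    }

    public long current() {
        return count.get();
    }
}
```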
A Race condition occurs when the correctness of a computation depends on the timing or interleaving of different threads.
Every Java object can implicitly act as a lock for the purpose of synchronization; these built-in locks are called intrinsic locks or monitor locks.
The lock is automatically acquired by a thread entering a synchronized method/block and released when the thread leaves it.
Intrinsic locks act as Mutexes (Mutual Exclusion Locks).
If a thread tries to acquire a lock it already holds, it succeeds, because locks are acquired on a per-thread rather than per-invocation basis (reentrancy).
Implemented by maintaining a count of how many times a thread has acquired and released the lock on the object.
We should use synchronization everywhere a variable shared between threads is accessed, not only where it is changed.
Also, the same lock object must be used everywhere.
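A minimal sketch (class and method names are illustrative): every access to the shared field, reads as well as writes, is guarded by the same intrinsic lock (this), and the reentrant call shows that a thread can re-acquire a lock it already holds.

```java
public class GuardedValue {
    private long value; // guarded by the intrinsic lock on 'this'

    public synchronized void set(long newValue) {  // writer synchronizes
        value = newValue;
    }

    public synchronized long get() {               // readers must synchronize too
        return value;
    }

    public synchronized long incrementAndGet() {
        value++;                                   // compound action, atomic under the lock
        return get();                              // reentrancy: we already hold the lock, no deadlock
    }
}
```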
Memory Visibility and Reordering
When a thread is using a variable, we want to prevent other threads from changing that variable, and we want to ensure that when a thread changes a variable, the change is immediately visible to other threads.
This is done using synchronization.
In the absence of synchronization, one thread may see the changes made to a variable by another thread, or may not, or may see them in a different order, or may see stale data.
In the absence of synchronization, the JVM permits the compiler to reorder operations and cache values in registers, and permits the CPU to reorder operations and cache values in processor-specific caches.
Intrinsic locking can be used to guarantee that one thread sees the changes made by another thread in a predictable manner.
volatile is a weaker form of synchronization.
If a variable is declared volatile, the compiler and JVM are told that the variable is shared, so its value should not be cached in registers and operations on it should not be reordered.
Commonly used for flags, like completion flag, interruption flags etc.
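A minimal sketch of such a flag (class and method names are illustrative): the write by one thread is guaranteed to be visible to the thread running the loop.

```java
public class Worker implements Runnable {
    private volatile boolean shutdownRequested = false;

    public void requestShutdown() {
        shutdownRequested = true;   // write is immediately visible to the running thread
    }

    @Override
    public void run() {
        while (!shutdownRequested) {   // no stale value is read from a register/cache
            doWork();
        }
    }

    private void doWork() { /* ... */ }
}
```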
Thread confinement is one of the simplest ways to achieve thread safety: the data is only accessible from one thread at a time.
ThreadLocal<T> can be thought of as holding a Map<Thread, T>, giving each thread its own copy of the value.
Used in many frameworks like Hibernate and Spring, where a static ThreadLocal holds the transactional context/connection etc.
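A minimal sketch of ThreadLocal (the class name and date pattern are illustrative): SimpleDateFormat is not thread safe, so each thread gets its own instance and no synchronization is needed.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class DateFormatter {
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    public static String format(Date date) {
        return FORMAT.get().format(date); // per-thread copy, nothing is shared between threads
    }
}
```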
Stack confinement: variables confined to the executing thread's stack, i.e. local to a method.
Local variables declared in a method are thread safe because they live on the thread's own stack, and each thread has its own stack.
Immutability is NOT equivalent to declaring all fields final, since final fields can hold references to mutable objects.
A properly constructed object can be safely published by
initializing an object reference from a static initializer.
storing a reference to it into a volatile field or AtomicReference.
storing a reference to it into a final field of a properly constructed object.
storing a reference to it into a field that is properly guarded by a lock.
Adding functionality to ThreadSafe Classes
Sometimes we may need to add functionality to existing thread-safe classes; for example, in a list returned by Collections.synchronizedList, the add() and contains() methods are individually thread safe, but their combination is not.
Suppose we need to add a putIfAbsent method to the synchronized list, which adds an object to the list only if it is not already present. There are several ways:
modify the existing class, using the same lock
not possible in some cases (like the synchronized list), because the class returned by Collections.synchronizedList is not our class; we only have its jar.
extend the class (e.g. make a PutIfAbsentSynchronizedArrayList)
This may not work with all classes, because some classes are not designed for extension (you cannot extend the class returned by Collections.synchronizedList).
This may also not work because not all classes expose enough of their state to add a new synchronized method (you have to take the lock on the same object on which the base class takes its lock, remember).
Even if it works, it is very fragile, because the locking policy is now distributed over a set of classes, and if the base class changes
its policy, e.g. starts locking on another object, the subclass will fail (and fail silently!) big time.
Use client side locking
can be used in case of Collections.synchronizedList cases.
Create a class that holds an instance of the Collections.synchronizedList wrapper, and synchronize around the compound action.
You must synchronize on the same lock held by the Collections.synchronizedList wrapper (the list object itself).
even more fragile, because the locking strategy is distributed.(same as above method)
Create a class like PutIfAbsentSyncArrayList which implements List and has the list as instance variable.
It implements all List methods, delegating most of them to the underlying list, and implements the putIfAbsent() method itself using appropriate synchronization (see the sketch below).
It can put synchronization wherever it needs it.
a decent way to implement, much better than others
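A minimal sketch of the composition approach, assuming a simplified wrapper (the class name PutIfAbsentList is illustrative; a full version would implement List<E> and delegate every method). The wrapper uses its own intrinsic lock for every operation, so it does not depend on the underlying list's locking policy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PutIfAbsentList<E> {
    private final List<E> list = Collections.synchronizedList(new ArrayList<>());

    public synchronized boolean putIfAbsent(E element) {
        boolean absent = !list.contains(element);   // compound action, atomic under the wrapper's lock
        if (absent) {
            list.add(element);
        }
        return absent;
    }

    public synchronized boolean add(E element) {      // all access goes through the wrapper's lock
        return list.add(element);
    }

    public synchronized boolean contains(E element) {
        return list.contains(element);
    }
}
```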
Synchronized Collection Classes are thread safe but you may need additional client side locking
ConcurrentSkipListMap is a concurrent sorted map (the concurrent replacement for a synchronized TreeMap).
ConcurrentSkipListSet is a concurrent sorted set (the concurrent replacement for a synchronized TreeSet).
A Queue holds a list of elements awaiting processing.
ConcurrentLinkedQueue is a FIFO queue
PriorityQueue is a non concurrent, priority ordered queue.
Queue operations do not block; e.g. if the queue is empty, the dequeue operation (poll) returns null.
LinkedList also implements Queue.
Queue classes were added because the list's requirement of random access did not allow efficient concurrent implementation of List
BlockingQueue extends Queue to add blocking insertion and retrieval operations.
If the queue is empty, the dequeue operation (take) blocks, while if the queue is full, the enqueue operation (put) blocks.
useful for Producer Consumer designs
ConcurrentHashMap allows arbitrarily many readers and a limited number of writers to access the map concurrently.
Higher throughput for concurrent access, at the cost of being slightly slower for single-threaded access.
Uses a lock-striping locking strategy.
Uses "weakly consistent iterators" instead of fail-fast iterators; they do not explicitly fail or throw an exception if the collection is modified during iteration.
size() and isEmpty() may not always return current value.
does not provide the ability to lock itself for exclusive access
CopyOnWriteArrayList is a concurrent replacement for a synchronized List.
It offers better concurrency and eliminates the need to lock or copy the collection during iteration.
It derives its thread safety from the fact that, as long as an effectively immutable object is properly published, no further synchronization is required.
It implements mutability by creating and republishing a new copy of the collection whenever it is modified; iterators retain a reference to the backing array as it was at the start of iteration.
there is a cost involved because the list is copied whenever it is modified.
useful when iteration in the list is much more common than modification.
Blocking Queue and the Producer-Consumer Pattern
blocking queues are powerful resource management tool for building reliable applications.
make your program more robust to overload by throttling activities that threaten to produce more work than can be consumed.
provide blocking put() and take() methods
if queue is full, put() blocks until space is available.
if queue is empty, take() blocks until an element is available.
offer() method returns a failure status and does not block if the element cannot be enqueued.
useful in solving the producer consumer problem.
Producers put data on the queue whenever space is available, and consumers take data from the queue whenever an element is available.
common producer-consumer design is thread pool coupled with a work queue.
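A minimal producer-consumer sketch (the queue capacity, item count, and names are illustrative): the bounded queue throttles the producer when the consumer falls behind.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ProducerConsumerDemo {
    public static void main(String[] args) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) {
                    queue.put("item-" + i);      // blocks if the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) {
                    String item = queue.take();  // blocks if the queue is empty
                    System.out.println("consumed " + item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
    }
}
```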
Types of Blocking Queue
PriorityBlockingQueue which can compare elements according to natural order or using a comparator.
SynchronousQueue is not really a queue at all; there is no storage between producers and consumers, and producers hand elements directly to consumers.
Deque and BlockingDeque
Deque extends Queue and allows insertion and removal at both the head and the tail.
Used in 'work stealing', where a free thread can help out another thread by taking (stealing) tasks from the tail of that thread's deque.
normally used when work leads to more work, like in web crawling or heap marking during GC
when a worker identifies new work, it puts it at the end of its own deque(or someone else's deque).
Types of Deque are ArrayDeque and LinkedBlockingDeque.
Threads may block or pause for various reasons
while waiting for I/O completion.
while waiting to acquire a lock.
while waiting to wake up from Thread.sleep
while waiting for the result of computation in another thread.
When a thread waits, it is suspended and placed in one of the blocked states (BLOCKED, WAITING, TIMED_WAITING).
when it is done, it is placed in RUNNABLE state again.
All blocking methods normally throw InterruptedException when they are interrupted.
If your method calls a method which throws InterruptedException, your method is also a blocking method.
If you catch the interrupted exception, you can do either of the following
Propagate the InterruptedException by rethrowing it.
Restore the interrupt by calling interrupt() on the current thread.
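A minimal sketch of the two options above (class and method names are illustrative).

```java
import java.util.concurrent.BlockingQueue;

public class TaskRunner {
    // Option 1: propagate InterruptedException to the caller.
    public String takeNext(BlockingQueue<String> queue) throws InterruptedException {
        return queue.take();
    }

    // Option 2: cannot rethrow (e.g. inside Runnable.run), so restore the interrupt
    // status so that code higher up the call stack can still see it.
    public String takeNextOrNull(BlockingQueue<String> queue) {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt
            return null;
        }
    }
}
```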
A synchronizer is any object that coordinates the control flow of threads based on its state.
Blocking Queue can act as synchronizers.
Other synchronizers include Semaphores, Barriers, and Latches.
All synchronizers encapsulate state that determines whether the threads arriving at them
should be allowed to pass or not.
should be allowed to manipulate that state or not.
should be allowed to let the synchronizer enter the desired state
Latch is a synchronizer that can delay the progress of threads until they reach a terminal state.
Acts as a gate: until the terminal state is reached, the gate is closed and no thread can pass; once the terminal state is reached, the gate opens and all threads pass.
Once a latch changes state and gate opens, it cannot change state again.
can be used for the following practical applications
ensuring a computation does not start until all the resources it needs have been assembled (a simple binary latch can be used for this).
ensuring a service does not start until the services it depends on have initialized; each service would have its own binary latch.
waiting for all the players in a multiplayer game.
CountDownLatch is a type of latch which allows one or more threads to wait for a set of events to occur.
Its counter is initialized to n, the number of events to wait for; countDown() decrements the counter when an event occurs, and await() blocks until the counter reaches zero.
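A minimal sketch of the multiplayer-game example (names and player count are illustrative): the main thread waits until all players have joined before starting the game.

```java
import java.util.concurrent.CountDownLatch;

public class GameLobby {
    public static void main(String[] args) throws InterruptedException {
        int players = 4;
        CountDownLatch allJoined = new CountDownLatch(players);

        for (int i = 0; i < players; i++) {
            final int id = i;
            new Thread(() -> {
                System.out.println("player " + id + " joined");
                allJoined.countDown();           // one event has occurred
            }).start();
        }

        allJoined.await();                       // blocks until the count reaches zero
        System.out.println("all players joined, starting game");
    }
}
```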
Counting Semaphores are used to control the no of activities that can access a resource or perform an action at a particular time.
used to implement resource pools.
used to put bounds on a shared collection.
Manages a set of virtual permits; the initial number is passed to the constructor.
Activities can acquire permits and release them when they are done.
acquire blocks until a given resource is available, or until timeout.
Binary Semaphore is a semaphore with an initial count of 1, can be used as a mutex.
used for implementing resource pools like database connection pools.
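A minimal sketch of using a Semaphore to put a bound on a shared collection (the class name BoundedHashSet is illustrative): add() blocks when the set is full, and the permit is returned if nothing was actually added.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Semaphore;

public class BoundedHashSet<T> {
    private final Set<T> set = Collections.synchronizedSet(new HashSet<>());
    private final Semaphore permits;

    public BoundedHashSet(int bound) {
        this.permits = new Semaphore(bound);     // one permit per allowed element
    }

    public boolean add(T element) throws InterruptedException {
        permits.acquire();                       // blocks if the set is already full
        boolean added = false;
        try {
            added = set.add(element);
            return added;
        } finally {
            if (!added) {
                permits.release();               // give the permit back if it was a duplicate
            }
        }
    }

    public boolean remove(T element) {
        boolean removed = set.remove(element);
        if (removed) {
            permits.release();                   // free a slot for future additions
        }
        return removed;
    }
}
```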
Barriers are similar to latches in that they block a group of threads until some event has occurred.
However, with a barrier, all threads must come to barrier at the same time in order to proceed.
Latches are used for waiting for events, barriers are used for waiting for other threads.
Everyone meet at a place at 12:00 PM, once you reach there, stay there and wait for everyone to come.
Cyclic Barrier is useful in iterative algorithms that break problems into smaller independent problems.
Threads call await() when they reach the barrier point; when all threads have arrived, the barrier is passed and it resets.
Cyclic Barrier Vs Countdown Latch
A CyclicBarrier can be reused after the waiting threads are released (because it's a 'cyclic' barrier), but a CountDownLatch cannot be reused.
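A minimal CyclicBarrier sketch (worker count, iteration count, and names are illustrative): each worker finishes its part of an iteration and waits at the barrier; when all arrive, the barrier action runs and the barrier resets for the next iteration.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

public class BarrierDemo {
    public static void main(String[] args) {
        int workers = 3;
        CyclicBarrier barrier = new CyclicBarrier(workers,
                () -> System.out.println("all workers reached the barrier"));

        for (int i = 0; i < workers; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    for (int iteration = 0; iteration < 2; iteration++) {
                        System.out.println("worker " + id + " finished iteration " + iteration);
                        barrier.await();   // wait for the other workers; barrier resets afterwards
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
    }
}
```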
FutureTask implements Future, which describes an abstract result-bearing computation.
The computation represented by a FutureTask is implemented with a Callable (a Callable is simply something which can be called and return a result, similar to a Runnable).
also acts as a latch.
Behaviour of Future.get depends on the state of the task: if completed, it returns the result immediately; otherwise it blocks until the task completes and then returns the result.
represents the lifecycle of a task and provides methods, to test whether the task has completed or has been cancelled,
Future's get() method returns the result immediately, or throws an exception, if the task has completed; otherwise it blocks until the task completes.
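A minimal FutureTask sketch (the computation and names are illustrative): the expensive computation runs on a background thread, and get() blocks until it completes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

public class FutureTaskDemo {
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        Callable<Long> expensiveComputation = () -> {
            Thread.sleep(500);            // stand-in for real work
            return 42L;
        };

        FutureTask<Long> task = new FutureTask<>(expensiveComputation);
        new Thread(task).start();         // FutureTask is also a Runnable

        Long result = task.get();         // blocks until the computation finishes
        System.out.println("result = " + result);
    }
}
```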
A thread pool manages a homogeneous pool of worker threads.
Threads in a thread pool have a simple life: request the next task from the work queue, execute it, and go back to waiting for another task.
Types of Thread Pool Executor
newFixedThreadPool: fixed-size thread pool
increases the number of threads one by one as tasks are submitted, up to the maximum pool size.
newCachedThreadPool: no bounds on the size of the pool
newSingleThreadExecutor: a single thread to execute tasks
the thread is recreated if it dies (mostly because of an exception)
newScheduledThreadPool: fixed-size thread pool
supports delayed and periodic task execution (all four factories are sketched below).
It replaces Timer, because java.util.Timer has drawbacks like the following:
if a TimerTask takes too long to run, the timing accuracy of other TimerTasks can suffer.
if a TimerTask throws an unchecked exception, the Timer thread is terminated, effectively cancelling the whole Timer.
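A minimal sketch of the Executors factory methods listed above (pool sizes and the task are illustrative), including the scheduled pool as a Timer replacement.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ExecutorFactoriesDemo {
    public static void main(String[] args) {
        ExecutorService fixed  = Executors.newFixedThreadPool(4);     // fixed-size pool
        ExecutorService cached = Executors.newCachedThreadPool();     // unbounded pool
        ExecutorService single = Executors.newSingleThreadExecutor(); // one worker thread
        ScheduledExecutorService scheduled = Executors.newScheduledThreadPool(2);

        Runnable task = () -> System.out.println("running on " + Thread.currentThread().getName());

        fixed.execute(task);
        cached.execute(task);
        single.execute(task);
        // Timer replacement: run the task every 10 seconds after an initial 1-second delay.
        scheduled.scheduleAtFixedRate(task, 1, 10, TimeUnit.SECONDS);

        fixed.shutdown();
        cached.shutdown();
        single.shutdown();
        // 'scheduled' is left running here; call scheduled.shutdown() when done.
    }
}
```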
CompletionService is a combination of Executor and BlockingQueue.
Let's say you have an executor which creates some threads for computing some data, and you submit a number of such tasks to it. You also want to get the results of the computations as they become available. One way is to repeatedly poll each Future with get() to see if the task has finished, which is not very efficient. CompletionService comes to our rescue.
We can submit callable tasks to it for completion, and use queue like methods take() and poll() to retrieve results, packaged as futures when they become available.
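A minimal sketch with ExecutorCompletionService (task count, pool size, and the computation are illustrative): results are consumed in completion order, without polling each Future.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CompletionServiceDemo {
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CompletionService<Integer> completion = new ExecutorCompletionService<>(pool);

        int tasks = 5;
        for (int i = 0; i < tasks; i++) {
            final int n = i;
            Callable<Integer> task = () -> n * n;   // stand-in computation
            completion.submit(task);
        }

        for (int i = 0; i < tasks; i++) {
            Future<Integer> done = completion.take(); // blocks until *some* task finishes
            System.out.println("result: " + done.get());
        }
        pool.shutdown();
    }
}
```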
Cancellation and Shutdown
interruption is a cooperative mechanism that enables one thread to ask another to stop what it is doing.
Cancellation is normally done with a volatile boolean flag that is repeatedly checked to see whether the thread/task has been cancelled.
Calling interrupt() does not stop the thread; it just delivers the message asking it to stop, and the thread's code should check for interruption.
For example, in a producer-consumer setup, the producer may be blocked in put() (because the queue is full), but if the consumer
has stopped, the producer will remain blocked forever and never get a chance to check a cancellation flag.
Reasons for cancellation
One of the below
user can request cancellation
a task which should stop at a particular time
application events, like a user pressing the stop button.
error in crawlers etc
on application shutdown, all threads which it spawned should be cancelled.
Most blocking library methods, like wait(), join(), and sleep(), throw InterruptedException when they are interrupted.
While writing the code, whenever you check the interrupt flag, and find that the thread has been interrupted, you should do either of the following
Propagate the InterruptedException to the caller, if you think the caller's code should handle the fact that the thread was cancelled.
Restore the interruption status by calling interrupt() on the current thread, so that code higher up the call stack can still see that an interrupt occurred.
daemon threads are threads which provide helper functions but don't prevent the JVM from shutting down.
When JVM starts, Main thread is the normal thread, while all other threads, like for GC, housekeeping are daemon threads.
When a thread is created, it inherits the daemon status of the thread that created it.
When a thread exits, the JVM checks whether only daemon threads remain; if so, it initiates an orderly shutdown.
Daemon threads should be rarely used by code, except like a thread which removes expired entries from an in-memory cache.
Long running tasks
Thread pools can have responsiveness problems if tasks can block for extended periods of time.
A thread pool can become clogged with long running tasks, and even small tasks can take longer.
tasks should use time bounded waits instead of unbounded wait.
Sizing Thread Pools
should normally use Runtime.getRuntime().availableProcessors() to find the number of CPUs.
N-threads = N-cpu * U-cpu * (1 + w/c)
N-threads = No of threads in thread pool
N-cpu = No of CPU
U-cpu = target CPU utilization
w/c = wait time/compute time.
Above is the approximate formula for calculating the no of threads required in a thread pool.
Other resources that decide the size are memory, file handles, socket handles, and DB connections.
A combination of the above constraints and the formula gives an upper bound on the number of threads in a thread pool.
Saturation Policies (when limit in thread pools is exceeded)
saturation policy of ThreadPoolExecutor can be modified by calling setRejectedExecutionHandler()
available implementations are AbortPolicy, CallerRunsPolicy, DiscardPolicy, DiscardOldestPolicy.
AbortPolicy is the default and causes an unchecked RejectedExecutionException to be thrown.
DiscardOldestPolicy discards the task that would otherwise be executed next, then retries submitting the new task.
Combining DiscardOldestPolicy with a PriorityBlockingQueue is a bad idea, because it would discard the highest-priority task.
CallerRunsPolicy runs the task in the thread that called execute(), not in a pool thread, so the submitter slows down and the pool gets time to catch up.
In a web server with the caller-runs policy, as the pool saturates the load is pushed back to the TCP layer, giving more graceful degradation under overload.
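A minimal sketch of setting a saturation policy (pool sizes and queue capacity are illustrative): a bounded pool plus a bounded work queue, with the caller-runs policy for overflow.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SaturationPolicyDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                2, 4,                               // core and maximum pool size
                60, TimeUnit.SECONDS,               // idle thread keep-alive
                new ArrayBlockingQueue<>(100));     // bounded work queue

        // When both the queue and the pool are full, run the task in the
        // submitting thread instead of rejecting it.
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());

        executor.execute(() -> System.out.println("task running"));
        executor.shutdown();
    }
}
```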
When a thread pool needs a thread, by default it creates a plain thread that you cannot customize.
Specifying ThreadFactory allows you to customize the configuration of new Threads.
ThreadFactory has one method "newThread" which is called whenever ThreadPool wants to create a new Thread.
may be you just give the thread a name in this method.
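A minimal ThreadFactory sketch (the pool name is illustrative): newThread() just gives each pool thread a descriptive name, which is handy in thread dumps.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedThreadFactory implements ThreadFactory {
    private final String poolName;
    private final AtomicInteger counter = new AtomicInteger(1);

    public NamedThreadFactory(String poolName) {
        this.poolName = poolName;
    }

    @Override
    public Thread newThread(Runnable r) {        // called whenever the pool needs a new thread
        return new Thread(r, poolName + "-worker-" + counter.getAndIncrement());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2, new NamedThreadFactory("indexer"));
        pool.execute(() -> System.out.println(Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```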
A classic example of deadlock is the Dining Philosophers problem.
One solution is that each philosopher picks up chopsticks only when both are free (releasing anything held otherwise), with each thread retrying after random intervals.
The JVM is not good at resolving deadlocks, but database servers are: they detect deadlocks and abort one of the transactions.
We should always try to acquire locks in the same global order.
Note that "everyone takes the lock of the object on their left first" is not a consistent global order, so it does not solve the Dining Philosophers problem.
We cannot always fix the lock order statically, e.g. when we have to transfer money from account 1 to account 2 the lock order depends on the arguments.
For the bank-transfer problem, use System.identityHashCode on the two objects and acquire the locks in increasing hash order (see the sketch below).
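A minimal sketch of inducing a lock order with System.identityHashCode (the Account class and names are illustrative); a global tie-breaking lock handles the rare case of equal hashes.

```java
public class TransferDemo {
    private static final Object TIE_LOCK = new Object();

    public static class Account {
        private long balance;
        void debit(long amount)  { balance -= amount; }
        void credit(long amount) { balance += amount; }
    }

    public static void transfer(Account from, Account to, long amount) {
        int fromHash = System.identityHashCode(from);
        int toHash   = System.identityHashCode(to);

        if (fromHash < toHash) {
            synchronized (from) { synchronized (to) { doTransfer(from, to, amount); } }
        } else if (fromHash > toHash) {
            synchronized (to) { synchronized (from) { doTransfer(from, to, amount); } }
        } else {
            // rare hash collision: take a global tie-breaking lock first
            synchronized (TIE_LOCK) {
                synchronized (from) { synchronized (to) { doTransfer(from, to, amount); } }
            }
        }
    }

    private static void doTransfer(Account from, Account to, long amount) {
        from.debit(amount);
        to.credit(amount);
    }
}
```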
Invoking an alien method(method which does not belong to your class)with a lock held is asking for trouble, the alien method might acquire more locks.
Calling a method with no locks held is called an open call; open calls are much safer.
Just as threads can deadlock while waiting for objects, they can also deadlock while waiting for resources.
Resource Pools are usually implemented with Semaphores to facilitate blocking when pool is empty.
To avoid deadlocks with explicit locks, we can use tryLock() on the Lock object, which supports timed (and polled) acquisition.
ThreadDump includes a stacktrace of each running thread, also includes locking information.
kill -3 for a thread dump.
starvation can happen when a thread is denied access to a resource.
it can also happen when one thread acquires a lock and goes into a big loop.
Livelock is a form of liveness failure in which a thread is not blocked but cannot make progress, because it keeps retrying an operation that will always fail.
It can occur in messaging systems, where the messaging infrastructure rolls back a transaction when a message cannot be processed and puts the message back at the head of the queue, so it is retried and fails again.
this is called the poison message problem.
It can also occur when two threads keep changing their state in response to each other, each waiting for the other to finish, without making progress, somewhat like the Dining Philosophers problem.
solution is to introduce some randomness.
Performance and Scalability
Using multiple threads introduces some overhead and performance cost.
These costs come in the form of context switching, memory synchronization, and blocking.
most concurrent programs consist of a mix of parallelizable and serial portions.
SpeedUp = 1/(F+ (1-F)/N)
F is the fraction of computations that must be done serially.
N is the no of machine processors.
As N approaches infinity, a program with 50% serial work can run at most twice as fast.
As no of processors increase, the percentage improvement is significantly impacted by % of serial work.
vmstat and mpstat on UNIX system tell about system performance.
you can check whether the application is disk bound using "iostat" or "perfmon" and whether it is bandwidth bound by checking for network traffic.
ReentrantLock implements Lock.
The Lock interface defines a number of abstract locking operations.
Lock offers a choice of unconditional, polled, timed, and interruptible lock acquisition (see the sketch below).
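A minimal sketch of timed acquisition with ReentrantLock (names and the timeout are illustrative): tryLock() gives up instead of blocking forever, which also helps avoid deadlock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class TimedLockDemo {
    private final Lock lock = new ReentrantLock();
    private int sharedState;

    public boolean updateWithTimeout(int newValue) throws InterruptedException {
        // Timed acquisition: give up if the lock is not free within 1 second.
        if (!lock.tryLock(1, TimeUnit.SECONDS)) {
            return false;                 // could not acquire the lock in time
        }
        try {
            sharedState = newValue;
            return true;
        } finally {
            lock.unlock();                // always release in a finally block
        }
    }
}
```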