This is the third article in our series on the challenges of building highly concurrent web applications. In the previous two articles, we focused on the challenge of handling concurrent requests and two design patterns for doing so. In this article, we discuss the challenges of designing highly concurrent business logic.
Most professional developers have written thread-based concurrent applications at some point. If you are a Java developer, you have likely used the Runnable interface and ExecutorService, and you have probably also used CompletableFuture to handle the results of your forked concurrent threads. But have you ever wondered why we design concurrent business logic in the first place? The answer is simple: to boost application performance. In general, there are two types of activities: I/O-bound activities and CPU-bound activities.
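As a quick refresher on the Java APIs mentioned above, here is a minimal sketch (class and variable names are my own, chosen for illustration) that forks two tasks onto an ExecutorService and joins their results with CompletableFuture:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FutureDemo {
    public static int compute() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Fork two independent tasks onto the pool.
            CompletableFuture<Integer> a =
                CompletableFuture.supplyAsync(() -> 20 + 1, pool);
            CompletableFuture<Integer> b =
                CompletableFuture.supplyAsync(() -> 20 + 1, pool);
            // Join the forked results back into a single value.
            return a.thenCombine(b, Integer::sum).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(compute()); // prints 42
    }
}
```

Each `supplyAsync` call may run on a different pool thread; `thenCombine` expresses the synchronization point declaratively instead of joining threads by hand.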
I/O-bound activities are tasks limited mainly by I/O resources, for example network I/O or database I/O. As we explained in the previous posts, I/O operations are thousands of times slower than CPU operations. If the application logic runs sequentially, the application cannot proceed while the CPU waits for an I/O operation to finish.
CPU-bound activities are tasks that primarily consume CPU time during execution. Some algorithms are very complex: although they involve little I/O, they use CPU resources intensively. Video processing and image processing are typical examples.
When we deploy our applications to servers, we often get more than one CPU core. Naturally, we want to fully leverage this extra computing power when implementing CPU-heavy applications. A simple approach is to divide a task into several subtasks, schedule the subtasks onto different cores so they run concurrently, and finally combine the results.
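The divide-schedule-combine pattern can be sketched with plain threads (a simplified illustration; the class name and the fixed chunking scheme are my own assumptions). Each thread sums its own range of an array and writes into its own slot, so no state is shared between subtasks:

```java
public class ParallelSum {
    // Sum `arr` by splitting it into `parts` ranges, one thread per range.
    public static long sum(long[] arr, int parts) throws InterruptedException {
        long[] partial = new long[parts];
        Thread[] threads = new Thread[parts];
        int chunk = (arr.length + parts - 1) / parts;
        for (int i = 0; i < parts; i++) {
            final int lo = Math.min(arr.length, i * chunk);
            final int hi = Math.min(arr.length, lo + chunk);
            final int idx = i;
            threads[i] = new Thread(() -> {
                long s = 0;
                for (int j = lo; j < hi; j++) s += arr[j];
                partial[idx] = s; // each thread writes only its own slot
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join(); // synchronize: wait for all subtasks
        long total = 0;
        for (long s : partial) total += s;  // combine the partial results
        return total;
    }
}
```

Because each subtask owns its slice of the input and its slot of the output, the final `join` is the only synchronization needed.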
- Thread Model and Race Conditions
One model for handling the divided tasks is to leverage a thread pool: run each subtask in a thread drawn from the pool. The thread pool limits the number of concurrently running threads while saving the overhead of creating new threads. However, this approach introduces an unavoidable problem: race conditions.
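A minimal thread-pool sketch of this model (class name and workload are illustrative assumptions): subtasks are submitted to a fixed-size pool, and the results are gathered through Futures rather than shared variables:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolDemo {
    // Run `n` subtasks on a pool of 4 reusable threads and sum their results.
    public static int run(int n) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int task = i;
                futures.add(pool.submit(() -> task * task)); // queue the subtask
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get(); // block until each subtask finishes
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that passing results back through `Future.get()` avoids the shared-variable style of communication discussed next; the race-condition problem arises when threads communicate through mutable shared state instead.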
A race condition happens when multiple processes or threads try to access the same resource at the same time. Why do race conditions happen in the thread model? Because threads share state. The threads inside business logic differ from the threads of a thread-based web server handling independent requests: business-logic threads are forked by the main thread, and the forked threads often interact with the main thread.
One way of handling the interaction between the main thread and its forked threads is through shared variables. For example, the forked threads save their results back into a global variable. All threads have access to that variable, hence a race condition can occur.
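Here is a deliberately broken sketch of that situation (class name and counts are my own): two threads increment a shared counter without any coordination. The `counter++` is a read-modify-write sequence, not a single atomic step, so increments from one thread can overwrite increments from the other:

```java
public class RaceDemo {
    static int counter = 0; // shared mutable state (the "global variable")

    public static int run() throws InterruptedException {
        counter = 0;
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter++; // read-modify-write: not atomic, updates can be lost
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return counter; // usually less than 200_000 because of lost updates
    }
}
```

Run it a few times and the result changes from run to run, which is exactly the nondeterminism discussed below.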
Why are race conditions a problem for programming? Because they make the program's result nondeterministic. As mentioned above, the control flow within each thread is largely sequential; when the intermediate shared state is changed by another thread, the final result changes as well.
How do we prevent race conditions? By making sure only one thread can access the shared state at a time. One way of achieving that is to introduce locks. A lock ensures that the shared state can be accessed by only one thread, or a limited number of threads, at a time.
The most common lock implementation is a binary semaphore. A binary semaphore uses a shared flag to control access, together with two operations, wait and signal, which are always used as a pair. The wait operation checks the state of the lock: if it is locked, the caller keeps waiting; otherwise it flips the flag from unlocked to locked so no other thread can enter. This check-and-set is atomic, so it cannot itself cause a race condition. Once the critical section is finished, the signal operation releases the lock so another thread can acquire it.
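Java ships a ready-made semaphore in `java.util.concurrent.Semaphore`; initialized with one permit, it behaves as a binary semaphore, with `acquire()` playing the role of wait and `release()` the role of signal. A sketch (class name and iteration counts are my own) that fixes the lost-update problem from the shared-counter scenario above:

```java
import java.util.concurrent.Semaphore;

public class SemaphoreDemo {
    static final Semaphore lock = new Semaphore(1); // one permit: binary semaphore
    static int counter = 0;

    public static int run() throws InterruptedException {
        counter = 0;
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                try {
                    lock.acquire();       // "wait": block until the permit is free
                    try {
                        counter++;        // critical section: one thread at a time
                    } finally {
                        lock.release();   // "signal": hand the permit back
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return counter; // deterministic: every increment survives
    }
}
```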
In Java programming, we use the synchronized keyword to lock an object's monitor around a method or code block. However, there are risks to using locks. On the one hand, using them incorrectly can cause lock issues such as deadlock or lock starvation; on the other hand, when locking is too granular, it breaks the control flow into small blocking chunks, and the overhead of lock checking offsets the performance gain from multithreading.
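The same shared-counter fix expressed with synchronized (class name and counts are illustrative assumptions): the block acquires the monitor of `mutex`, so only one thread at a time can execute the increment:

```java
public class SyncDemo {
    static final Object mutex = new Object(); // monitor guarding the counter
    static int counter = 0;

    public static int run() throws InterruptedException {
        counter = 0;
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                synchronized (mutex) { // only one thread may hold the monitor
                    counter++;
                }
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return counter; // always exactly 200_000
    }
}
```

Note how the critical section here is a single increment: this is the granularity trade-off mentioned above, since every pass through the loop pays the cost of acquiring and releasing the monitor.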
While thread-based multithreaded programming benefits CPU-bound activities quite well, it doesn't scale for I/O-bound activities. As discussed in previous articles, in I/O-bound activities the I/O device is the ultimate bottleneck. Having multiple threads access the I/O may help at first, since most I/O devices support concurrent operations in one way or another, but once we reach that concurrency limit, latency only increases.