In this blog post, we learn why is it faster to process sorted array than an unsorted array? We will see a C++ code to check the performance of the sorted and unsorted array. In C++, it is faster to process a sorted array than an unsorted array because of branch prediction.
Here is a C++ code that illustrates that sorting the data miraculously makes the code faster than the unsorted version. Let’s try out a sample C++ program to understand the problem statement better.
Unsorted Array:
Here we are creating an unsorted array and analyzing the processing time.
#include <algorithm> #include <ctime> #include <iostream> int main() { // Generate data const unsigned arraySize = 32768; int data[arraySize]; for (unsigned c = 0; c < arraySize; ++c) { data[c] = std::rand() % 256; } // Test timing clock_t start = clock(); long long sum = 0; for (unsigned i = 0; i < 100000; ++i) { // Primary loop for (unsigned c = 0; c < arraySize; ++c) { if (data[c] >= 128) sum += data[c]; } } double elapsedTime = static_cast<double>(clock() - start) / CLOCKS_PER_SEC; std::cout << elapsedTime << std::endl; std::cout << "sum = " << sum << std::endl; return 0; }
Output:
Sorted Array:
Now we are sorting the array using the sort function and analyzing the processing time of sorted array.
#include <algorithm> #include <ctime> #include <iostream> int main() { // Generate data const unsigned arraySize = 32768; int data[arraySize]; for (unsigned c = 0; c < arraySize; ++c) { data[c] = std::rand() % 256; } //Sorting the array std::sort(data, data + arraySize); // Test timing clock_t start = clock(); long long sum = 0; for (unsigned i = 0; i < 100000; ++i) { // Primary loop for (unsigned c = 0; c < arraySize; ++c) { if (data[c] >= 128) sum += data[c]; } } double elapsedTime = static_cast<double>(clock() - start) / CLOCKS_PER_SEC; std::cout << elapsedTime << std::endl; std::cout << "sum = " << sum << std::endl; return 0; }
Output:
Observe that time taken for processing a sorted array is less as compared to the unsorted array. The reason for this optimization for the sorted array is branch prediction.
What is branch prediction?
In computer architecture, branch prediction means determining whether a conditional branch(jump) in the instruction flow of a program is likely to be taken or not. All the pipelined processors do branch prediction in some form because they must guess the address of the next instruction to fetch before the current instruction has been executed.
Why is processing a sorted array faster than an unsorted array?
Let consider the above-mentioned example where sorted array processing is faster in comparison to the unsorted array.
if (data[c] >= 128) sum += data[c];
Case 1: Sorted Array
Notice that the data is evenly distributed between 0 and 255. When the data is sorted, roughly the first half of the iterations will not enter the if-statement. After that, they will all enter the if-statement.
This is very friendly to the branch predictor since the branch consecutively goes the same direction many times. Even a simple saturating counter will correctly predict the branch except for the few iterations after it switches direction.
Quick visualization:
T = branch taken N = branch not taken data[] = 0, 1, 2, 3, 4, ... 126, 127, 128, 129, 130, ... 250, 251, 252, ... branch = N N N N N ... N N T T T ... T T T ... = NNNNNNNNNNNN ... NNNNNNNTTTTTTTTT ... TTTTTTTTTT (easy to predict)
Case 2: Unsorted Array
However, when the data is completely random, the branch predictor is rendered useless, because it can’t predict random data. Thus there will probably be around 50% misprediction (no better than random guessing).
A branch prediction works on the pattern the algorithm is following or basically the history, how it got executed in previous steps. If the guess is correct, then CPU continues executing and if it goes wrong, then CPU needs to flush the pipeline and roll back to the branch and restart from the beginning.
data[] = 226, 185, 125, 158, 198, 144, 217, 79, 202, 118, 14, 150, 177, 182, 133, ... branch = T, T, N, T, T, T, T, N, T, N, N, T, T, T, N ... = TTNTTTTNTNNTTTN ... (completely random - hard to predict)
How to increase the performance of the unsorted array?
If the compiler isn’t able to optimize the branch into a conditional move, you can try some hacks if you are willing to sacrifice readability for performance.
So let see an example,
If in the above code we remove the if condition with some hack statement, it definitely increases the performance.
if (data[c] >= 128) sum += data[c]; Replace With || \/ int t = (data[c] - 128) >> 31; sum += ~t & data[c];
Now let see the performance of the above changes with unsorted array on the same platform.
#include <algorithm> #include <ctime> #include <iostream> int main() { // Generate data const unsigned arraySize = 32768; int data[arraySize]; for (unsigned c = 0; c < arraySize; ++c) { data[c] = std::rand() % 256; } // Test timing clock_t start = clock(); long long sum = 0; for (unsigned i = 0; i < 100000; ++i) { // Primary loop for (unsigned c = 0; c < arraySize; ++c) { int t = (data[c] - 128) >> 31; sum += ~t & data[c]; } } double elapsedTime = static_cast<double>(clock() - start) / CLOCKS_PER_SEC; std::cout << elapsedTime << std::endl; std::cout << "sum = " << sum << std::endl; return 0; }
Output:
Note: This hack is not strictly equivalent to the original if-statement And the performance of the code could be different on different platforms.
Recommended Articles for you:
- How to create dynamic array in C?
- How to pass an array as a parameter in C?
- A brief description of the pointer in C.
- Introduction of Array in C.
- Dangling, Void, Null and Wild Pointers
- Function pointer in c, a detailed guide
- How to use the structure of function pointer in c language?
- Function pointer in structure.
- Pointer Arithmetic in C.
- void pointer in C.
- 10 questions about dynamic memory allocation.
- How to use the structure of function pointer in c language?
- Memory Layout in C.
- 100 C interview Questions
- Implement state machine in C.
- Function pointer in structure.
- What is flexible array member in c?
- What is importance of struct hack in c?
- How to use the structure of function pointer in c language?
- Create a students management system in C.
- Create an employee management system in C.
- Top 11 Structure Padding Interview Questions in C
- File handling in C.
- C format specifiers.
References :