Algorithms cheatsheet

Here goes a list of useful and common algorithms explained in a way that is easy to understand and beside their names I have written their average case runtimes. Reading this article should give you insights about which algorithm to use and how to implement it. If a programmer could be a wizard then the algorithms would be the spells, the more you know, the more tricks you can do.

Binary search (O(log2(n)), works in place)

Let n = size of the array.
Let min= 0 and max=n-1.
If max<minthen stop, the target is not present in the array, return -1.
Calculate guess as floor((max + min) / 2).
If array[guess] equals the target then stop, you found it, return guess.
If array[guess]<thetarget then set min=guess+ 1and jump to step 8.
if array[guess]> the target then set max=guess- 1.
Go back to step 3.

Uniform binary search (O(log2(n)), works in place)

Let sizeArray be the size of the array we will search, now calculate the size of the lookup table with the formula size = ceil(log2(sizeArray)) + 1.
Now let i = 1.
Add the element Delta(i) = floor(sizeArray + 2 ^ (i - 1)) / 2 ^ i to the lookup table called midpoints.
Let i += 1 and if i > size then go to step 6.
Go back to step 3.
Then let i = 0 and mid = midpoints[0] - 1, you decrement 1 to make up for the fact they were calculated starting at 1 and not at 0.
If mid < 0 or mid >= sizeArray then the element is not here, return -1.
If array[mid] is equal to the target then return mid, you've found it.
If i equals size - 1 then we've reached the end of the midpoints and therefore the element isn't here, return -1.
If array[mid] < the target then let mid += midpoints[i + 1], otherwise let mid -= midpoints[i + 1].
Increment i and go back to step 7.

XOR swap (O(1), works in place)

Remember the word BABA.
B ^= A; A ^= B; B ^= A;
This is it, the values of A and B have been swapped. This algorithm has no advantages whatsoever over the traditional swap methods.

Deterministic quick sort (theta(nlog2(n)), works in place)

Call the function with the array passed by reference, create a variable begin = the index of the first element of the array and a variable end = index of the last element of the array.
Take the element array[end] as the pivot.
Create a variable called boundary = begin and another one called u = begin.
If u = end then skip to step 7.
If array[u] <= array[end] then swap array[u] with array[boundary] and increment boundary.
increment u and go back to step 4.
Swap array[end] with array[boundary].
If boundary > begin + 1 then call this quick sort function recursively with parameters (array, begin, end = boundary - 1).
If boundary < end - 1 then call this quick sort function recursively with parameters (array, boundary + 1, end).

Randomized Quick sort (theta(nlog2(n)), works in place)

The same algorithm as the Deterministic quick sort but you must go to the step 2 of the Deterministic quick sort algorithm and choose 3 random elements between the indices begin and end, get the median of those 3 and swap it with the rightmost element, so it is the pivot. This will give you a higher chance at getting a good split and you can use less than 3 elements to make it simpler or more than 3 elements to make this chance even higher. My experiments have shown it is fastest when you use an if statement to do it randomized only when the array size is at least 30, furthermore, using only 1 random element is faster than using 2, which is faster than using 3.

Merge sort (theta(nlog2(n)), doesn't work in place)

If the size of the array is <= 1 then skip to step 12.
Calculate the midpoint as the floor of the size of the array divided by 2.
Create a new array called lowerHalf which is equal to the return of the merge sort function on array[0 .. midpoint].
Create a new array called upperHalf which is equal to the return of the merge sort function on array[midpoint .. end].
Append a sentinel with a value equivalent to infinity to both the lowerHalf and the upperHalf arrays, this will prevent violating the array's bounds.
Create 3 integers, originalI, lowerI and upperI to keep track of the indices of the original array, the lowerHalf array and the upperHalf array, respectively, they all start with the value 0.
If originalI >= length of array then skip to step 12.
If lowerHalf[lowerI] <= upperHalf[upperI] then make array[originalI] = lowerHalf[lowerI], otherwise skip to step 10.
Increment both originalI and lowerI, then go back to step 7.
Make array[originalI] = upperHalf[upperI].
Increment both originalI and upperI, then go back to step 7.
Return the array.

Insertion sort (theta(n^2), works in place)

Let i = 1.
If i is >= the length of the array then stop because you have reached the end, meaning it is now sorted.
Let key = array[i]. The variable i will mark the unsorted part boundary and key will be the first element of the unsorted part.
Let j = i - 1. j will be used to check all the elements in the already sorted part, we descend the list until we find where to put key.
If array[j] > key then make array[j + 1] = array[j]. We push the elements up to make room for key in front of them.
If array[j] <= key then make array[j + 1] = key, i++ and go back to step 2. Here you have found where to put key and then we move up the boundary that separates the sorted part from the unsorted part.
Decrement j, we will now check the next element down the list.
If j is equal to -1 then make array[0] = key, increment i and then go back to step 2. In this case key had to go in the very beginning.
Go back to step 5.

Selection sort (theta(n^2), works in place)

Let i = 0.
If i is equal to the length of the array - 1 then stop, you're done.
Let minIndex = i and j = i + 1.
If array[j] < array[minIndex] then minIndex = j.
Let j += 1.
If j is equal to the length of the array then swap the values of array[i] and array[minIndex], then increment i and go back to step 2.
Go back to step 4.

Bubble sort (theta(n^2), works in place)

Let i = 1.
If i >= length of the array then stop, you're done.
Let j = 0.
If array[j] > array[j + 1] then swap their values.
Increment j, if j is equal to the length of the array - i then increment i and go back to step 2.
Go back to step 4.

Counting sort (theta(R + N), doesn't work in place)

R is the limit of the range of numbers (if the biggest number is 6, then R is 7) and N is the size of the array.
Create an array called equals with a length equal to R and set all values to 0.
Let key = the next number in the array.
Increment equals[key], that way you will store how many times each element appears in the array.
If you haven't reached the end of the array then go back to step 3.
Create an array called smallers with a length equal to R and set the first value to 0.
Create i = 1.
Do smallers[i] = equals[i - 1] + smallers[i - 1], that way you will be counting how many numbers come before each key in the array.
If i < R then increment i and go back to step 8.
Now create a new array result with the length equal to N.
Now make i = 0.
Make key = array[i] and then make index = smallers[key].
Then make result[index] = key and increment smallers[key].
If i < N then increment i and go back to step 12.
Return result, which is now sorted in a stable way.
This algorithm only works when the array only contains integers.

Radix sort (theta(R + N), doesn't work in place)

This algorithm is useful when you are sorting an array of equal size strings, such as ["K1E", "R3T", "Y7U"], R is the range of values for each character and N is the size of the array. In this case R would be 36 because we have 10 digits + 26 characters.
Make i = size of a string, in this case i = 3.
Create a new array result with length N.
Apply the counting sort algorithm above to the array, but only take into consideration the character in the index i - 1, meaning the rightmost character, insert the elements in result.
If i = 0 then the array result is now sorted in a stable way, return it.
Decrement i, make the original array = result and go back to step 4.

Breadth-first search (O(|V| + |E|), complete)

Start with all vertices having their distance = infinity. Choose the origin vertex and set its distance to 0 and its predecessor to null.
Create a queue with only the origin vertex in it.
Dequeue the 1^st element of the queue and make it the current vertex.
Visit all adjacent vertices of the current.
If the adjacent vertex's distance <= current vertex's distance + edge's weight, then skip to step 9.
Set the adjacent vertex's distance to be the current vertex's distance + the edge's weight.
Set the adjacent vertex's predecessor to be the current vertex.
If the adjacent vertex is not the destination then enqueue it.
If the queue isn't empty then go back to step 3.
Make the current vertex be the destination vertex and create a result list.
Add the current vertex to the result list.
If the current vertex is the origin then just return the result list, which will be the desired BFS Ordering, and skip the next step.
Make the current vertex be its predecessor and go back to step 11.

Depth-first search (O(|V| + |E|), not complete)

Start with all vertices having their distance = infinity. Choose the origin vertex and set its distance to 0 and its predecessor to null.
Call a recursive function dfs(graph, origin, destination), passing it the whole graph, the origin vertex and the destination vertex, pass the first 2 parameters by reference.
Inside the recursive function you should start checking all the adjacent vertices of the current vertex, each of them will be the visited vertex.
If there are no more adjacent vertices to be checked then exit the function call and skip to step 10.
If the adjacent vertex's distance <= current vertex's distance + edge's weight, then go back to step 3.
Set the adjacent vertex's distance to be the current vertex's distance + the edge's weight.
Set the adjacent vertex's predecessor to be the current vertex.
If this adjacent vertex isn't the destination then call the dfs(graph, visited, destination) function with the visited vertex as a parameter.
Go back to step 3.
Create a result list and then create the current vertex and make it equal to the destination vertex.
Add the current vertex to the result list.
If the current vertex is the origin then just return the result list, which will be the desired ordering, and skip the next step.
Make the current vertex be its predecessor and go back to step 11.

Depth-limited DFS

The same implementation of the DFS algorithm above, however the dfs() recursive function should have another parameter called limit and you should use an if statement before making another call to the dfs() function, to check if it reached the recursion limit.
Remember to decrement limit at each new recursive call, stop making new calls when limit is 0.

Sierpinski gasket (theta(3^(log2(window's width))))

Define a function called sierpinski(int left, int top, int right, int bottom, Window window).
In the main() function you create a square window, for example a 1000x1000 window.
Call the sierpinski() function by passing the window along with its dimensions as arguments, as in sierpinski(0, 0, 1000, 1000, window).
From now on we are inside the sierpinski() function and we will use the arguments left, top, right and bottom to draw the lines and rectangles.
If right - left <= 5 or bottom - top <= 5 then we have reached the base case, draw a square (left, top, right, bottom), fill it with a color and skip the next steps.
Create the variables widthMiddle = (right - left) / 2 and heightMiddle = (bottom - top) / 2.
Draw a line from point (left, top + heightMiddle) to point (right, top + heightMiddle).
Draw a line from point (left + widthMiddle, top) to point (left + widthMiddle, bottom).
Recursively call the sierpinski() function with the arguments (left, top, left + widthMiddle, top + heightMiddle, window).
Recursively call the sierpinski() function with the arguments (left + widthMiddle, top, right, top + heightMiddle, window).
Recursively call the sierpinski() function with the arguments (left + widthMiddle, top + heightMiddle, right, bottom, window).

Middle squares method (theta(n))

Create an empty list of numbers and choose a random number to be the seed.
Create a variable seedLength = the number of digits in the seed.
Add the seed to the list of numbers.
Let squared = the seed ^ 2.
If the length of squared is <= seedLength (it happens if the middle number was very small), then make the seed = squared and skip to step 7.
Pick the digits in the middle of squared, the number of digits must be equal to the length of the seed, for example, if the seed was 704, then squared = 495616, therefore you will pick 561 (49[561]6) to be the next seed. Notice you find the middle point to the right, not to the left.
If the seed is neither 0 nor a number which is already in the list, then you haven't reached the period yet, go back to step 3.
Return the list of numbers.

Fermat's primality test

The tested number is called N.
Pick a random number called A which is bigger than 1 but no bigger than the square root of N.
Then find the GCD between N and A.
If the GCD is not 1 then N is composite, skip the next steps.
Apply Fermat's Little Theorem between N and A, by writing (A ^ N) mod N = A.
If the expression above is false then N is composite, skip the next step.
Go back to step 2 and repeat the whole process, each time you repeat this process you halve the chance of error, which is 1 / 2 ^ trials.

Binary exponentiation

Take the number N and the exponent E.
If E = 1 then return N and skip the next steps.
Calculate M = N ^ 2.
If E is odd then skip to step 7.
Calculate D = E / 2.
Recursively calculate M ^ D and skip the next steps.
Calculate D = (E - 1) / 2.
Recursively calculate N * M ^ D.

Notes

In order of fastest to slowest sorting algorithm we have: Quick Sort > Merge Sort > Insertion Sort > Selection Sort > Bubble Sort.
There are some variations of Quick Sort, the randomized version is faster than the deterministic version in worst case scenarios, but slower in the average scenarios. You can make it much faster by creating a hybrid which uses Insertion Sort on the array pieces that have few elements.
The existential lower bound for comparison sorting algorithms is omega(nlog2(n)) and the universal lower bound for all sorting algorithms is omega(n).
If the recursive algorithm is taking too long then use memoization, it should make it faster.
Bottom-up approach is when you start with the smallest value and iteratively use it to calculate up until the target value.
You can use Dynamic Programming every time you have optimal substructure and overlapping subproblems.
Divide and conquer can be summed up as 3 steps: divide (split it in subproblems), conquer (solve the subproblems) and combine (join the subresults to solve the original problem).
When using a pseudorandom number generator you should use a seed of at least 10 digits so it will look truly random.
The Middle Squares Method is very limited, the period is usually very small even if you use a very big seed.
Fermat's Primality Test always has an error chance, but it runs much faster than the traditional test.
BFS is guaranteed to find a solution, if it exists, whereas DFS may be lost in an infinite branch, unless you use the Iterative Deepening DFS.
DFS is not as optimal as BFS but it uses much less memory because it doesn't need to build the entire tree before the search begins.
If your graph is a tree then the only difference between BFS and DFS is that the former uses a queue while the latter uses a stack.

Search This Blog

Computing the sciences: ideas for computer scientists.

Algorithms cheatsheet

Comments

Post a Comment