I read The Algorithm Design Manual by Steven S. Skiena on Apr 11th, 2018

Thanks to SurveyMonkey’s Take 4 policy, I took a 4-week sabbatical to recharge after serving the company for more than 4 years. During my visit to China, I also spent some time honing my problem-solving skills. The journey was mainly guided by The Algorithm Design Manual (the Manual hereafter), with CLRS as a reference.

The Algorithm Design Manual

Introduction to Algorithm Design

An algorithm is designed to solve a problem, thus a clear problem definition acts as the cornerstone of the solution:

After a solution is proposed, we need to prove its correctness, or seek a counter-example to prove its incorrectness:

Induction and recursion are widely used to build the proof chain.

Algorithm Analysis

The Big Oh notation defines bounds on functions:

The master theorem: assume $T(n) = aT(n/b) + f(n)$ for $a \ge 1$ and $b > 1$,

  1. If $f(n) = O(n^{\log_b{a} - \epsilon})$ for some constant $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b{a}})$.
  2. If $f(n) = \Theta(n^{\log_b{a}})$, then $T(n) = \Theta(n^{\log_b{a}}\lg{n})$.
  3. If $f(n) = \Omega(n^{\log_b{a} + \epsilon})$ for some constant $\epsilon > 0$, and if $af(n/b) \le cf(n)$ for some constant $c < 1$ and all sufficiently large $n$, then $T(n) = \Theta(f(n))$.
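
For example, mergesort's recurrence $T(n) = 2T(n/2) + \Theta(n)$ falls under case 2: $n^{\log_2{2}} = n$ matches $f(n) = \Theta(n)$, so $T(n) = \Theta(n\lg{n})$.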

Data Structures

The Manual categorizes data structures into concrete structures, such as arrays, and abstract concepts, such as dictionaries. The concepts expose interfaces to manipulate the internal state; the implementations MAY vary for different use cases.

Array

An array is a contiguously-allocated data structure; its simplicity pays off as:

The amortized time complexity of a dynamic (doubling) array over $n$ insertions is the same $O(n)$ as a single fixed-size array.
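
Below is a minimal sketch of the doubling strategy (my own illustration, not code from the Manual): across $n$ appends, each element is copied only $O(1)$ times on average, so the total stays $O(n)$.

```python
class DynamicArray:
    """Doubling array: n appends cost O(n) in total, O(1) amortized each."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None] * self.capacity

    def append(self, value):
        if self.size == self.capacity:
            # Double the capacity and copy: O(size) now, but it happens
            # so rarely that the cost amortizes to O(1) per append.
            self.capacity *= 2
            new_data = [None] * self.capacity
            new_data[:self.size] = self.data[:self.size]
            self.data = new_data
        self.data[self.size] = value
        self.size += 1
```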

Hashing

The Manual views hashing from an information-theory perspective: it extracts features from high-dimensional data, much like PCA.

Binary Search Tree (BST)

A BST is a very effective data structure for data retrieval if the height of the tree is $O(\log{n})$. The performance MAY degrade if the tree is unbalanced. Maintaining balance during insertion and deletion is not free; many balanced BST variants strike a trade-off between perfect balance and operation cost, notably:

Dictionary

It is tricky to pick the most efficient implementation for specialized use cases:

However, in practice, it is more important to avoid using a bad data structure than to identify the single best option available.

Rules of thumb:

Priority Queue

We can use the following data structures to implement the priority queue:

Set

The Set data structure is similar to the Dictionary, but focuses on set-specific operations, such as:

The implementation might be:

Sorting and Searching

Sorting is the cornerstone for solving many other problems, with a price tag of $O(n\log{n})$.

Most $O(n\log{n})$ sorting algorithms are NOT stable:

The heap packs a binary tree perfectly into an array, and allows us to insert new elements while maintaining the order.
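
Here is a small sketch of that array encoding (identifiers are mine): the parent of items[i] lives at items[(i - 1) // 2], so insertion only needs index arithmetic and a sift-up pass.

```python
class MinHeap:
    def __init__(self):
        self.items = []  # items[(i - 1) // 2] is the parent of items[i]

    def insert(self, key):
        self.items.append(key)      # place at the first free leaf slot
        i = len(self.items) - 1
        while i > 0:                # sift up until the heap order holds
            parent = (i - 1) // 2
            if self.items[parent] <= self.items[i]:
                break
            self.items[parent], self.items[i] = self.items[i], self.items[parent]
            i = parent
```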

Q: Why can a comparison-based sorting algorithm not do better than $O(n\log{n})$?

A: Assume there are no duplicate keys. The sorting algorithm must identify the one and only correct ordering among all $n!$ permutations, and each comparison can at best halve the remaining candidates, so at least $\log_2{n!}$ comparisons are required:

$\log{n!} = \sum_{i=1}^{n} \log{i} = \Theta(n\log{n})$

Dynamic Programming (DP)

Dynamic Programming, aka DP, seeks the global optimum. Unlike the greedy algorithm, which builds the global optimum by accumulating local optima, DP stores the consequences of all possible decisions to achieve the global optimum.

Following CLRS, DP is typically implemented as:

  1. Characterize the structure of an optimal solution.
  2. Recursively define the value of an optimal solution.
  3. Compute the value of an optimal solution, typically in a bottom-up fashion.
  4. Construct an optimal solution from computed information.

In practice, I find the top-down approach with memoization more intuitive, but the top-down approach requires $O(n)$ extra space to store the intermediate results and invokes $O(n)$ recursive calls. It MAY cause a stack overflow and incurs more call overhead. See my deep dive here.
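
A toy sketch of the trade-off, using Fibonacci as a stand-in problem (not the example from my deep dive): the top-down version leans on a cache and the call stack, while the bottom-up version fills a table iteratively.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_top_down(n):
    # O(n) extra space for the cache, plus O(n) recursion depth.
    if n < 2:
        return n
    return fib_top_down(n - 1) + fib_top_down(n - 2)

def fib_bottom_up(n):
    # Same O(n) table, but no recursion, so no stack-overflow risk.
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]
```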

Backtracking

Backtracking tries to find all (or some) solutions by exploring the configuration space with a depth-first search. It constructs a tree of partial solutions, incrementally extends a candidate towards a full solution, and abandons (prunes) a candidate as soon as it cannot possibly be completed to a valid solution.

The general steps for backtracking are (see the sketch after this list):

  1. Iterate over all candidates; make one move to integrate a candidate into the partial solution.
  2. Recursively invoke the backtracking algorithm until a solution is found.
  3. Otherwise, roll back the last move and try another candidate.
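
As a sketch of these steps (my own example, not the Manual's code), here is backtracking that enumerates permutations: each recursive call extends the partial solution by one move and rolls it back afterwards.

```python
def permutations(items):
    """Enumerate all permutations of items by backtracking."""
    solutions, partial = [], []

    def backtrack():
        if len(partial) == len(items):    # a complete solution is reached
            solutions.append(partial[:])
            return
        for candidate in items:
            if candidate in partial:      # prune: candidate already used
                continue
            partial.append(candidate)     # make one move
            backtrack()                   # extend recursively
            partial.pop()                 # roll back, try the next candidate

    backtrack()
    return solutions

# permutations([1, 2, 3]) yields all 6 orderings.
```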

We should pay attention to the following details:

Graph

A graph $G = (V, E)$ consists of a set of vertices $V$ together with a set of edges $E$. It is natural to model real-world relationships with graphs, and graph theory also helps explore the solution space implicitly. In practice, it is more important to recognize and reduce an unknown problem to a known graph algorithm.

The data structures to store a graph are:

Traversing a Graph

Traversing a graph is particularly interesting, as some complicated problems can be modeled as whether there exists a path from the initial state to the expected state.

Breadth-first search, aka BFS, takes the following steps to traverse a graph (a sketch follows the list):

  1. Initialize the queue Q: pick a vertex v, mark it discovered, and enqueue it to Q.
  2. Dequeue a vertex v from Q, process it, then mark and enqueue every adjacent vertex that has not yet been discovered.
  3. Repeat step 2 until Q is empty.
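
A minimal sketch of these steps, assuming the graph is a dict mapping each vertex to its adjacency list:

```python
from collections import deque

def bfs(graph, start, process=print):
    discovered = {start}
    queue = deque([start])              # step 1: seed the queue
    while queue:                        # step 3: until the queue drains
        v = queue.popleft()             # step 2: dequeue and process
        process(v)
        for x in graph[v]:
            if x not in discovered:     # enqueue undiscovered neighbors
                discovered.add(x)
                queue.append(x)
```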

Depth-first search, aka DFS, takes the following steps (sketched after the list):

  1. Given a vertex v, iterate over all its adjacent vertices x.
  2. For each unvisited x, mark it visited and recursively invoke DFS on x.
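
The same adjacency-list assumption gives a compact recursive sketch:

```python
def dfs(graph, v, visited=None, process=print):
    if visited is None:
        visited = set()
    visited.add(v)
    process(v)
    for x in graph[v]:                      # iterate adjacent vertices
        if x not in visited:
            dfs(graph, x, visited, process) # recurse on each new vertex
```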

In practice, DFS is preferred if the ancestor-descendant relationship matters. BFS MAY require more memory (see backtracking), while DFS incurs more recursion overhead.

Graph Problems: Polynomial-Time

We study the characteristics of the graph itself to better understand the network, e.g., city traffic. Here are some examples:

Connectivity

Are the vertices in the graph connected? If not, how are they clustered into components? The following steps label each component (see the sketch after the list):

  1. Mark all vertices with component = 0; initialize i = 1.
  2. Pick a vertex v with component == 0, and mark all vertices connected to it with component = i.
  3. Bump the value of i and redo step 2 until no vertex with component == 0 remains.
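
A sketch of the labeling, assuming the same dict-of-adjacency-lists representation; the inner flood fill stands in for "mark all its connected vertices":

```python
def connected_components(graph):
    """Label each vertex with a component id, following the steps above."""
    component = {v: 0 for v in graph}        # step 1: all unlabeled
    i = 0
    for v in graph:
        if component[v] == 0:                # step 2: pick an unlabeled vertex
            i += 1                           # step 3: bump the component id
            stack = [v]
            while stack:                     # flood fill via iterative DFS
                u = stack.pop()
                if component[u] == 0:
                    component[u] = i
                    stack.extend(graph[u])
    return component
```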

Does the graph have an articulation vertex, i.e., a single point of failure?

We can brute-force the problem by removing each vertex from the graph and checking the graph’s connectivity, or we can use the following linear-time algorithm.

  1. Traverse the graph with DFS to construct the DFS tree.
  2. If the root has two or more children, the root is an articulation vertex.
  3. If a non-root vertex u has a child v such that no vertex in v’s subtree has a back edge climbing above u, then u is an articulation vertex.

To validate the non-root case, we keep an access counter to track the ancestor (lower) and descendant (higher) relationship during the pre-order traversal, and an extra low value to track the lowest access counter reachable from v through a back edge during the post-order traversal. The condition is then interpreted as v.low >= u.access_counter. The pre-order and post-order traversals can be combined into a single DFS, as the sketch below shows.
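
Here is a sketch of that combined traversal (the names entry and low are mine): entry times are assigned pre-order, low values propagate back post-order, and both cases are checked along the way.

```python
def articulation_vertices(graph, root):
    entry, low, parent = {}, {}, {root: None}
    articulation, timer = set(), [0]

    def dfs(u):
        timer[0] += 1
        entry[u] = low[u] = timer[0]            # pre-order: record entry time
        children = 0
        for v in graph[u]:
            if v not in entry:                  # tree edge: recurse
                parent[v] = u
                children += 1
                dfs(v)
                low[u] = min(low[u], low[v])    # post-order: propagate low
                # Non-root case: no back edge from v's subtree climbs above u.
                if parent[u] is not None and low[v] >= entry[u]:
                    articulation.add(u)
            elif v != parent[u]:                # back edge
                low[u] = min(low[u], entry[v])
        if parent[u] is None and children >= 2: # root case: two or more children
            articulation.add(u)

    dfs(root)
    return articulation
```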

Minimum Spanning Trees

The minimum spanning tree connects all vertices in the most inexpensive way, and is used in network routing. Prim’s algorithm is greedy (see the sketch after the list):

  1. Pick an arbitrary vertex v as the starting point of the tree.
  2. Pick the shortest edge between a tree vertex and a non-tree vertex.
  3. Add the selected edge and vertex to the tree. Repeat step 2 until all vertices are included in the tree.
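
A sketch of Prim's algorithm with a lazy heap of crossing edges, assuming graph maps each vertex to a list of (weight, neighbor) pairs:

```python
import heapq

def prim(graph, start):
    in_tree = {start}
    frontier = [(w, start, v) for w, v in graph[start]]
    heapq.heapify(frontier)                  # shortest crossing edge on top
    tree = []
    while frontier:
        w, u, v = heapq.heappop(frontier)    # step 2: shortest tree/non-tree edge
        if v in in_tree:
            continue                         # both ends already in the tree
        in_tree.add(v)                       # step 3: grow the tree
        tree.append((u, v, w))
        for w2, x in graph[v]:
            if x not in in_tree:
                heapq.heappush(frontier, (w2, v, x))
    return tree
```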

So is Kruskal’s algorithm (see the sketch after the list):

  1. Sort the edges in ascending order of weight, then pick $n - 1$ of them.
  2. For each edge, if its two endpoints do not belong to the same component, add the edge to the tree and merge the components.
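
A sketch of Kruskal's algorithm with a minimal union-find to track components (edges given as (weight, u, v) tuples):

```python
def kruskal(vertices, edges):
    parent = {v: v for v in vertices}     # union-find over the components

    def find(v):                          # path-halving find
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for w, u, v in sorted(edges):         # step 1: ascending by weight
        ru, rv = find(u), find(v)
        if ru != rv:                      # step 2: different components?
            parent[ru] = rv               # merge the components
            tree.append((u, v, w))
        if len(tree) == len(parent) - 1:  # n - 1 edges picked: done
            break
    return tree
```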

Shortest Path

If there exists more than one path from point A to point B, what is the shortest path? Dijkstra’s shortest-path algorithm works in a dynamic-programming fashion:

Here is the algorithm (see the sketch after the list):

  1. For every unvisited vertex x adjacent to a visited vertex v, relax the distance: dist[x] = min(dist[x], dist[v] + w[v, x]).
  2. Add the unvisited vertex x with the smallest dist[x] to the visited vertices.
  3. Repeat the above steps until we reach t.
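
A heap-based sketch of these steps, under the same (weight, neighbor) adjacency assumption; the heap picks the unvisited vertex with the smallest tentative distance:

```python
import heapq

def dijkstra(graph, s, t):
    dist = {s: 0}
    visited = set()
    heap = [(0, s)]
    while heap:
        d, v = heapq.heappop(heap)
        if v in visited:
            continue
        visited.add(v)                    # step 2: v's distance is now final
        if v == t:                        # step 3: stop once t is reached
            return d
        for w, x in graph[v]:
            if x not in visited and d + w < dist.get(x, float("inf")):
                dist[x] = d + w           # step 1: relax dist[x] through v
                heapq.heappush(heap, (dist[x], x))
    return dist.get(t, float("inf"))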

Graph Problems: Hard Problems

These problems look intuitive because human beings digest graphs in a different way, but they are NP-complete. We may deal with them using combinatorial search, heuristics, approximation algorithms, or algorithms for restricted instances.

Numerical Problems

Random Number Generation

We should avoid messing up random number generation in general. A classic pseudo-random algorithm is the linear congruential generator (sketched below):

$R_n = (\alpha R_{n-1} + c) \bmod m$
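
A sketch of the generator; the constants are the ones used by the classic C library LCG and are only for illustration:

```python
def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Yield R_n = (a * R_{n-1} + c) mod m; a plays the role of alpha."""
    r = seed
    while True:
        r = (a * r + c) % m
        yield r

gen = lcg(seed=42)
print([next(gen) for _ in range(3)])  # the same seed always replays this sequence
```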

Combinatorial Problems

Combinatorial problems deal with combinatorial objects, such as permutations, partitions, subsets, calendars, and schedules.

Median and Selection

Sorting and searching have been discussed above. The median and selection problems can be reduced to sorting the unordered sequence if $O(n\log{n})$ is acceptable. Otherwise, we can apply an algorithm similar to quicksort:

The expected complexity of the above algorithm is $O(n)$, as sketched below.
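
A sketch of that quicksort-like selection (often called quickselect): partition around a random pivot, then recurse into the single side that contains the k-th element.

```python
import random

def quickselect(items, k):
    """Return the k-th smallest element (0-based) in expected O(n) time."""
    pivot = random.choice(items)               # randomized pivot, as in quicksort
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    if k < len(smaller):                       # recurse into one side only
        return quickselect(smaller, k)
    if k < len(smaller) + len(equal):
        return pivot
    return quickselect(larger, k - len(smaller) - len(equal))

# Median of an odd-length list: quickselect(xs, len(xs) // 2)
```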

For a data stream, it is impossible to store the massive data to find the exact median. We can sample the data by flipping a coin, and the sample will likely be close to the underlying data.
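
A sketch of the coin-flip idea (the sampling probability p is an arbitrary choice of mine): keep each element with probability p, then report the sample's median as an estimate.

```python
import random

def sample_median(stream, p=0.01):
    """Keep each element with probability p (the coin flip), then
    return the sample's median as an estimate of the stream's median."""
    sample = [x for x in stream if random.random() < p]
    sample.sort()
    return sample[len(sample) // 2] if sample else None
```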

Set and String Problems

We will discuss Suffix trees in the future.

Computational Geometry

These problems arose from computer graphics and CAD / CAM, such as:

Probabilistic Programming

TBD