Week 3 - Maximum Flow and Minimum Cut & String Sorts #
Maximum Flow and Minimum Cut #
Problems to Solve #
Minimum Cut Problem
A cut is a node partition (S, T) such that s is in S and t is in T.
- capacity(S, T) = sum of weights of edges leaving S.
Find an s-t cut of minimum capacity.
An example of not being the minimum:
The correct answer
Max Flow Problem
- Assign flow to edges so as to:
- Maximize flow sent from s to t.
- Equalize inflow and outflow at every intermediate vertex.
- Assign flow to edges so as to:
Maxflow Mincut Theorem #
- Some terminologies
Original graph
- A graph keeps the capacity from each edges and the flow go through them.
Residual graph & Residual edge
- A parallel graph as the original graph which keeps two capacities: the forward capacity(=original capacity - flow); the backward capacity(=flow).
- The backward capacity is used for “undo” flow went through the edge.
- A parallel graph as the original graph which keeps two capacities: the forward capacity(=original capacity - flow); the backward capacity(=flow).
Augmenting paths
- path in residual graph, either:
- increase flow along forward edges.
- decrease flow along backward edges(not empty).
- The decrease on the backward edge is like re-distribute the flow to other place since the new flow we’re trying to create add the same amount.
- In the case above, 5 was deducted and added to the top edge, and pass long to t.
- increase flow along forward edges.
- Key point. Augmenting path in original network is equivalent to directed path in residual network.
- path in residual graph, either:
Relation between Maximum Flow and Minimum Cut #
- Def. The net flow across a cut (A, B) is the sum of the flows on its edges from A to B minus the sum of the flows on its edges from from B to A.
- Theorems:
- Flow-value lemma. Let f be a flow, and let (S, T) be any s-t cut. Then, the net flow sent across the cut is equal to the amount reaching t.
- Cut capacity always >= Flow value
- Weak duality. Let f be any flow and let (S, T) be any cut. Then, the value of the flow ≤ the capacity of the cut.
- Value of flow f = net flow across cut (S, T) ≤ capacity of cut (S, T).
- Flow-value lemma. Let f be a flow, and let (S, T) be any s-t cut. Then, the net flow sent across the cut is equal to the amount reaching t.
Max Flow and Min Cut #
Let f be a flow, and let (S, T) be an s-t cut whose capacity equals the value of f. Then f is a max flow and (S, T) is a min cut.
- Re-phrase:
- A flow f is a maxflow iff no augmenting paths. <– Augmenting Path Theorem.
- And, the value of max flow equals capacity of min cut. <– Max-Flow Min-Cut Theorem (Ford-Fulkerson, 1956).
- Re-phrase:
In my understanding, the capacity of the min-cut (S, T) is the bottleneck of the graph, which means it’s the maximum flow can be passed from S to T. Since the flow is equivalent, it must applies to s and t as well.
- and since max-flow already matched the bottleneck, there will be no new augmenting path can increase the flow to break the bottleneck.
- e.g.
- NOTE: Question: is it possible that the max flow didn’t reach the capacity of all cuts?
- I think it’s possible but it’s still the min cut.
To compute mincut (S, T) from maxflow f :
- By augmenting path theorem, no augmenting paths with respect to f(the maxflow).
- Compute S = set of vertices connected to s by an undirected path with no full forward or empty backward edges.
- TODO: Why? What if there are more than two edges from
s
have no full forward edges? what if none forward edges?
Ford-Fulkerson algorithm #
To find the maximum flow (and min-cut), the algorithm repeatedly finds augmenting paths through the residual graph and augments the flow until no more augmenting paths can be found.
Steps:
while (there exists an augmenting path) { Find augmenting path P Compute bottleneck capacity of P Augment flow along P }
How to find an augmenting path? #
- Assume that all capacities are integers between 1 and U(which is true for lots of applications), the algorithm terminates in at most
V x U
.- The worst case:
- Every interation, it visit every vertices and only increase the flow by 1. So it needs 100 * 2 interations to finish.
- The worst case:
- To Avoid the worst case, we can choose different algorithms for choosing augmenting paths.
- Fewest number of arcs. (shortest path)
- Max bottleneck capacity. (fattest path)
- Shortest augmenting path.
- Easy to implement with BFS.
- Finds augmenting path with fewest number of arcs.
- Facts:
- At most $EV$ total augmenting paths.
- O($E^2V$) running time.
- Fattest augmenting path.
- Finds augmenting path whose bottleneck capacity is maximum.
- Delivers most amount of flow to sink.
- Solve using Dijkstra-style (PFS) algorithm.
- Fact.
- O($E\log{U}$) augmentations if capacities are between 1 and U.
Java API #
Flow Edge API
public class FlowEdge { private final int v, w; // from and to private final double capacity; // capacity private double flow; // flow int FlowEdge(int v, int w, double capacity) { // create a flow edge v→w this.v = v; this.w = w; this.capacity = capacity; } int from() { return v; } // vertex this edge points from int to() { return w; } // vertex this edge points to int other(int v) { return v; } // other endpoint double capacity() { return capacity; }// capacity of this edge double flow() { return flow; }// flow in this edge double residualCapacityTo(int v) { // residual capacity toward v if (vertex == v) return flow; // backward edge else if (vertex == w) return capacity - flow; // forward edge else throw new IllegalArgumentException(); } void addResidualFlowTo(int v, double delta) { // add delta flow toward v if (vertex == v) flow -= delta; else if (vertex == w) flow += delta; else throw new IllegalArgumentException(); } String toString() // string representation }
Flow Network API
public class FlowNetwork { // same as EdgeWeightedGraph, but adjacency lists of FlowEdges instead of Edges private final int V; // number of vertices private Bag<FlowEdge>[] adj; public FlowNetwork(int V) { this.V = V; adj = (Bag<FlowEdge>[]) new Bag[V]; for (int v = 0; v < V; v++) adj[v] = new Bag<FlowEdge>(); } public void addEdge(FlowEdge e) { int v = e.from(); int w = e.to(); // add forward edge add backward edge adj[v].add(e); adj[w].add(e); } public Iterable<FlowEdge> adj(int v) { return adj[v]; } }
Ford-Fulkerson: Java implementation
public class FordFulkerson { private boolean[] marked; // true if s->v path in residual network private FlowEdge[] edgeTo; // last edge on s->v path private double value; // value of flow public FordFulkerson(FlowNetwork G, int s, int t) { value = 0.0; // method hasAugmentingPath generates the shortest path in edgeTo while (hasAugmentingPath(G, s, t)) { double bottle = Double.POSITIVE_INFINITY; // it's a little confused here since we're comparing both // the backward and forward edges all togather. // Think about it this way: // Assume we find the the bottlenect capacity is 1, // then we will decrease the capacity of this backward edge by 1 // and 1 to both its upstream and downstream. // The total flow is increased by 1 as well. for (int v = t; v != s; v = edgeTo[v].other(v)) // compute bottleneck capacity bottle = Math.min(bottle, edgeTo[v].residualCapacityTo(v)); for (int v = t; v != s; v = edgeTo[v].other(v)) // augment flow edgeTo[v].addResidualFlowTo(v, bottle); value += bottle; } } // BFS to find the shortest path in the residual network // BE AWARE this algorithm saves the last edge on path to w, // because it's BFS, we store the last valid path to the target in edgeTo. private boolean hasAugmentingPath(FlowNetwork G, int s, int t) { edgeTo = new FlowEdge[G.V()]; marked = new boolean[G.V()]; Queue<Integer> queue = new Queue<Integer>(); queue.enqueue(s); marked[s] = true; while (!queue.isEmpty()) { int v = queue.dequeue(); for (FlowEdge e : G.adj(v)) { int w = e.other(v); // found path from s to w in the residual network? if (e.residualCapacityTo(w) > 0 && !marked[w]) { edgeTo[w] = e; // save last edge on path to w; marked[w] = true; // mark w; queue.enqueue(w); // add w to the queue } } } return marked[t]; // is t reachable from s in residual network? } public double value() { return value; } // is v reachable from s in residual network? public boolean inCut(int v) { return marked[v]; } }
Additional Resources #
- https://www.cs.princeton.edu/courses/archive/spr04/cos226/lectures/maxflow.4up.pdf
- Note: slides 16 and 17 have a mistake. The flow on edges(s-4 and 4-7) should both be 14.
- https://www.youtube.com/watch?v=LdOnanfc5TM
String Sorts #
Strings in Java #
String Data Type: Sequence of characters (immutable)
StringBuilder data type. Sequence of characters (mutable/thread safe/slower).
String vs. StringBuilder
Quadratic time
public static String reverse(String s) { String rev = ""; for (int i = s.length() - 1; i >= 0; i--) rev += s.charAt(i); return rev; }
Linear time
public static String reverse(String s) { StringBuilder rev = new StringBuilder(); for (int i = s.length() - 1; i >= 0; i--) rev.append(s.charAt(i)); return rev.toString(); }
Alphabets #
- Digital key. Sequence of digits over fixed alphabet.
- Radix. Number of digits R in alphabet.
key-indexed counting #
Assumption. Keys are integers between 0 and R - 1.
Goal. Sort an array a[] of N integers between 0 and R - 1.
Implementation.
int N = a.length; // EXTENDED_ASCII 256 8 extended ASCII characters // R = 256 int[] count = new int[R+1]; // 1. Count frequencies of each letter using (key+1) as index. // e.g. a is 0 and the index is 1. for (int i = 0; i < N; i++) count[a[i]+1]++; // 2. Compute frequency cumulates which specify destinations. // after the accumulation, we will be able to tell things like: // 6 keys < d, 8 keys < e so d’s go in a[6] and a[7] for (int r = 0; r < R; r++) count[r+1] += count[r]; // 3. Access cumulates using key as index to move items. // e.g. a[0] = d, count[d] = 8, so aux[8] = d, then accumulate d's index, so next d can go to 9 // after the loop, all of the indices will be accumulated to their upper bound, which is why you see f and - are both 12. for (int i = 0; i < N; i++) aux[count[a[i]]++] = a[i]; // 4. Copy back into original array. for (int i = 0; i < N; i++) a[i] = aux[i];
- Notice: use a for 0, b for 1, c for 2, d for 3, e for 4, f for 5 for better explanation
Proposition. Key-indexed counting uses ~ 11 N + 4 R array accesses to sort N items whose keys are integers between 0 and R - 1.
LSD radix sort #
- Least-significant-digit-first string sort
- LSD string (radix) sort.
- Consider characters from right to left.
- Stably sort using d^th character as the key (using key-indexed counting).
MSD radix sort #
Most-significant-digit-first string sort
- Partition array into R pieces according to first character (use key-indexed counting).
- Recursively sort all strings that start with each character (key-indexed counts delineate subarrays to sort).
Treat strings as if they had an extra char at end (smaller than any char).
implementation
potential for disastrous performance
- much too slow for small subarrays
- Each recursion needs a new
count[]
array, if using ASCII(256 counts): 100x slower than copy pass forN=2
.
- Each recursion needs a new
- Huge number of small subarrays because of recursion.
- Solution: Cutoff to insertion sort for small subarrays.
- Insertion sort, but start at $d^{th}$ th character.
- Implement less() so that it compares starting at $d^{th}$ character.
- much too slow for small subarrays
3-way radix quicksort #
Do 3-way partitioning on the $d^{th}$ character.
faster than the other two algorithms, but not stable.
Summary of the performance of sorting algorithms #
Suffix Arrays #
- Keyword-in-context search
- Given a text of N characters, preprocess it to enable fast substring search (find all occurrences of query string context).
Suffix sort #
Longest repeated substring
- Given a string of N characters, find the longest repeated substring.
worst-case input: longest repeated substring very long.
- LRS needs at least 1 + 2 + 3 + … + D character compares, where D = length of longest match.
Manber-Myers algorithm #
- Suffix sorting in linearithmic time
- linear: N, linearithmic: a * N
- overview
- Phase 0: sort on first character using key-indexed counting sort.
- Phase i: given array of suffixes sorted on first 2 i - 1 characters, create array of suffixes sorted on first 2 i characters.
- Key process: Constant-time string compare by indexing into inverse
- inverse[]:
- before: {0: 17, 1: 16, 2: 15, …, 9: 1, …, 12: 2, …, 14: 0 }
- after: { 0: 14, 1: 9, 2: 12, …}
- inverse[]: