Week 3 - Maximum Flow and Minimum Cut & String Sorts

Week 3 - Maximum Flow and Minimum Cut & String Sorts #

Maximum Flow and Minimum Cut #

Problems to Solve #

  • Minimum Cut Problem

    • A cut is a node partition (S, T) such that s is in S and t is in T.

      • capacity(S, T) = sum of weights of edges leaving S.
    • Find an s-t cut of minimum capacity.

    • An example of not being the minimum:

    • The correct answer

  • Max Flow Problem

    • Assign flow to edges so as to:
      • Maximize flow sent from s to t.
      • Equalize inflow and outflow at every intermediate vertex.

Maxflow Mincut Theorem #

  • Some terminologies
    • Original graph

      • A graph keeps the capacity from each edges and the flow go through them.
    • Residual graph & Residual edge

      • A parallel graph as the original graph which keeps two capacities: the forward capacity(=original capacity - flow); the backward capacity(=flow).
        • The backward capacity is used for “undo” flow went through the edge.
    • Augmenting paths

      • path in residual graph, either:
        • increase flow along forward edges.
        • decrease flow along backward edges(not empty).
          • The decrease on the backward edge is like re-distribute the flow to other place since the new flow we’re trying to create add the same amount.
          • In the case above, 5 was deducted and added to the top edge, and pass long to t.
      • Key point. Augmenting path in original network is equivalent to directed path in residual network.

Relation between Maximum Flow and Minimum Cut #

  • Def. The net flow across a cut (A, B) is the sum of the flows on its edges from A to B minus the sum of the flows on its edges from from B to A.
  • Theorems:
    1. Flow-value lemma. Let f be a flow, and let (S, T) be any s-t cut. Then, the net flow sent across the cut is equal to the amount reaching t.
      • Cut capacity always >= Flow value
    2. Weak duality. Let f be any flow and let (S, T) be any cut. Then, the value of the flow ≤ the capacity of the cut.
      • Value of flow f = net flow across cut (S, T) ≤ capacity of cut (S, T).

Max Flow and Min Cut #

  • Let f be a flow, and let (S, T) be an s-t cut whose capacity equals the value of f. Then f is a max flow and (S, T) is a min cut.

    • Re-phrase:
      • A flow f is a maxflow iff no augmenting paths. <– Augmenting Path Theorem.
      • And, the value of max flow equals capacity of min cut. <– Max-Flow Min-Cut Theorem (Ford-Fulkerson, 1956).
  • In my understanding, the capacity of the min-cut (S, T) is the bottleneck of the graph, which means it’s the maximum flow can be passed from S to T. Since the flow is equivalent, it must applies to s and t as well.

    • and since max-flow already matched the bottleneck, there will be no new augmenting path can increase the flow to break the bottleneck.
    • e.g.
    • NOTE: Question: is it possible that the max flow didn’t reach the capacity of all cuts?
      • I think it’s possible but it’s still the min cut.
  • To compute mincut (S, T) from maxflow f :

    • By augmenting path theorem, no augmenting paths with respect to f(the maxflow).
    • Compute S = set of vertices connected to s by an undirected path with no full forward or empty backward edges.
    • TODO: Why? What if there are more than two edges from s have no full forward edges? what if none forward edges?

Ford-Fulkerson algorithm #

To find the maximum flow (and min-cut), the algorithm repeatedly finds augmenting paths through the residual graph and augments the flow until no more augmenting paths can be found.

  • Steps:

    while (there exists an augmenting path) { 
        Find augmenting path P 
        Compute bottleneck capacity of P 
        Augment flow along P 
    }
    

How to find an augmenting path? #

  • Assume that all capacities are integers between 1 and U(which is true for lots of applications), the algorithm terminates in at most V x U.
    • The worst case:
    • Every interation, it visit every vertices and only increase the flow by 1. So it needs 100 * 2 interations to finish.
  • To Avoid the worst case, we can choose different algorithms for choosing augmenting paths.
    1. Fewest number of arcs. (shortest path)
    2. Max bottleneck capacity. (fattest path)
  • Shortest augmenting path.
    • Easy to implement with BFS.
    • Finds augmenting path with fewest number of arcs.
    • Facts:
      • At most $EV$ total augmenting paths.
      • O($E^2V$) running time.
  • Fattest augmenting path.
    • Finds augmenting path whose bottleneck capacity is maximum.
    • Delivers most amount of flow to sink.
    • Solve using Dijkstra-style (PFS) algorithm.
    • Fact.
      • O($E\log{U}$) augmentations if capacities are between 1 and U.

Java API #

  • Flow Edge API

    public class FlowEdge {
        private final int v, w; // from and to 
        private final double capacity; // capacity 
        private double flow; // flow
    
        int FlowEdge(int v, int w, double capacity) { // create a flow edge v→w
            this.v = v;
            this.w = w;
            this.capacity = capacity;
        }
        int from() { return v; } // vertex this edge points from
        int to() { return w; } // vertex this edge points to
        int other(int v) { return v; } // other endpoint
        double capacity() { return capacity; }// capacity of this edge 
        double flow() { return flow; }// flow in this edge 
    
        double residualCapacityTo(int v) { // residual capacity toward v 
            if (vertex == v) return flow; // backward edge
            else if (vertex == w) return capacity - flow; // forward edge
            else throw new IllegalArgumentException(); 
        }
    
        void addResidualFlowTo(int v, double delta) { // add delta flow toward v 
            if (vertex == v) flow -= delta;
            else if (vertex == w) flow += delta;
            else throw new IllegalArgumentException();
        }
    
        String toString() // string representation
    }
    
  • Flow Network API

    public class FlowNetwork {
        // same as EdgeWeightedGraph, but adjacency lists of FlowEdges instead of Edges
        private final int V; // number of vertices
        private Bag<FlowEdge>[] adj;
    
        public FlowNetwork(int V) { 
            this.V = V; 
            adj = (Bag<FlowEdge>[]) new Bag[V]; 
            for (int v = 0; v < V; v++) 
                adj[v] = new Bag<FlowEdge>(); 
        }
    
        public void addEdge(FlowEdge e) { 
            int v = e.from(); 
            int w = e.to(); 
            //  add forward edge add backward edge
            adj[v].add(e); 
            adj[w].add(e); 
        }
    
        public Iterable<FlowEdge> adj(int v) { 
            return adj[v]; 
        }
    }
    
  • Ford-Fulkerson: Java implementation

    public class FordFulkerson {
        private boolean[] marked; // true if s->v path in residual network
        private FlowEdge[] edgeTo; // last edge on s->v path
        private double value; // value of flow
    
        public FordFulkerson(FlowNetwork G, int s, int t) {
            value = 0.0; 
    
            // method hasAugmentingPath generates the shortest path in edgeTo
            while (hasAugmentingPath(G, s, t)) { 
                double bottle = Double.POSITIVE_INFINITY;
                // it's a little confused here since we're comparing both 
                // the backward and forward edges all togather.
                // Think about it this way:
                //    Assume we find the the bottlenect capacity is 1, 
                //    then we will decrease the capacity of this backward edge by 1 
                //    and 1 to both its upstream and downstream. 
                //    The total flow is increased by 1 as well.
                for (int v = t; v != s; v = edgeTo[v].other(v)) // compute bottleneck capacity
                    bottle = Math.min(bottle, edgeTo[v].residualCapacityTo(v));
    
                for (int v = t; v != s; v = edgeTo[v].other(v)) // augment flow
                    edgeTo[v].addResidualFlowTo(v, bottle);
    
                value += bottle;
            }
        }
    
        // BFS to find the shortest path in the residual network
        // BE AWARE this algorithm saves the last edge on path to w,
        // because it's BFS, we store the last valid path to the target in edgeTo.
        private boolean hasAugmentingPath(FlowNetwork G, int s, int t) {
            edgeTo = new FlowEdge[G.V()]; 
            marked = new boolean[G.V()];
    
            Queue<Integer> queue = new Queue<Integer>(); 
            queue.enqueue(s); 
            marked[s] = true; 
            while (!queue.isEmpty()) { 
                int v = queue.dequeue();
                for (FlowEdge e : G.adj(v)) { 
                    int w = e.other(v); 
                    // found path from s to w in the residual network?
                    if (e.residualCapacityTo(w) > 0 && !marked[w]) { 
                        edgeTo[w] = e; // save last edge on path to w; 
                        marked[w] = true; // mark w; 
                        queue.enqueue(w); // add w to the queue 
                    }
                }
            }
            return marked[t]; // is t reachable from s in residual network?
        }
    
        public double value() { return value; }
    
        // is v reachable from s in residual network?
        public boolean inCut(int v) { return marked[v]; }
    }
    

Additional Resources #

String Sorts #

Strings in Java #

  • String Data Type: Sequence of characters (immutable)

  • StringBuilder data type. Sequence of characters (mutable/thread safe/slower).

  • String vs. StringBuilder

    • Quadratic time

      public static String reverse(String s) { 
          String rev = ""; 
          for (int i = s.length() - 1; i >= 0; i--) 
              rev += s.charAt(i); 
          return rev; 
      }
      
    • Linear time

      public static String reverse(String s) { 
          StringBuilder rev = new StringBuilder(); 
          for (int i = s.length() - 1; i >= 0; i--) 
              rev.append(s.charAt(i)); 
          return rev.toString(); 
      }
      

Alphabets #

  • Digital key. Sequence of digits over fixed alphabet.
  • Radix. Number of digits R in alphabet.

key-indexed counting #

  • Assumption. Keys are integers between 0 and R - 1.

  • Goal. Sort an array a[] of N integers between 0 and R - 1.

  • Implementation.

    int N = a.length; 
    // EXTENDED_ASCII 256 8 extended ASCII characters
    // R = 256
    int[] count = new int[R+1];
    
    // 1. Count frequencies of each letter using (key+1) as index.
    // e.g. a is 0 and the index is 1.
    for (int i = 0; i < N; i++) 
        count[a[i]+1]++;
    
    // 2. Compute frequency cumulates which specify destinations.
    // after the accumulation, we will be able to tell things like:
    //     6 keys < d, 8 keys < e so d’s go in a[6] and a[7]
    for (int r = 0; r < R; r++) 
        count[r+1] += count[r];
    
    // 3. Access cumulates using key as index to move items.
    // e.g. a[0] = d, count[d] = 8, so aux[8] = d, then accumulate d's index, so next d can go to 9
    // after the loop, all of the indices will be accumulated to their upper bound, which is why you see f and - are both 12.
    for (int i = 0; i < N; i++) 
        aux[count[a[i]]++] = a[i];
    
    // 4. Copy back into original array.
    for (int i = 0; i < N; i++) 
        a[i] = aux[i];
    
    • Notice: use a for 0, b for 1, c for 2, d for 3, e for 4, f for 5 for better explanation
  • Proposition. Key-indexed counting uses ~ 11 N + 4 R array accesses to sort N items whose keys are integers between 0 and R - 1.

LSD radix sort #

  • Least-significant-digit-first string sort
  • LSD string (radix) sort.
    • Consider characters from right to left.
    • Stably sort using d^th character as the key (using key-indexed counting).

MSD radix sort #

  • Most-significant-digit-first string sort

    • Partition array into R pieces according to first character (use key-indexed counting).
    • Recursively sort all strings that start with each character (key-indexed counts delineate subarrays to sort).
  • Treat strings as if they had an extra char at end (smaller than any char).

  • implementation

  • potential for disastrous performance

    1. much too slow for small subarrays
      1. Each recursion needs a new count[] array, if using ASCII(256 counts): 100x slower than copy pass for N=2.
    2. Huge number of small subarrays because of recursion.
    • Solution: Cutoff to insertion sort for small subarrays.
      • Insertion sort, but start at $d^{th}$ th character.
      • Implement less() so that it compares starting at $d^{th}$ character.

3-way radix quicksort #

  • Do 3-way partitioning on the $d^{th}$ character.

  • faster than the other two algorithms, but not stable.

Summary of the performance of sorting algorithms #

Suffix Arrays #

  • Keyword-in-context search
    • Given a text of N characters, preprocess it to enable fast substring search (find all occurrences of query string context).

Suffix sort #

  • Longest repeated substring

    • Given a string of N characters, find the longest repeated substring.
  • worst-case input: longest repeated substring very long.

    • LRS needs at least 1 + 2 + 3 + … + D character compares, where D = length of longest match.

Manber-Myers algorithm #

  • Suffix sorting in linearithmic time
    • linear: N, linearithmic: a * N
  • overview
    • Phase 0: sort on first character using key-indexed counting sort.
    • Phase i: given array of suffixes sorted on first 2 i - 1 characters, create array of suffixes sorted on first 2 i characters.
  • Key process: Constant-time string compare by indexing into inverse
    • inverse[]:
      • before: {0: 17, 1: 16, 2: 15, …, 9: 1, …, 12: 2, …, 14: 0 }
      • after: { 0: 14, 1: 9, 2: 12, …}