Advanced Algorithm

Introduction: notes for the Advanced Algorithm course, recording the key points of each chapter

Advanced Algorithm

Course preview

Lecture schedule

Summary

Four old problems

  • primality testing: determine if a given integer is prime
  • dictionary problem: design an efficient data structure that supports find, insert, and delete operations
  • minimum spanning trees
  • string matching: find all occurrences of a pattern string p in a text string t

Two new tools

  • randomized algorithms: have access to a random source
  • amortized analysis: worst-case analysis of a sequence of operations

Primality Testing

Problem definition

brute-force solution

Q & A:

  1. why the loop is from 2 to √n
    A prime number is not divisible by any number other than itself and 1, so we start checking from 2. The loop looks for a factor pair p · q = n = √n · √n; the smaller of p and q must be smaller than or equal to √n, so iterating from 2 to √n covers all possibilities.

  2. why √N = 2^(lg(N)/2)
    √N = 2^(lg(√N)) = 2^(lg(N^(1/2))) = 2^((1/2)·lg(N))

  3. why (√2)^(lg N) = (√2)^n
    n is the number of binary digits (bits) of N, and n = ⌊lg N⌋ + 1:

N    ⌊lg N⌋ + 1    n
1    ⌊lg 1⌋ + 1 = 1    1
2    ⌊lg 2⌋ + 1 = 2    2
3    ⌊lg 3⌋ + 1 = 2    2
4    ⌊lg 4⌋ + 1 = 3    3
5    ⌊lg 5⌋ + 1 = 3    3

And for N large enough we treat ⌊lg N⌋ + 1 as lg N, dismissing the +1.

  4. why this algorithm is exponential
    The running time is θ((√2)^n), where n is the number of bits of N, and here n is the input size. Therefore the running time is exponential in the input size.
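
As a concrete version of this brute-force test, here is a minimal trial-division sketch (the function name is illustrative): it performs Θ(√N) divisions, which is Θ((√2)^n) in the bit length n.

    // Brute-force primality test by trial division, as analyzed above:
    // try every candidate divisor d with 2 <= d <= sqrt(N).
    bool is_prime(unsigned long long N) {
        if (N < 2) return false;
        for (unsigned long long d = 2; d * d <= N; ++d)  // d <= sqrt(N)
            if (N % d == 0) return false;  // found a factor pair d * (N / d)
        return true;
    }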

Dictionary Problem

Problem definition

O(1) Time & O(n) Space solution

BST solution


note
BST: binary search tree; quite slow in reality.
2-3 Tree: a tree whose nodes can have two or three child nodes. Hard to implement.

New solution

  • splay tree: amortized BST
    When a node x is accessed, a splay operation is performed on x to move it to the root. To perform a splay operation we carry out a sequence of splay steps, each of which moves x closer to the root. By performing a splay operation on the node of interest after every access, the recently accessed nodes are kept near the root and the tree remains roughly balanced, so that we achieve the desired amortized time bounds.
  • treaps: randomized BST
    Treap = Tree + Heap. A treap is itself a binary search tree, and its left and right subtrees are also treaps. Unlike an ordinary binary search tree, a treap records one extra piece of data per node: a priority. The keys form a binary search tree while the priorities satisfy the heap property. Treaps maintain the heap property using rotations, needing only two kinds of rotations, and are less complicated to program than splay trees.
  • skip lists: randomized linked lists
    an ordinary linked list: insert & delete in O(1), find in O(n)
    A skip list trades space for time,

    using a binary-search-like strategy to improve search efficiency.
    The bottom list is the original linked list; the middle one divides the original list into four parts; the top one divides it into two parts. Suppose we want to search for 7.
    The initial search range is (H, T):
  1. in the top list, 7 > 4, so the search range becomes (4, T)
  2. in the middle list, 7 > 6, so the search range becomes (6, 7)
  3. in the bottom list, 7 == 7, search succeeds!

In general, a skip list stores extra intermediate levels to support binary-search-style lookups
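
A minimal find() sketch over such a multi-level list; the node layout here (one forward pointer per level) is an assumption for illustration, not necessarily the lecture's exact structure.

    #include <vector>

    // Each node stores its key and one forward pointer per level it appears
    // in; the head node appears in every level.
    struct SkipNode {
        int key;
        std::vector<SkipNode*> next;  // next[i] = successor at level i
    };

    // Start at the top level and drop one level whenever the next key would
    // overshoot the target, exactly like the (H, T) range narrowing above.
    SkipNode* find(SkipNode* head, int levels, int target) {
        SkipNode* x = head;
        for (int i = levels - 1; i >= 0; --i)
            while (x->next[i] != nullptr && x->next[i]->key <= target)
                x = x->next[i];  // advance while we stay <= target
        return (x != head && x->key == target) ? x : nullptr;
    }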

Minimum Spanning Trees

Problem definition


connect all nodes, no cycle, minimum weight

Kruskal algorithm

Algorithm introduction

Sort the edges by weight in ascending order, then iterate over every edge; if the edge's endpoints are in different groups, use this edge and let one group join the other.
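
A hedged sketch of this procedure (names are illustrative); the groups are tracked with a bare parent array, which the Disjoint Sets chapter later refines.

    #include <algorithm>
    #include <numeric>
    #include <vector>

    struct Edge { int u, v, w; };

    int find_group(std::vector<int>& p, int x) {
        while (p[x] != x) x = p[x];        // follow parents to the group root
        return x;
    }

    int kruskal_mst_weight(int n, std::vector<Edge> edges) {
        std::sort(edges.begin(), edges.end(),
                  [](const Edge& a, const Edge& b) { return a.w < b.w; });
        std::vector<int> p(n);
        std::iota(p.begin(), p.end(), 0);  // every node starts in its own group
        int total = 0;
        for (const Edge& e : edges) {
            int ru = find_group(p, e.u), rv = find_group(p, e.v);
            if (ru != rv) {                // endpoints in different groups:
                p[ru] = rv;                // use the edge, one group joins the other
                total += e.w;
            }
        }
        return total;
    }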

Prim algorithm

Algorithm introduction

  1. Add one node to the set V

  2. while (not all nodes are in set V) {

    • find the least-weight edge from a node in the set V to a node outside the set
    • then add this edge and the outside node to the set V
      }
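
A sketch of this loop using a min-heap of candidate edges, assuming an adjacency list of (weight, neighbor) pairs and a connected graph.

    #include <queue>
    #include <utility>
    #include <vector>

    // adj[u] holds (weight, neighbor) pairs; returns the total MST weight.
    int prim_mst_weight(const std::vector<std::vector<std::pair<int, int>>>& adj) {
        int n = (int)adj.size(), taken = 0, total = 0;
        std::vector<bool> inV(n, false);
        std::priority_queue<std::pair<int, int>,
                            std::vector<std::pair<int, int>>,
                            std::greater<>> pq;   // (weight, node), smallest on top
        pq.push({0, 0});                          // step 1: seed V with node 0
        while (taken < n && !pq.empty()) {        // step 2: until all nodes are in V
            auto [w, u] = pq.top();
            pq.pop();
            if (inV[u]) continue;                 // stale entry: u is already in V
            inV[u] = true; ++taken; total += w;   // least-weight edge leaving V
            for (auto [wt, v] : adj[u])
                if (!inV[v]) pq.push({wt, v});
        }
        return total;
    }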

Disjoint set

alias: Union-find

Fibonacci heap

A Fibonacci heap is a collection of trees satisfying the minimum-heap property, that is, the key of a child is always greater than or equal to the key of the parent. This implies that the minimum key is always at the root of one of the trees.

String Matching

Problem definition

Classic solutions

  • variations of the brute-force algorithm: θ(mn);
  • KMP (Knuth-Morris-Pratt) algorithm: the first θ(m + n) solution;
  • Boyer-Moore: the fastest in practice;
  • Karp-Rabin: runs in expected θ(n + m);
  • Z algorithm: a preprocessing technique that underlies both KMP and BM;
  • these algorithms preprocess the pattern;

Generating Random Sequences Of Integers

Desirable properties of rand(n)

  • not truly random, but only appears random according to various statistical tests (such as an equal number of each of 0, …, n − 1 in a long generated sequence)
  • efficiently computable and deterministic (reproducible)

Computing π using rand()

Implementation

Sample run
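
The implementation and sample run above are from the slides; the following is only a minimal Monte Carlo sketch of the same idea: sample random points in the unit square and compare the quarter-circle hit ratio with π/4.

    #include <cstdlib>

    // Throw `trials` random points into the unit square; the fraction landing
    // inside the quarter circle x^2 + y^2 <= 1 approximates pi / 4.
    double estimate_pi(int trials) {
        int hits = 0;
        for (int i = 0; i < trials; ++i) {
            double x = rand() / (double)RAND_MAX;
            double y = rand() / (double)RAND_MAX;
            if (x * x + y * y <= 1.0) ++hits;
        }
        return 4.0 * hits / trials;
    }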

Randomized Algorithm

Review of probability theory

Random Variables

Expected Value

Linearity Of Expected Values

Indicator Random Variables

Example 1

Example 2

Simpler Solution using linearity of expected values

Binomial Random Variable Bn,p

An Infinite Sum

note:
since 1/(1−a) = 1 + a + a² + a³ + …
(1/(1−a))² = (1 + a + a² + a³ + …)²
= (1 + a + a² + a³ + …)
+ (0 + a + a² + a³ + …)
+ (0 + 0 + a² + a³ + …)
+ (0 + 0 + 0 + a³ + …)
+ …
= 1 + 2a + 3a² + 4a³ + …
= ∑(i=1 to ∞) i·a^(i−1)

Geometric random variables Gp

Q&A
Why does p·∑ i(1−p)^(i−1) = p / (1 − (1−p))²?
Because of the theorem ∑ i·a^(i−1) = 1/(1−a)² for |a| < 1, applied with a = 1 − p.

Harmonic Numbers

Q&A:

  1. Why Hn ≥ ln(n+1)
    Use the upper rectangles over the curve y = 1/x on [1, n]:
    area = 1 + 1/2 + 1/3 + … + 1/(n-1) = Hn-1
    and ln(n) is the area under the curve, since (ln x)' = 1/x.
    Therefore Hn-1 ≥ ln(n); replacing n with n+1 gives Hn ≥ ln(n+1).
  2. Why Hn ≤ ln(n) + 1
    Use the lower rectangles under the same curve:
    area = 1/2 + 1/3 + … + 1/n = Hn - 1
    which is at most the area ln(n) under the curve.
    Therefore Hn - 1 ≤ ln(n), i.e. Hn ≤ ln(n) + 1.

Randomized Searching

Problem definition

Deterministic Solution

Solution Analysis

Randomized Solution

Solution Analysis

Average time

T is the number of iterations
when target is in array

when target is not in array

Example-Coupon Collection

Randomized Sorting

Problem definition

Deterministic Quicksort

Solution Analysis


Average case
If the partitions are even, the depth of the recursion tree is log(n) + 2.

T(n) ≤ 2T(n/2) + n, T(1) = 0  (all elements must be compared with the pivot to find its position)
T(n) ≤ 2(2T(n/4) + n/2) + n = 4T(n/4) + 2n
T(n) ≤ 4(2T(n/8) + n/4) + 2n = 8T(n/8) + 3n
……
T(n) ≤ nT(1) + (log₂n)·n = O(nlogn)  (each level makes n comparisons in total, and there are logn levels)

therefore, the running time is O(nlogn).

Worst case
If the input array is fully ascending or descending, every partition removes just one element, that is, its left or right side is empty. So n iterations are needed, and in each iteration the pivot is compared with all the other elements. The running time is O(n²).

each input permutation is equally likely
The array arrives in random order: every input ordering has the same probability.

Randomized Quicksort
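
A minimal sketch of randomized quicksort (names illustrative): it is the deterministic algorithm with the pivot drawn uniformly at random, which is what the expected-time analysis below assumes.

    #include <cstdlib>
    #include <utility>
    #include <vector>

    // Sorts a[lo..hi]; the pivot is chosen uniformly at random, so the
    // O(n lg n) expected bound holds for every input order.
    void rquicksort(std::vector<int>& a, int lo, int hi) {
        if (lo >= hi) return;
        std::swap(a[lo + rand() % (hi - lo + 1)], a[hi]);  // random pivot to the end
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; ++j)                      // partition around pivot
            if (a[j] < pivot) std::swap(a[i++], a[j]);
        std::swap(a[i], a[hi]);                            // pivot to final position
        rquicksort(a, lo, i - 1);
        rquicksort(a, i + 1, hi);
    }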

Expected running time analysis

Notation

For a fixed input array A[0..n-1]:

  • Let C be the random variable counting the number of comparisons made by randomized quicksort.
  • Let ai denote the i-th smallest element of A[0..n-1]; note that ai is not necessarily the element at position i.

When is Ci,j 1?

P(Ci,j = 1) = 2/(j-i+1)
There are j - i + 1 elements in ai…aj, and Ci,j = 1 exactly when ai or aj is chosen as the first pivot among them. Each element has the same probability of becoming that pivot, therefore P(Ci,j = 1) = 2/(j-i+1).

Using linearity of expectation to compute E[C]


Randomized Selection

Problem definition

Deterministic QuickSelect

Solution Analysis

Randomized QuickSelect

Solution Analysis

Assumption:
For a fixed input array A of size n and a fixed rank k:
Let C be the random variable counting the number of comparisons.
Let ai be the i-th smallest value in A.
C = ∑(1<=i<j<=n) Ci,j, where
Ci,j = 1 if ai and aj are compared.
E[C] = E[∑(1<=i<j<=n) Ci,j] = ∑E[Ci,j] = ∑P(Ci,j = 1)

When is Ci,j = 1?

i is always less than j

  1. k <= i < j
    Let ap be the next pivot:
    • p < k or p > j: ai, aj, and ak all fall into the same partition, and the question recurs there.
    • k <= p < i: ai and aj are discarded and never compared, so Ci,j = 0.
    • i < p < j: ai and aj fall into different partitions and are never compared, so Ci,j = 0.
    • p = i or p = j: the pivot is compared with every other element, so Ci,j = 1.

Since each element in the set {ak, ak+1, …, aj} has an equal chance of being selected as the pivot,
P(Ci,j = 1) = 2 / (j-k+1)

summing over all k <=i < j <= n

  2. i < j <= k
    Let ap be the next pivot:
    • p < i or p > k: ai, aj, and ak all fall into the same partition.
    • j < p <= k: ai and aj are discarded, so Ci,j = 0.
    • i < p < j: ai and aj are never compared, so Ci,j = 0.
    • p = i or p = j: Ci,j = 1.

Since each element in the set {ai, ai+1, …, ak} has an equal chance of being selected as the pivot,
P(Ci,j = 1) = 2 / (k-i+1)

summing over all 1 <=i < j <= k

  3. i < k < j
    Let ap be the next pivot:
    • p < i or p > j: ai, aj, and ak all fall into the same partition.
    • i < p < j: ai and aj are never compared, so Ci,j = 0.
    • p = i or p = j: Ci,j = 1.

Since each element in the set {ai, ai+1, …, aj} has an equal chance of being selected as the pivot,
P(Ci,j = 1) = 2 / (j-i+1)

summing over all 1 <= i < k < j <= n


Q&A:

  1. Why does the range of s run from max{1, k-d+1} to min{k-1, n-1-d}?
    s starts at max{1, k-d+1}: the minimum distance between k and j is 1 and the distance between i and j is d, so the smallest possible i (the start s) is k-(d-1) = k-d+1; at the same time the start must be positive, so s starts at max{1, k-d+1}.
    s ends at min{k-1, n-1-d}: the maximum j is n-1 and the distance between i and j is d, so the maximum start is n-1-d; at the same time i must be smaller than k, so s ends at min{k-1, n-1-d}.

  2. What is the purpose of the split in the second row?
    It divides the sum over d into two parts according to whether d is greater than k.
    In the first part, d is smaller than or equal to k, so k-d+1 is positive, and i (the start) must be smaller than k; therefore the range is from k-d+1 to k-1.
    In the second part, d is larger than k, so k-d+1 would be non-positive and we choose 1 as the start; for the same reason the maximum start is k-1, so the range is from 1 to k-1.

  3. Why can we replace k with d directly from row 3 to row 4?
    In the second part d is bigger than k, so replacing k with d can only enlarge the expression; the fourth row therefore upper-bounds the third.

  4. Why ∑(2(d-1)/(d+1)) + ∑(2(d-1)/(d+1)) < 2(k-2) + 2(n-k)?
    Each term satisfies 2(d-1)/(d+1) < 2, and the two sums run over k-2 and n-k values of d respectively, giving the bound.

E[C] < θ(n)

adding all situations

Exact Formula Of E[C]

Primality Testing

Problem Definition

Upper Bound


Randomized Solution: Fermat test

Theorem

Theorem Proof

Q&A

  1. Why {1a, 2a, 3a, …, (p-1)a} = {1, 2, 3, …, p-1} (mod p)?
    For any prime p and any a smaller than p, the left side taken mod p is the same set as the right side.
    For example, p = 5, a = 3:
    left = {3, 6, 9, 12} ≡ {3, 1, 4, 2} (mod 5), which as a set equals the right side {1, 2, 3, 4}.
  2. What does it mean if ab = 0 (mod p)?
    Since p is prime, it means p divides a or p divides b (i.e. a % p = 0 or b % p = 0).
  3. Why (1a)(2a)…((p-1)a) = (1)(2)…(p-1) (mod p)?
    Since the two sides are the same set mod p, the products of all their elements are equal; that is, the product of {1, 2, 3, 4} equals the product of {3, 1, 4, 2}.

New Idea

Implementation
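
The slide's implementation is not reproduced here; the following is a minimal sketch, assuming a repeated-squaring mod_pow() helper and N small enough that the 64-bit products cannot overflow.

    #include <cstdint>
    #include <cstdlib>

    // Repeated-squaring modular exponentiation: a^e mod n.
    // Assumes n < 2^32 so that the 64-bit products cannot overflow.
    uint64_t mod_pow(uint64_t a, uint64_t e, uint64_t n) {
        uint64_t r = 1;
        a %= n;
        while (e > 0) {
            if (e & 1) r = r * a % n;
            a = a * a % n;
            e >>= 1;
        }
        return r;
    }

    // One Fermat test: pick a random a in [2, N-2], check a^(N-1) = 1 (mod N).
    // Returns false only when N is certainly composite (a is a Fermat witness).
    bool fermat(uint64_t N) {
        if (N < 4) return N == 2 || N == 3;
        uint64_t a = 2 + rand() % (N - 3);
        return mod_pow(a, N - 1, N) == 1;
    }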


Error Situation

When N = 3·5 = 15 and a = 4, the answer is incorrect: 4^14 ≡ (4²)^7 ≡ 1 (mod 15) even though 15 is composite.

Error Probability

fermat(N) errors only if N is composite.
ϕ(N) = |{a : 0 < a < N, gcd(a, N) = 1}|

  • If a composite N has one Fermat witness a with gcd(a, N) = 1, then there must be many witnesses, and the error probability is at most 1/2.
    That is because if a is a witness and {b1, b2, b3, …} is the set of non-witnesses, then {ab1, ab2, ab3, …} is a set of distinct witnesses; so at least half of the candidates are witnesses, and the error probability is at most 1/2.
  • If N has no Fermat witness among the values coprime with it, then the error rate is at least ϕ(N)/N, which tends towards 1 for large such N, where ϕ(N) is the number of positive integers smaller than and coprime with N.

Reduce Error Probability from 1/2 to insignificant

Carmichael numbers

Randomized Solution: Miller-Rabin

Theorem

Modified Theorem

Example

Implementation Idea



Q&A

  1. Why N-1 == -1?

-1 ≡ N-1 (mod N); for example, 64 ≡ -1 ≡ 12 (mod 13).
Since % returns only non-negative results here in C++, we compare against N-1.
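
A hedged sketch of one Miller-Rabin round, reusing the mod_pow() helper from the Fermat sketch (same overflow assumption): write N-1 = 2^s·d with d odd and accept if a^d ≡ 1 or some squaring a^(2^i·d) hits N-1, i.e. -1 mod N as discussed above.

    // One Miller-Rabin round for odd N > 3; returning false means N is
    // certainly composite (a is a witness).
    bool miller_rabin(uint64_t N) {
        if (N < 4) return N == 2 || N == 3;
        if (N % 2 == 0) return false;
        uint64_t d = N - 1;
        int s = 0;
        while (d % 2 == 0) { d /= 2; ++s; }           // N-1 = 2^s * d, d odd
        uint64_t a = 2 + rand() % (N - 3);            // random base
        uint64_t x = mod_pow(a, d, N);
        if (x == 1 || x == N - 1) return true;        // consistent with N prime
        for (int i = 1; i < s; ++i) {
            x = x * x % N;
            if (x == N - 1) return true;              // hit -1 mod N while squaring
        }
        return false;                                 // a is a witness: composite
    }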

Solution Analysis

Error Situation


Example

Error Probability

Example

Reduce Error Probability from 1/4 to insignificant

Amortized Analysis

What Is Amortized Analysis

A Toy Example: The K-bit Binary Counter

Counter Data Structure

Traditional Analysis

The worst-case running time of one increment is k bit flips.
So the worst-case running time of m increments is bounded by km bit flips.
However, in actual runs, the total cost of m increments is only about 2m bit flips.
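
A minimal k-bit counter sketch matching this analysis: one increment() can flip up to k bits, yet m increments flip fewer than 2m bits in total.

    #include <vector>

    struct Counter {
        std::vector<int> bit;                      // bit[i] is the i-th binary digit
        explicit Counter(int k) : bit(k, 0) {}
        void increment() {
            int i = 0;
            while (i < (int)bit.size() && bit[i] == 1)
                bit[i++] = 0;                      // resets: paid for by earlier sets
            if (i < (int)bit.size()) bit[i] = 1;   // exactly one set per increment
        }
    };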

How to get the correct result?

Amortized Analysis

Aggregate counting


Generalizing: bit i flips once every 2^i increments

Funny accounting



Why is amortized cost ≥ actual cost?
Because every reset operation occurs after a matching set operation, so the credit deposited by the sets always covers the resets.

Potential function

How to pick φ?


Note
Ci = Cset + Creset = 2 + 0 = 2.

Dynamic Stacks

Data Structure

Implementation

Definition

Constructor

Push

Pop

Traditional Analysis


The worst-case running time of one operation is linear in the array size.
So the worst-case running time of m operations is bounded by ∑(i=1 to m) i = Θ(m²).
However, in actual runs, the total running time of m operations is linear in m.
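
A doubling-array push() sketch for this analysis (contraction omitted): a single push can cost time linear in the current size when it triggers an expansion, but m pushes cost O(m) overall.

    #include <cstring>

    struct DynStack {
        int* data = new int[1];
        int size = 0, cap = 1;
        void push(int x) {
            if (size == cap) {                    // expansion: the expensive case
                int* bigger = new int[2 * cap];
                std::memcpy(bigger, data, size * sizeof(int));
                delete[] data;
                data = bigger;
                cap *= 2;
            }
            data[size++] = x;                     // the cheap O(1) case
        }
    };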

Amortized Analysis

Example: m pushes
Aggregate counting


Q&A

  1. Why do we use lg(m) rather than lg(m-1) here? A push at an exact power of two doesn't need a new expansion.
    Because we are computing an upper bound here; using lg(m) is simply more convenient.

  2. Why ∑(k=1 to m) 1 + ∑(k=0 to ⌊lg(m)⌋) 2^k = m + 2^(1+⌊lg(m)⌋) - 2^0 < 3m?

    ∑(k=0 to ⌊lg(m)⌋) 2^k = (2^(⌊lg(m)⌋+1) - 1) / (2 - 1) = 2^(⌊lg(m)⌋+1) - 1

    2^(⌊lg(m)⌋+1) - 1 = 2·2^⌊lg(m)⌋ - 1 ≤ 2m - 1, so the total is at most 3m - 1 < 3m

Funny accounting

Potential function

φ here represents the credits we hold; the amortized cost of the i-th push is its actual cost plus the change in φ

Example: general case of m push/pop operations
Potential function

φ here represents the credits we hold; the amortized cost of the i-th push/pop is its actual cost plus the change in φ

  1. push without expansion
  2. push with expansion
  3. pop without contraction
  4. pop with contraction

UNIVERSAL HASHING

Dictionary Problem

Problem Definition

definition

Royal Solution: Use An Array Of Size NKEYS

Implementation

royal solution

Analysis

analysis
11.1-4
We wish to implement a dictionary by using direct addressing on a huge array. At the start, the array entries may contain garbage, and initializing the entire array is impractical because of its size. Describe a scheme for implementing a direct- address dictionary on a huge array. Each stored object should use O(1) space; the operations SEARCH, INSERT, and DELETE should take O(1) time each; and initializing the data structure should take O(1) time. (Hint: Use an additional array, treated somewhat like a stack whose size is the number of keys actually stored in the dictionary, to help determine whether a given entry in the huge array is valid or not.)

Compromise Solution: Hash Table

Implementation

hash table
Quadratic_probing
Double Hashing
Double hashing is a collision-resolution technique for open-addressed hash tables: when a collision occurs, a second hash function is applied to the key to determine the probe step.

Worst-case Analysis

worst case

Average-Case Analysis: chain hashing

average case

2-Universal Property

2-Universal Property

Expected Running Time

Expected Running Time

Expected Running Time Of A Sequence Of Hash Operations

Expected Running Time

A Universal Hash Function Family

Universal Property

Big Family

A Much Smaller Family With Universal Hashing Property

Small Family

Example

Small Family

Proof Of Universal Hashing Property

Small Family

  1. ak1+b ≠ ak2+b
    With the same a and b but different keys k1 ≠ k2, the two values can never be equal.
  2. ak1+b ≠ ak2+b mod P
    If ak1+b = ak2+b (mod P), then a(k1-k2) = 0 (mod P). Since P is prime, this forces a = 0 (mod P) or k1 = k2; but 0 < a < P and k1 ≠ k2, so ak1+b = ak2+b (mod P) is impossible.
  3. When can (ak1+b mod P) mod m = (ak2+b mod P) mod m?
    Let r = (ak1+b) mod P and s = (ak2+b) mod P. The pair collides after mod m exactly when r - s is a nonzero integer multiple of m. For each of the P possible values of r there are at most ⌈P/m⌉ - 1 ≤ (P-1)/m such values of s, so there are at most P(P-1)/m colliding (r, s) pairs.
  4. The collision probability is 1/m.
    k1 and k2 induce P(P-1) pairs (r, s) in total, one per choice of (a, b), so the collision probability of each pair is at most (P(P-1)/m) / (P(P-1)) = 1/m.
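
A minimal sketch of this family, h(k) = ((ak + b) mod p) mod m; the prime p = 2^31 - 1 and the rand()-based draw of (a, b) are illustrative assumptions, and keys are assumed smaller than p.

    #include <cstdint>
    #include <cstdlib>

    struct UniversalHash {
        static constexpr uint64_t p = 2147483647ULL;  // prime 2^31 - 1; keys < p
        uint64_t a, b, m;
        explicit UniversalHash(uint64_t buckets)
            : a(1 + rand() % (p - 1)),                // a drawn from [1, p-1]
              b(rand() % p),                          // b drawn from [0, p-1]
              m(buckets) {}
        uint64_t operator()(uint64_t k) const { return (a * k + b) % p % m; }
    };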

Perfect Hashing

Perfect Definition

Given n keys, a hash function is perfect if it maps each of the keys to a unique position.

Markov’s inequality

Markov's Inequality

Perfect Hashing using quadratic space

quadratic
Why ∑ 1/m ≤ n²/m?
Looping over all pairs (i, j) gives n + (n-1) + (n-2) + … + 1 = n(n+1)/2 ≤ n² terms, each contributing 1/m.

Perfect Hashing using linear space

linear

First Step

first step

Space Analysis

space

Binary Search Tree

Extended Dictionary Problem

extended dictionary problem

Classical Solution: Binary Search Tree

Implementation

Class Definition

class

min & max function

min&max

find function

find

insert function

insert

delete_min function

delete_min

delete function

delete

predecessor function

predecessor

Example

Insert

insert example

delete

delete example

Analysis

Good average-case behavior: O(lgn) per operation.
Worst-case behavior: Ω(h), where h is the height, which could be n.

Balanced BST

  1. AVL trees
  2. 2-3 trees(aka red-black trees)
    The worst-case running time of each operation is O(lgn)

Basic Operation: rotation

rotation

2-3 Trees

Nodes

nodes
nodes

Insert

Always Insert At A Leaf
  1. use the binary-search property to find the appropriate leaf.
  2. if the leaf is a 2-node, make it a 3-node with the new value.
  3. if the leaf is a 3-node, make it a 4-node, then recursively split and insert the middle value into the parent node.
Example: inserting 2-leaf

insert

Example: inserting 3-leaf

insert

Example: inserting 3-leaf recursively

insert

Example inserting

insert

Delete

Always Delete At A Leaf

delete

Example: delete 3-leaf

delete

Example: 2-leaf, 3-sibling

delete

Example: 2-leaf, 2-sibling, 3-parent

delete

Example: 2-leaf, 2-sibling, 2-parent

delete

Example deleting

delete
delete
delete
delete

Other Operation

find(), min(), max(), pred() and succ() are implemented as for regular BST.

Red-Black Trees

Red-Black: Efficient Implementation Of 2-3 Trees

A red node is just a part of its black parent node: together they represent one 3-node
nodes

Simpler Solution And Faster Balanced BST

  1. treap(tree + heap): a randomized BST.
  2. splay tree: an amortized BST.
  3. skip list: a randomized linked list.
    details

Splay Trees

Overview

  1. A variant of BST that has worst-case running time of O(mlgn) for any sequence of m BST operations, n of which are insertions.
  2. Each regular BST operation is followed by a splay operation, which moves the most recently accessed element to the root.

Splay Operation

splay operation

Basic Step

Zig

Zig

Zag

Zag

ZigZig

ZigZig

ZagZag

ZagZag

ZigZag

ZigZag

ZagZig

ZagZig

Example: Insertion

insert
insert
insert
insert

Example: Deletion

delete
delete
delete
delete

Normal Analysis

analysis

Amortized Analysis

analysis

Amortized Cost Of A Splay Operation Is At Most 1 + 3lg(n)

Concave Functions

concave functions

lg(x) + lg(y) ≤ 2lg((x+y)/2)

(lg(x)+lg(y))/2 ≤ lg((x+y)/2)
lg(n)

Potential Function

potential

Example

example

Plan

plan

Notation

notation

Zag and Zig

C’zig(x) = C’zag(x) ≤ 1 + r’(x) - r(x)
zag&zig
actual cost = number of rotations

Zigzag and Zagzig

C’zigzag(x) = C’zagzig(x) ≤ 2(r’(x) - r(x))
zigzag

Zigzig and Zagzag

C’zagzag(x) = C’zigzig(x) ≤ 3(r’(x) - r(x))
zigzig

Total Amortized Cost

amortized cost

Treap

Overview

overview

Treap Representations

Normal BST

BST

Treap = BST + max-heap

Treap

Unique Representation

Unique

Treap Operation

  1. max(), min(), pred(), succ(), find(): same as for a BST;
  2. insert(): insert at a leaf as for a BST, followed by a sequence of rotations up to restore heap order (see the sketch after this list);
  3. erase(): the reverse of insert(), i.e., a sequence of rotations down to a leaf while maintaining heap order (rotate the child with larger priority up), followed by removal.
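
A rotation-based insert() sketch for operation 2 (names are illustrative): after the BST leaf insert, rotations bubble the new node up until the max-heap order on priorities is restored.

    #include <cstdlib>

    struct Node {
        int key, prio;
        Node *left = nullptr, *right = nullptr;
        explicit Node(int k) : key(k), prio(rand()) {}  // random heap priority
    };

    Node* rotate_right(Node* y) { Node* x = y->left;  y->left = x->right;  x->right = y; return x; }
    Node* rotate_left (Node* x) { Node* y = x->right; x->right = y->left;  y->left  = x; return y; }

    Node* insert(Node* root, int key) {
        if (!root) return new Node(key);                // BST leaf insert
        if (key < root->key) {
            root->left = insert(root->left, key);
            if (root->left->prio > root->prio)          // heap order violated:
                root = rotate_right(root);              // rotate the child up
        } else {
            root->right = insert(root->right, key);
            if (root->right->prio > root->prio)
                root = rotate_left(root);
        }
        return root;
    }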

Example: Insertion

insertion
insertion
insertion

Example: Deletion

deletion
deletion
deletion
deletion

Analysis

analysis

Notation

notation

Key lemma

lemma

Expect Value

E[Ai] = Θ(lgn), 1 ≤ i ≤ n
E[A]

Expect depth and size

E[D]&E[S]

Left And Right Spine

SL&SR

Left Rotation Shortens Right Spine

LR short RS

Expect Length Of Left Spine Of Xi

E[SLi]

P(Ci,j,k = 1)

P(Ci,j,k = 1)

E[SLi] = 1 - 1/i

E[SLi]

By the same argument as for E[SLi], E[SRi] = 1 - 1/(n+1-i)

Expected Running Time Of Treap Operations

expected running time
The expected number of rotations in insert() and erase() is at most 2.

Skip List

Overview

overview

Motivation

motivation

Worst-Case Analysis

analysis

Insertion&Deletion

insert&delete

Skip List Operations

operations

Example: Insertion

insertion
insertion
insertion
insertion

Example: Deletion

deletion

Analysis

Notation

  • S: size of skip lists for a given set of n keys
  • Si: size of level-i list(bottom list is at level 0)
  • H: number of levels
  • Top: number of comparisons by operation op()

Expectation Of Si

E[Si]

Expectation Of S

E[S]

Expectation Of H

E[H]
∑(i≥1) n/2^(lgn+i) = ∑(i≥1) n · (1/2^lgn) · (1/2^i) = ∑(i≥1) 1/2^i = 1

Expectation of Top

E[U]
E[L]
A find needs a horizontal move only when the preceding nodes tossed heads; heads means they build a tower.
E[Tfind]

Minimum-Weight Spanning Trees

Definition

definition

Example

example
example

Facts

  • G has a spanning tree iff it is connected
  • Let T be a graph with n vertices and m edges. The following are equivalent
    • T is a tree
    • T is connected and n = m + 1
    • T is acyclic and n = m + 1
    • there is a unique path between any two vertices in T.
  • adding an edge between two existing vertices of a tree creates a cycle; removing an edge from this cycle yields back a tree.

Theorem: Unique weights means unique MST

proof

Converse: Unique MST doesn't mean unique weights

For example, if n = m + 1 the graph is itself a tree, so it has exactly one spanning subgraph regardless of the weights.

Key assumption

assumption

MST Algorithms: Kruskal, Prim, Boruvka

Kruskal’s algorithm

Code

code

Example

example
example

Analysis
  • Naive implementation using BFS/DFS: O(m(n+m))
  • Using disjoint sets: O(mlgn)

Edge Contraction

edge

Prim’s algorithm

Code

code

Example

example

Analysis
  • Naive implementation using edge contraction is expensive
  • Implementation using a min-heap takes O(mlgn) time
  • Implementation using a Fibonacci heap takes O(m + nlgn) time

Note: why is O(m + nlgn) better than O(mlgn)?
Here n is the number of vertices and m is the number of edges.
Since n-1 ≤ m ≤ n², in the worst case every pair of nodes is connected, so m = n² in that case, and O(n² + nlgn) < O(n²lgn).

Boruvka’s algorithm

Code

code

Example

example
example

Analysis
  • takes O(lgn) rounds, since the number of components is at least halved each round.

Disjoint Sets

Overview

A data structure to maintain a collection of sets of objects. Supported operations are:

  • make_set(x): create a new set with one element x;
  • find_set(px): returns the set containing x, given its pointer;
  • join_sets(px, py): combine the sets containing x and y, given their pointers.
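
A minimal sketch of these three operations with union by rank (the element payload is omitted; path compression is added later in these notes).

    #include <utility>

    struct Node {
        Node* parent;
        int rank;                            // upper bound on subtree height
    };

    Node* make_set() {                       // O(1): new singleton tree
        Node* x = new Node;
        x->parent = x;
        x->rank = 0;
        return x;
    }

    Node* find_set(Node* x) {                // O(h): follow parents to the root
        while (x->parent != x) x = x->parent;
        return x;
    }

    void join_sets(Node* px, Node* py) {     // union by rank: hang the shorter
        Node* rx = find_set(px);             // tree under the taller one
        Node* ry = find_set(py);
        if (rx == ry) return;
        if (rx->rank < ry->rank) std::swap(rx, ry);
        ry->parent = rx;
        if (rx->rank == ry->rank) ++rx->rank;
    }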

Representation

representation

Implementation

Node

node

DS

ds

make_set()

make_set

find_set()

find_set

join_set()

join_set

Application

Kruskal’s algorithm

Kruskal

Example

example

Analysis

  • make_set() and join_sets() runs in time O(1)
  • find_set() runs in time O(h), where h is the maximum tree height in the collection
  • We will show that h = O(lgn), where n is the total number of nodes

rank(x) = height(x)

2 ^ rank(x) ≤ size(x)

At Most n/2r Nodes Of Rank r

Path Compression Version

Overview

overview

Find_set()

find_set

Analysis of path compression

analysis

lg*(n) function

lg-star-n

Proof

proof

Edge Classification

edge calssification

Number of Type1 and Type2

type1&2

Prepaying for Type3

type3

Total Prepayment

prepayment

Total Running time

total

A Tighter Bound Is Possible

Using a more sophisticated prepayment method, it is possible to show that the worst-case running time of m disjoint-sets operations is O(m𝓪(m,n)), where 𝓪(·,·) is the inverse Ackermann function, an even slower-growing function.

Heap

Prim’s Algorithm Based On Heap

implementation

Analysis

analysis

Binomial Heaps

Binomial Trees

definition

Property

property

Merge

merge

Definition

definition
example

Properties

Let H be a binomial heap with n keys

  • H has height at most lgn, since a binomial tree of height k has 2^k nodes;
  • H has at most 1 + lgn trees, since the trees have distinct ranks at most lgn;
  • the rank of each tree in H is at most lgn

Operations

  1. make_heap(x): create a heap with a single key;
  2. min(): return the minimum key in heap;
  3. meld(h): merge heap h with current heap(h is destroyed);
  4. insert(x): add key x to current heap;
  5. delete_min(): remove the minimum key in heap;
  6. decrease_key(x, x’): reduces key x in current heap to x’.

decrease_key(x, x’)

decrease_key

meld(h)

Lazy Version

lazy

Eager Version

eager

insert(x)

insert

delete_min()

delete_min
example

Summary

Operation Worst Running Time
make_heap O(1)
min O(1)
meld O(lgn) or O(1)
insert O(lgn) or O(1)
delete_min O(lgn)
decrease_key O(lgn)

Fibonacci Heaps

Fibonacci Number

F0 = 0
F1 = 1
Fn = Fn-1 + Fn-2, n ≥ 2.
F = {0, 1, 1, 2, 3, 5, 8, 13, 21, …}

Facts

  1. fact1
  2. fact2

Fibonacci Heaps

fHeaps
if its parent is a root, do not mark the parent.
example

Amortized Analysis

potential

Make_Heap

make_heap

Min

min

Meld

meld

Insert

insert

Delete_min

insert

Decrease_key

derease_key

Summary

Operation Worst Running Time
make_heap O(1)
min O(1)
meld O(1)
insert O(1)
delete_min O(lgn)
decrease_key O(1)

D(n) ≤ O(lgn)

lemma

Sk: minimum tree size rooted at a node of rank k

Let size(x) be the number of nodes in the subtree rooted at x, including itself, and let s(k) be the minimum value of size(x) for node x of rank k in any Fibonacci heap.

Sk ≥ Fk+2 ≥ Φ^k

proof
proof

Corollary

corollary

String Matching Problem

Definition

definition

Exhaustive Search

brute_force

Smart Solutions

solutions

Karp-Rabin

Overview

  • consists of Compare-Shift phases as brute-force method;
  • always shifts by one;
  • speeds up compare by treating strings as integers in base d, where d is the alphabet size;

    String as Integer

    transfer
    transfer

Code

code

Analysis

θ(m) + θ(n) + θ(n) = θ(n+m) arithmetic operations
Each arithmetic operation costs Ω(m) (the integers have m digits)
So the total time is Ω(nm), no better than brute force!

New Ideas

Idea #1

idea1
Modification
Accept false positives but keep the error probability negligible using randomization, just like miller-rabin().

Code

code
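
The code above is from the slides; here is a hedged sketch of the randomized version, rolling the window hash in base d modulo a prime q. Fixing q = 10^9 + 7 is an illustrative shortcut: the lecture picks the prime at random to bound the false-positive probability.

    #include <string>
    #include <vector>

    // Reports every shift s where hash(t[s..s+m)) == hash(p); false positives
    // are possible but rare, in the Monte Carlo style discussed above.
    std::vector<int> karp_rabin(const std::string& t, const std::string& p) {
        const long long d = 256, q = 1000000007LL;  // base and modulus
        int n = (int)t.size(), m = (int)p.size();
        std::vector<int> hits;
        if (m > n || m == 0) return hits;
        long long hp = 0, ht = 0, dm = 1;           // dm = d^(m-1) mod q
        for (int i = 0; i < m - 1; ++i) dm = dm * d % q;
        for (int i = 0; i < m; ++i) {
            hp = (hp * d + p[i]) % q;
            ht = (ht * d + t[i]) % q;
        }
        for (int s = 0; s + m <= n; ++s) {
            if (ht == hp) hits.push_back(s);        // probable match
            if (s + m < n)                          // roll: drop t[s], add t[s+m]
                ht = ((ht - t[s] * dm % q + q) * d + t[s + m]) % q;
        }
        return hits;
    }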

Analysis

analysis

Prime Number Theorem

prime number

Proof of error bound

error bound

Amplification

amplification

Fundamental Pre-processing: Z-Values

Overview

overview

Z-values

z-values

In Details

details

Computing z-values In Linear Time

case 1: Brute Force

bf

case 2: deducing Zk from a previous z-value

deduing

case 3: deducing Zk from a previous z-value

deduing

Code

code
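
The slide's code is not reproduced here; this is the standard linear-time computation consistent with the three cases above, maintaining the rightmost match window [l, r) and reusing z[k-l] when possible.

    #include <algorithm>
    #include <string>
    #include <vector>

    // z[k] = length of the longest substring starting at k that matches a
    // prefix of s; z[0] is left as 0 by convention.
    std::vector<int> z_values(const std::string& s) {
        int n = (int)s.size();
        std::vector<int> z(n, 0);
        int l = 0, r = 0;                                  // current window [l, r)
        for (int k = 1; k < n; ++k) {
            if (k < r) z[k] = std::min(r - k, z[k - l]);   // case 2: copy z[k-l]
            while (k + z[k] < n && s[z[k]] == s[k + z[k]])
                ++z[k];                                    // cases 1, 3: extend
            if (k + z[k] > r) { l = k; r = k + z[k]; }     // new rightmost window
        }
        return z;
    }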

Analysis

analysis

Example

example

A linear-time string matching algorithm

algorithm

Why space is O(m)

the algorithm only ever refers back to z[k-l], which indexes into the pattern part, so only O(m) z-values need to be kept
reason

N-value

n-value

KMP

Overview

overview

k-values

k-values
A proper suffix of a string is not equal to the string itself.
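
A standard linear-time sketch for computing the k-values directly (the lecture may derive them from z-values instead): k[i] is the length of the longest proper suffix of p[0..i] that is also a prefix of p.

    #include <string>
    #include <vector>

    std::vector<int> k_values(const std::string& p) {
        int m = (int)p.size();
        std::vector<int> k(m, 0);
        for (int i = 1; i < m; ++i) {
            int j = k[i - 1];                            // longest border so far
            while (j > 0 && p[i] != p[j]) j = k[j - 1];  // fall back on mismatch
            if (p[i] == p[j]) ++j;                       // extend the border
            k[i] = j;
        }
        return k;
    }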

In Details

details

Smarter Shifting

case

k’-values

k'-values

In Details

details

Property of K/K’-Values

property

Computing K/K’-values In Linear Time

K’-Values

computing

Example

example

K-Values

computing

A linear-time string matching algorithm-KMP

algorithm

Example

example

Analysis

analysis

Real-time Version of KMP

problem

K’i,x

K'i,x

BOYER-MOORE(BM) Algorithm

Overview

overview

Notation

notation

Bad Character Rule

bad character rule

Good Suffix Rule

Rule 1

rule1

Rule 2

rule2

Rule 3

rule3

BOYER-MOORE(BM): Complete Shifting Rule

shifting rule

Implementation Of Bad Character Rule

R[c]: rightmost position of c

R[c]

Code Implementation

code

Sb: Shift amount of the bad character rule

cases

Case 1

case1

Case 2

case2

Case 3 - Back To Brute Force

case3

General Formula

Sb = max(m-1-l-R[t[s-l]], 1)

BOYER-MOORE(BM) Implementation: bad character rule only

Implementation

Example

example

Horspool algorithm

Horspool

Implementation Of Good Suffix Rule

Case 1

case1

N Value

N Value

L’ Value

L' Value
L' Value

Case 2

case2

I Value

I Value

Computing

computing

Code

code

Case 3

Neither case 1 nor case 2 applies: shift by m

Implementation Of BOYER-MOORE(BM) Algorithm

code

Example

L&I Value
BM

Suffix Trees

Definition

definition

Example: banana

example

Example: abaab

example

Why $?

reason

Properties

properties
proof
m = i + n - 1, since every node except the root has exactly one incoming edge
m ≥ 2i, since every internal node of a suffix tree must have at least 2 outgoing edges

String Matching With Suffix Tree

theorem

Suffix Arrays

Definition

definition

Longest Common Prefix Array

LCP

String Matching With Suffix Array

theorem

String Matching With Suffix Array And LCP Arrays

theorem

Linear-time conversions between suffix trees and suffix arrays

Suffix Tree to Suffix Arrays

transfer
example

Suffix Arrays to Suffix Tree

transfer

Linear-time get Suffix Arrays

Inverse Suffix Array AS

AS
code

SA to LCP

code

Example: banana

example

points

  1. The index into AS is the start index of a suffix of S. The suffix starting at i is the suffix starting at i-1 with its first character removed.
  2. If LCP[i-1] = l, i.e. the suffix starting at i-1 shares a prefix of length l with its rank predecessor, then when computing LCP[i] there must be a suffix sharing a common prefix of length at least l-1 with the suffix starting at i.
    For example, take banana:
    when i = 1, LCP = 3, since the common prefix of “anana” and “ana” has length 3.
    when i = 2, the suffix is “nana”; there must be a suffix with a common prefix of length 2 with “nana”, namely “na”. The reason is that the LCP of “anana” and “ana” is 3; dropping their first characters gives “nana” and “na”, which must share a common prefix of length at least 3 - 1. That is why the comparison for position i can start from offset LCP[i-1] - 1.
  3. Because i runs from 0 to n-1, the suffixes are processed from longest to shortest. Under the suffix-array ranking, if one string is a prefix of another, the longer one ranks later. So when computing the LCP in text order, the longer suffix is computed first and its LCP is guaranteed by the current value of l; as l decreases, the LCP of a shorter suffix (which may rank earlier) can be smaller or even 0, which is consistent with the suffix-array rules.

Analysis

compute_LCP() makes at most 2n comparisons

  • Each comparison ends in a match or a mismatch.
  • Each mismatch is followed by decrementing l by 1; there are at most n-1 mismatches, because rank 0 needs no matching.
  • Each match increments l by 1. Since l is bounded above by n, at most 2n matches occur:
    matches - mismatches ≤ n ⇒ matches ≤ mismatches + n ≤ 2n - 1
  • Since l is bounded above by n and is decreased at most n-1 times, the total running time is at most 2n:
    l rising from 0 to n and falling back from n to 0 takes 2n steps.
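
A sketch of compute_LCP() (this is Kasai's algorithm), assuming sa is the suffix array of s; the inverse array AS lets l drop by at most one between consecutive text positions, giving the 2n bound above.

    #include <string>
    #include <vector>

    // lcp[r] = length of the common prefix of the rank-r and rank-(r-1)
    // suffixes; lcp[0] = 0.
    std::vector<int> compute_lcp(const std::string& s, const std::vector<int>& sa) {
        int n = (int)s.size();
        std::vector<int> as(n), lcp(n, 0);
        for (int r = 0; r < n; ++r) as[sa[r]] = r;  // as = inverse of sa
        int l = 0;
        for (int i = 0; i < n; ++i) {               // suffixes in text order
            if (as[i] == 0) { l = 0; continue; }    // rank 0 has no predecessor
            int j = sa[as[i] - 1];                  // previous suffix in rank order
            while (i + l < n && j + l < n && s[i + l] == s[j + l])
                ++l;                                // extend the common prefix
            lcp[as[i]] = l;
            if (l > 0) --l;                         // drop one char for next i
        }
        return lcp;
    }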

Correctness

correctness

Linear-time Construction Of Suffix Arrays

Skew Algorithm

algorithm

Example: mississippi

example

SA12 to SA0

example
m+INV[1] = “m3”, p+INV[10] = “p0”, s+INV[7] = “s1”, s+INV[4] = “s2”; we then sort these four strings to get the ranks of SA0, which takes linear time because each string has just two digits.

Construct SA12 recursively

recursively

Change back

back

Construct SA0

bucketSort

Merging SA0 and SA12 In Linear Time

bucketSort

Example

example
