Week 6 - Hash Tables #
Hashing #
Save items in a key-indexed table (index is a function of the key).
Hash function. Method for computing array index from key.
Issues.
- Computing the hash function.
- Equality test: Method for checking whether two keys are equal.
- Collision resolution: Algorithm and data structure to handle two keys that hash to the same array index.
Implementing hash code: strings
public final class String { private int hash = 0; private final char[] s; ... public int hashCode() { int h = hash; if(h != 0) return h; int hash = 0; for (int i = 0; i < length(); i++) hash = s[i] + (31 * hash); hash = h; return h; } }
Modular hashing
An int between -2^31 and 2^31 - 1.
Hash function. An int between 0 and M - 1 (for use as array index).
```java private int hash(Key key) { return (key.hashCode() & 0x7fffffff) % M;} ```
Collisions #
- Two distinct keys hashing to same index.
- Solutions: separate chaining and linear probing.
Collision resolution: Separate chaining symbol table #
- Use an array of M < N linked lists.
- Hash: map key to integer i between 0 and M - 1.
- Insert: put at front of i^th chain (if not already there).
- Search: need to search only i^th chain.
- Typical choice: M ~ N / 5 ⇒ constant-time ops.
Collision resolution: Open addressing #
- When a new key collides, find next empty slot, and put it there.
- Hash. Map key to integer i between 0 and M-1.
- Insert. Put at table index i if free; if not try i+1, i+2, etc.
- Search. Search table index i; if occupied but no match, try i+1, i+2, etc.
- Note. Array size M must be greater than number of key-value pairs N.
- Typical choice: α = N / M ~ ½.