Hashing & Data Structures
Master hashing algorithms, collision resolution, and efficient data storage techniques.
1. Introduction to Hashing
Hashing is a fundamental data structure technique that maps data of arbitrary size to fixed-size values, enabling efficient data retrieval and storage operations.
What is Hashing?
Hashing uses hash functions to convert keys into array indices, providing average O(1) time complexity for search, insert, and delete operations.
Features & Benefits
- Constant time average complexity for operations
- Efficient data retrieval and storage
- Foundation for databases and caches
- Essential for password storage and verification
- Used in compiler symbol tables
Hashing Visualization
Key → Hash Function → Index → Storage
2. Hash Functions
A hash function (h(k)) converts a key into an array index, ideally distributing keys uniformly across the hash table.
Characteristics of Good Hash Functions
- Uniform distribution across hash table
- Fast computation (constant time)
- Deterministic (same input → same output)
- Minimizes collisions between different keys
function hashFunction(key, tableSize) {
let hash = 0;
for (let i = 0; i < key.length; i++) {
hash = (hash * 31 + key.charCodeAt(i)) % tableSize;
}
return hash;
}
// Example usage
const index = hashFunction("example", 100); // Returns: 42
3. Collision Handling Techniques
Collisions occur when two different keys hash to the same index. Various techniques exist to handle these collisions efficiently.
A. Chaining (Separate Chaining)
Each table index stores a linked list of key-value pairs. When collisions occur, new elements are appended to the list.
| Advantages | Disadvantages |
|---|---|
| Simple to implement and understand | Extra memory for pointer storage |
| No clustering issues | Cache inefficient due to linked lists |
| Handles unlimited collisions gracefully | Worst-case performance degrades to O(n) |
B. Open Addressing
All keys are stored within the hash table itself. When collisions occur, the algorithm probes for the next available slot.
class HashTable {
constructor(size) {
this.table = new Array(size);
this.size = size;
}
insert(key, value) {
let index = this.hash(key);
// Linear probing
while (this.table[index] !== undefined) {
index = (index + 1) % this.size;
}
this.table[index] = { key, value };
}
hash(key) {
return key.length % this.size;
}
}
4. Load Factor & Rehashing
Load Factor (α)
The load factor measures how full the hash table is, calculated as:
α = n / m Where: n = number of keys stored m = hash table size
Load Factor Interactive Demo
Adjust the table size and items to see load factor changes:
Rehashing Process
When load factor exceeds threshold (typically 0.7-0.75), rehashing creates a larger table and reinserts all keys:
- Create new hash table with larger size (usually double)
- Recompute hash values for all existing keys
- Insert all key-value pairs into new table
- Replace old table with new table
5. Real-World Applications
Where Hashing is Used
- Databases: Indexing for fast record retrieval
- Caching Systems: Memcached, Redis for quick data access
- Password Storage: Secure hashing with salt (bcrypt, SHA-256)
- Compiler Design: Symbol tables for variable tracking
- Cryptography: Digital signatures, message integrity
- Deduplication: Identifying duplicate files or data
Complexity Analysis
| Operation | Average Case | Worst Case |
|---|---|---|
| Search | O(1) | O(n) |
| Insert | O(1) | O(n) |
| Delete | O(1) | O(n) |
Project: Implement a Hash Table
Build a complete hash table implementation with collision handling and rehashing functionality.
class HashTable {
constructor(initialSize = 16, loadFactorThreshold = 0.75) {
this.table = new Array(initialSize);
this.size = initialSize;
this.count = 0;
this.loadFactorThreshold = loadFactorThreshold;
}
hash(key) {
let hash = 0;
for (let i = 0; i < key.length; i++) {
hash = (hash * 31 + key.charCodeAt(i)) % this.size;
}
return hash;
}
insert(key, value) {
// Check load factor
if (this.count / this.size > this.loadFactorThreshold) {
this.rehash();
}
let index = this.hash(key);
// Handle collisions with chaining
if (!this.table[index]) {
this.table[index] = [];
}
// Check if key already exists
for (let pair of this.table[index]) {
if (pair[0] === key) {
pair[1] = value; // Update existing
return;
}
}
// Add new key-value pair
this.table[index].push([key, value]);
this.count++;
}
get(key) {
const index = this.hash(key);
if (!this.table[index]) return undefined;
for (let pair of this.table[index]) {
if (pair[0] === key) {
return pair[1];
}
}
return undefined;
}
rehash() {
const oldTable = this.table;
const oldSize = this.size;
// Double the size
this.size = oldSize * 2;
this.table = new Array(this.size);
this.count = 0;
// Reinsert all elements
for (let i = 0; i < oldSize; i++) {
if (oldTable[i]) {
for (let [key, value] of oldTable[i]) {
this.insert(key, value);
}
}
}
}
}
// Usage example
const hashTable = new HashTable();
hashTable.insert("name", "Alice");
hashTable.insert("age", 30);
console.log(hashTable.get("name")); // Output: "Alice"
