Hashing - Full-Stack

1. Introduction to Hashing

Hashing is a fundamental data structure technique that maps data of arbitrary size to fixed-size values, enabling efficient data retrieval and storage operations.

What is Hashing?

Hashing uses hash functions to convert keys into array indices, providing average O(1) time complexity for search, insert, and delete operations.

Features & Benefits

Constant time average complexity for operations
Efficient data retrieval and storage
Foundation for databases and caches
Essential for password storage and verification
Used in compiler symbol tables

Hashing Visualization

Key → Hash Function → Index → Storage

2. Hash Functions

A hash function (h(k)) converts a key into an array index, ideally distributing keys uniformly across the hash table.

Characteristics of Good Hash Functions

Uniform distribution across hash table
Fast computation (constant time)
Deterministic (same input → same output)
Minimizes collisions between different keys

Simple Hash Function Implementation

function hashFunction(key, tableSize) {
    let hash = 0;
    for (let i = 0; i < key.length; i++) {
        hash = (hash * 31 + key.charCodeAt(i)) % tableSize;
    }
    return hash;
}

// Example usage
const index = hashFunction("example", 100); // Returns: 42

3. Collision Handling Techniques

Collisions occur when two different keys hash to the same index. Various techniques exist to handle these collisions efficiently.

A. Chaining (Separate Chaining)

Each table index stores a linked list of key-value pairs. When collisions occur, new elements are appended to the list.

Advantages	Disadvantages
Simple to implement and understand	Extra memory for pointer storage
No clustering issues	Cache inefficient due to linked lists
Handles unlimited collisions gracefully	Worst-case performance degrades to O(n)

B. Open Addressing

All keys are stored within the hash table itself. When collisions occur, the algorithm probes for the next available slot.

Linear Probing Implementation

class HashTable {
    constructor(size) {
        this.table = new Array(size);
        this.size = size;
    }

    insert(key, value) {
        let index = this.hash(key);
        
        // Linear probing
        while (this.table[index] !== undefined) {
            index = (index + 1) % this.size;
        }
        
        this.table[index] = { key, value };
    }

    hash(key) {
        return key.length % this.size;
    }
}

4. Load Factor & Rehashing

Load Factor (α)

The load factor measures how full the hash table is, calculated as:

Load Factor Formula

α = n / m

Where:
n = number of keys stored
m = hash table size

Load Factor Interactive Demo

Adjust the table size and items to see load factor changes:

Table Size: 50

Items Stored: 35

Load Factor: 0.70

Rehashing Process

When load factor exceeds threshold (typically 0.7-0.75), rehashing creates a larger table and reinserts all keys:

Create new hash table with larger size (usually double)
Recompute hash values for all existing keys
Insert all key-value pairs into new table
Replace old table with new table

5. Real-World Applications

Where Hashing is Used

Databases: Indexing for fast record retrieval
Caching Systems: Memcached, Redis for quick data access
Password Storage: Secure hashing with salt (bcrypt, SHA-256)
Compiler Design: Symbol tables for variable tracking
Cryptography: Digital signatures, message integrity
Deduplication: Identifying duplicate files or data

Complexity Analysis

Operation	Average Case	Worst Case
Search	O(1)	O(n)
Insert	O(1)	O(n)
Delete	O(1)	O(n)

Project: Implement a Hash Table

Build a complete hash table implementation with collision handling and rehashing functionality.

Complete Hash Table Implementation

class HashTable {
    constructor(initialSize = 16, loadFactorThreshold = 0.75) {
        this.table = new Array(initialSize);
        this.size = initialSize;
        this.count = 0;
        this.loadFactorThreshold = loadFactorThreshold;
    }

    hash(key) {
        let hash = 0;
        for (let i = 0; i < key.length; i++) {
            hash = (hash * 31 + key.charCodeAt(i)) % this.size;
        }
        return hash;
    }

    insert(key, value) {
        // Check load factor
        if (this.count / this.size > this.loadFactorThreshold) {
            this.rehash();
        }
        
        let index = this.hash(key);
        
        // Handle collisions with chaining
        if (!this.table[index]) {
            this.table[index] = [];
        }
        
        // Check if key already exists
        for (let pair of this.table[index]) {
            if (pair[0] === key) {
                pair[1] = value; // Update existing
                return;
            }
        }
        
        // Add new key-value pair
        this.table[index].push([key, value]);
        this.count++;
    }

    get(key) {
        const index = this.hash(key);
        if (!this.table[index]) return undefined;
        
        for (let pair of this.table[index]) {
            if (pair[0] === key) {
                return pair[1];
            }
        }
        return undefined;
    }

    rehash() {
        const oldTable = this.table;
        const oldSize = this.size;
        
        // Double the size
        this.size = oldSize * 2;
        this.table = new Array(this.size);
        this.count = 0;
        
        // Reinsert all elements
        for (let i = 0; i < oldSize; i++) {
            if (oldTable[i]) {
                for (let [key, value] of oldTable[i]) {
                    this.insert(key, value);
                }
            }
        }
    }
}

// Usage example
const hashTable = new HashTable();
hashTable.insert("name", "Alice");
hashTable.insert("age", 30);
console.log(hashTable.get("name")); // Output: "Alice"