Hashing & Data Structures – Full-Stack

1. Introduction to Hashing

Hashing is a fundamental data structure technique that maps data of arbitrary size to fixed-size values, enabling efficient data retrieval and storage operations.

What is Hashing?

Hashing uses hash functions to convert keys into array indices, providing average O(1) time complexity for search, insert, and delete operations.

Features & Benefits

  • Constant time average complexity for operations
  • Efficient data retrieval and storage
  • Foundation for databases and caches
  • Essential for password storage and verification
  • Used in compiler symbol tables

Hashing Visualization

Key → Hash Function → Index → Storage

2. Hash Functions

A hash function (h(k)) converts a key into an array index, ideally distributing keys uniformly across the hash table.

Characteristics of Good Hash Functions

  • Uniform distribution across hash table
  • Fast computation (constant time)
  • Deterministic (same input → same output)
  • Minimizes collisions between different keys
Simple Hash Function Implementation
function hashFunction(key, tableSize) {
    let hash = 0;
    for (let i = 0; i < key.length; i++) {
        hash = (hash * 31 + key.charCodeAt(i)) % tableSize;
    }
    return hash;
}

// Example usage
const index = hashFunction("example", 100); // Returns: 42

3. Collision Handling Techniques

Collisions occur when two different keys hash to the same index. Various techniques exist to handle these collisions efficiently.

A. Chaining (Separate Chaining)

Each table index stores a linked list of key-value pairs. When collisions occur, new elements are appended to the list.

Advantages Disadvantages
Simple to implement and understand Extra memory for pointer storage
No clustering issues Cache inefficient due to linked lists
Handles unlimited collisions gracefully Worst-case performance degrades to O(n)

B. Open Addressing

All keys are stored within the hash table itself. When collisions occur, the algorithm probes for the next available slot.

Linear Probing Implementation
class HashTable {
    constructor(size) {
        this.table = new Array(size);
        this.size = size;
    }

    insert(key, value) {
        let index = this.hash(key);
        
        // Linear probing
        while (this.table[index] !== undefined) {
            index = (index + 1) % this.size;
        }
        
        this.table[index] = { key, value };
    }

    hash(key) {
        return key.length % this.size;
    }
}

4. Load Factor & Rehashing

Load Factor (α)

The load factor measures how full the hash table is, calculated as:

Load Factor Formula
α = n / m

Where:
n = number of keys stored
m = hash table size

Load Factor Interactive Demo

Adjust the table size and items to see load factor changes:

Table Size: 50
Items Stored: 35
Load Factor: 0.70

Rehashing Process

When load factor exceeds threshold (typically 0.7-0.75), rehashing creates a larger table and reinserts all keys:

  1. Create new hash table with larger size (usually double)
  2. Recompute hash values for all existing keys
  3. Insert all key-value pairs into new table
  4. Replace old table with new table

5. Real-World Applications

Where Hashing is Used

  • Databases: Indexing for fast record retrieval
  • Caching Systems: Memcached, Redis for quick data access
  • Password Storage: Secure hashing with salt (bcrypt, SHA-256)
  • Compiler Design: Symbol tables for variable tracking
  • Cryptography: Digital signatures, message integrity
  • Deduplication: Identifying duplicate files or data

Complexity Analysis

Operation Average Case Worst Case
Search O(1) O(n)
Insert O(1) O(n)
Delete O(1) O(n)

Project: Implement a Hash Table

Build a complete hash table implementation with collision handling and rehashing functionality.

Complete Hash Table Implementation
class HashTable {
    constructor(initialSize = 16, loadFactorThreshold = 0.75) {
        this.table = new Array(initialSize);
        this.size = initialSize;
        this.count = 0;
        this.loadFactorThreshold = loadFactorThreshold;
    }

    hash(key) {
        let hash = 0;
        for (let i = 0; i < key.length; i++) {
            hash = (hash * 31 + key.charCodeAt(i)) % this.size;
        }
        return hash;
    }

    insert(key, value) {
        // Check load factor
        if (this.count / this.size > this.loadFactorThreshold) {
            this.rehash();
        }
        
        let index = this.hash(key);
        
        // Handle collisions with chaining
        if (!this.table[index]) {
            this.table[index] = [];
        }
        
        // Check if key already exists
        for (let pair of this.table[index]) {
            if (pair[0] === key) {
                pair[1] = value; // Update existing
                return;
            }
        }
        
        // Add new key-value pair
        this.table[index].push([key, value]);
        this.count++;
    }

    get(key) {
        const index = this.hash(key);
        if (!this.table[index]) return undefined;
        
        for (let pair of this.table[index]) {
            if (pair[0] === key) {
                return pair[1];
            }
        }
        return undefined;
    }

    rehash() {
        const oldTable = this.table;
        const oldSize = this.size;
        
        // Double the size
        this.size = oldSize * 2;
        this.table = new Array(this.size);
        this.count = 0;
        
        // Reinsert all elements
        for (let i = 0; i < oldSize; i++) {
            if (oldTable[i]) {
                for (let [key, value] of oldTable[i]) {
                    this.insert(key, value);
                }
            }
        }
    }
}

// Usage example
const hashTable = new HashTable();
hashTable.insert("name", "Alice");
hashTable.insert("age", 30);
console.log(hashTable.get("name")); // Output: "Alice"