Understanding Hash Tables and Hash Functions*#
A hash table (or hash map) is a data structure that efficiently stores and retrieves key-value pairs using a hash function. The function converts a key into an index used to store and locate values quickly.
Real-World Use Cases of Hash Tables#
Hash tables are widely used in software development, databases, and networking due to their fast access times. Here are some key applications:
Databases and Caching
Indexing: Hash tables help optimize database search operations.
Caching: Used in applications like Redis and Memcached to store frequently accessed data.
Compilers and Interpreters
Symbol Tables: Stores variable names, function names, and their corresponding memory addresses.
Networking
Routing Tables: Hash tables map IP addresses to network devices for efficient packet routing.
DNS Resolution: Maps domain names to IP addresses.
Cybersecurity
Cryptographic Hashing: Used to store passwords securely (e.g., SHA-256, bcrypt).
Data Integrity Verification: Hash functions check if files have been altered.
Operating Systems
Process Scheduling: Hash tables store process states efficiently.
File Systems: Used in file indexing systems.
Machine Learning & AI
Feature Hashing: Converts large input spaces into a fixed-size representation.
Nearest Neighbor Search: Accelerates similarity searches.
Blockchain and Cryptography
Bitcoin and Ethereum use Merkle Trees, which rely on hash functions for data integrity.
Web Development
Session Management: Hash maps store user sessions efficiently.
Auto-Complete Features: Maps user input to predicted results.
Because of their efficiency and versatility, hash tables are a foundational data structure across various computing fields.
How Hash Tables Work#
Hashing Function: A function converts a key (e.g., a string or number) into an integer.
Indexing: The computed hash determines the storage location in an array.
Collision Handling: Since multiple keys can hash to the same index (collision), hash tables use methods like chaining or open addressing.
Understanding Hash Functions#
A hash function maps input data (keys) to an integer index in a hash table.
Example of a simple hash function in Java:
private int hash(String key) {
return Math.abs(key.hashCode()) % SIZE; // Uses Java's built-in hashCode()
}
The
hash()
method uses Java’shashCode()
function.The modulus operation ensures the index falls within the valid range of the array.
Key Properties of a Good Hash Function#
Deterministic: The same input always produces the same output.
Uniform Distribution: Keys should be distributed evenly across the table.
Efficient Computation: Fast calculation of hash values.
Minimal Collisions: Different keys should ideally generate unique outputs.
Collision Handling Methods#
Since different keys may hash to the same index, hash tables use collision resolution strategies.
1. Chaining (Separate Chaining)#
Each index in the table holds a linked list (or another data structure) to store multiple key-value pairs.
If a collision occurs, new elements are simply appended to the linked list at the index.
Chaining Implementation in Java#
import java.util.LinkedList;
class HashTableChaining {
private static final int SIZE = 10;
private LinkedList<Entry>[] table;
public HashTableChaining() {
table = new LinkedList[SIZE];
for (int i = 0; i < SIZE; i++) {
table[i] = new LinkedList<>();
}
}
private int hash(String key) {
return Math.abs(key.hashCode()) % SIZE;
}
public void insert(String key, String value) {
int index = hash(key);
for (Entry entry : table[index]) {
if (entry.key.equals(key)) {
entry.value = value;
return;
}
}
table[index].add(new Entry(key, value));
}
public String search(String key) {
int index = hash(key);
for (Entry entry : table[index]) {
if (entry.key.equals(key)) {
return entry.value;
}
}
return null;
}
public void delete(String key) {
int index = hash(key);
table[index].removeIf(entry -> entry.key.equals(key));
}
private static class Entry {
String key, value;
public Entry(String key, String value) { this.key = key; this.value = value; }
}
public static void main(String[] args) {
HashTableChaining ht = new HashTableChaining();
ht.insert("name", "Alice");
ht.insert("age", "25");
System.out.println(ht.search("name")); // Output: Alice
ht.delete("name");
System.out.println(ht.search("name")); // Output: null
}
}
2. Open Addressing#
Instead of using separate lists, open addressing stores all entries directly in the table. If a collision occurs, the algorithm searches for an available slot using a probing technique.
Types of Open Addressing#
Linear Probing: Check the next available slot sequentially.
Quadratic Probing: Check slots at increasing intervals (e.g., 1², 2², 3²…).
Double Hashing: Use a secondary hash function to find new slots.
Open Addressing Implementation in Java (Linear Probing)#
class HashTableOpenAddressing {
private static final int SIZE = 10;
private Entry[] table;
private static final Entry DELETED = new Entry(null, null); // Marker for deleted entries
public HashTableOpenAddressing() {
table = new Entry[SIZE];
}
private int hash(String key) {
return Math.abs(key.hashCode()) % SIZE;
}
public void insert(String key, String value) {
int index = hash(key);
while (table[index] != null && table[index] != DELETED) {
index = (index + 1) % SIZE; // Linear probing
}
table[index] = new Entry(key, value);
}
public String search(String key) {
int index = hash(key);
while (table[index] != null) {
if (table[index] != DELETED && table[index].key.equals(key)) {
return table[index].value;
}
index = (index + 1) % SIZE;
}
return null;
}
public void delete(String key) {
int index = hash(key);
while (table[index] != null) {
if (table[index].key.equals(key)) {
table[index] = DELETED; // Mark as deleted
return;
}
index = (index + 1) % SIZE;
}
}
private static class Entry {
String key, value;
public Entry(String key, String value) { this.key = key; this.value = value; }
}
public static void main(String[] args) {
HashTableOpenAddressing ht = new HashTableOpenAddressing();
ht.insert("name", "Alice");
ht.insert("age", "25");
System.out.println(ht.search("name")); // Output: Alice
ht.delete("name");
System.out.println(ht.search("name")); // Output: null
}
}
Choosing the Right Hashing Method#
For general hash tables:
MurmurHash
orFNV-1a
For security:
SHA-256
For simplicity:
DJB2
orRemainder
For handling collisions:
Use chaining when working with dynamic-sized tables.
Use open addressing when memory efficiency is critical.