Huffman Code
What is the Huffman Algorithm?
At its core, the Huffman Algorithm is a lossless data compression algorithm. The “lossless” part is key – it means that when we compress data using this method and then decompress it, we get back an exact copy of the original data. Nothing is lost in the process.
The genius of the Huffman Algorithm, developed by David A. Huffman while he was a Ph.D. student at MIT in 1952, lies in how it assigns codes to characters. Instead of using a fixed number of bits for every character (like in standard ASCII, which uses 7 or 8 bits per character), Huffman coding uses variable-length codes. The most frequently occurring characters get shorter codes, and the less frequent characters get longer codes. This ultimately leads to an overall reduction in the number of bits needed to represent the data.
Think of it like this: if you’re sending a message and the letter ‘e’ appears most often, wouldn’t it be more efficient to have a very short symbol for ‘e’, and perhaps a longer one for a rare letter like ‘z’? That’s the fundamental idea.
How Does It Work? The Encoding Process
The Huffman Algorithm works by building a special type of binary tree called a Huffman Tree, whose root-to-leaf paths define a prefix code. Here’s a step-by-step breakdown:
Count Frequencies: First, we need to know how often each character (or symbol) appears in the data we want to compress. So, we scan the entire input data and count the occurrences of each unique character.
Example:
Let’s say our data is “ABRACADABRA”.
A: 5 times
B: 2 times
R: 2 times
C: 1 time
D: 1 time
Create Leaf Nodes: For each unique character, we create a “leaf node” (think of the bottom-most parts of a tree). Each leaf node will store the character itself and its frequency.
Build the Tree (The Greedy Part): This is where the “greedy” nature of the algorithm comes in. A greedy algorithm is one that makes the locally optimal choice at each stage with the hope of finding a global optimum. (In Huffman’s case, this greedy strategy provably produces an optimal prefix code.)
We treat our list of leaf nodes as a priority queue, where the priority is determined by the frequency (lowest frequency has the highest priority for being picked).
Repeat the following steps until only one node (the root of the Huffman Tree) remains:
a. Select Two Nodes: Extract the two nodes with the lowest frequencies from the queue.
b. Create a Parent Node: Create a new internal node (a node that is not a leaf). This new node will be the parent of the two nodes selected in step (a).
c. Assign Frequency to Parent: The frequency of this new parent node is the sum of the frequencies of its two children.
d. Add to Queue: Insert this new parent node back into the priority queue.
Let’s visualize with our “ABRACADABRA” example:
Initial nodes (Character: Frequency): (C:1), (D:1), (B:2), (R:2), (A:5)
Step 1: Pick (C:1) and (D:1). Create parent (CD:2). Nodes: (CD:2), (B:2), (R:2), (A:5)
Step 2: Pick (CD:2) and (B:2). (We could equally pick (R:2) here – when frequencies tie, the choice is arbitrary; it may change the tree’s shape slightly, but not the overall compression efficiency.) Create parent (CDB:4). Nodes: (R:2), (CDB:4), (A:5)
Step 3: Pick (R:2) and (CDB:4). (Note: (R:2) has lower frequency). Create parent (RCDB:6). Nodes: (A:5), (RCDB:6)
Step 4: Pick (A:5) and (RCDB:6). Create parent (ARCDB:11). This is our root node.
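Before moving on to assigning codes: if you would like to see this greedy merging in code right away, here is a minimal standalone sketch (a hypothetical helper, not part of the article’s program; it assumes Java 16+ for the record syntax) that prints the merge order for “ABRACADABRA” using Java’s PriorityQueue. Because several frequencies tie, the queue may pick a different (but equally valid) order than the hand trace above.
import java.util.Comparator;
import java.util.PriorityQueue;

public class MergeTrace {
    // Simple (label, frequency) pair used only for this trace.
    record Item(String label, int freq) {}

    public static void main(String[] args) {
        PriorityQueue<Item> pq = new PriorityQueue<>(Comparator.comparingInt(Item::freq));
        pq.add(new Item("C", 1));
        pq.add(new Item("D", 1));
        pq.add(new Item("B", 2));
        pq.add(new Item("R", 2));
        pq.add(new Item("A", 5));
        while (pq.size() > 1) {
            Item a = pq.poll(); // lowest frequency
            Item b = pq.poll(); // second lowest
            Item parent = new Item(a.label() + b.label(), a.freq() + b.freq());
            System.out.println("merge " + a + " + " + b + " -> " + parent);
            pq.add(parent);     // the combined node goes back into the queue
        }
    }
}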
Assign Codes: Once the Huffman Tree is built, we assign binary codes to each character. We do this by traversing the tree from the root to each leaf node:
Assign a ‘0’ to every left branch (or edge) in the tree.
Assign a ‘1’ to every right branch (or edge) in the tree.
The Huffman code for each character is the sequence of 0s and 1s encountered on the path from the root to that character’s leaf node.
Continuing our example (the exact codes depend on how the left and right children are arranged; let’s assume one arrangement):
A possible tree structure and codes:
            (11)
           /    \
        A(5)    (6)
               /    \
             (4)    R(2)
            /    \
         B(2)    (2)
                /    \
             C(1)    D(1)
If we assign ‘0’ for left and ‘1’ for right:
A: 0
R: 11
B: 100
C: 1010
D: 1011
Notice how ‘A’ (most frequent) has the shortest code, and ‘C’ and ‘D’ (least frequent) have the longest codes. This is the magic of Huffman coding! Also, importantly, no code is a prefix of another code (e.g., ‘0’ is A’s code, no other code starts with ‘0’). This “prefix property” is crucial for decoding.
Encode the Data: Replace each character in the original data with its newly generated Huffman code.
“ABRACADABRA” becomes:
0 100 11 0 1010 0 1011 0 100 11 0
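Removing the spaces (they are shown only for readability), that is 5×1 + 2×3 + 2×2 + 1×4 + 1×4 = 23 bits in total, versus 11 × 8 = 88 bits if each character were stored as 8-bit ASCII – roughly a 74% reduction, ignoring the small overhead of storing the tree or code table itself.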
How Does It Work? The Decoding Process
Decoding is relatively straightforward if you have the Huffman Tree (or the code table).
Obtain the Huffman Tree/Codes: The decoder needs the same Huffman Tree (or the mapping of codes to characters) that was used for encoding. This tree/table is often transmitted along with the compressed data.
Read the Compressed Bit Stream: Start reading the encoded data bit by bit.
Traverse the Tree:
Begin at the root of the Huffman Tree.
For each bit read from the compressed stream:
If the bit is ‘0’, move to the left child in the tree.
If the bit is ‘1’, move to the right child in the tree.
When you reach a leaf node, you have decoded one character. Record that character.
Return to the root of the tree and repeat the process with the next bit in the compressed stream until all bits are processed.
Example of decoding 010011... using our codes:
Read 0: Path is root -> left. It’s ‘A’. Output ‘A’. Back to root.
Read 1: Path is root -> right. Not a leaf.
Read 0: Path is current node -> left. Not a leaf.
Read 0: Path is current node -> left. It’s ‘B’. Output ‘B’. Back to root.
Read 1: Path is root -> right. Not a leaf.
Read 1: Path is current node -> right. It’s ‘R’. Output ‘R’. Back to root. And so on.
Because of the prefix property (no code is a prefix of another), there’s no ambiguity in decoding. As soon as you reach a leaf, you know you’ve completed a character.
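If you ever want to sanity-check a code table for the prefix property, a tiny helper like the following works (an illustrative sketch, not part of the article’s program; Map.of requires Java 9 or later):
import java.util.Map;

public class PrefixCheck {
    // Returns true if no code in the table is a prefix of another code.
    static boolean isPrefixFree(Map<Character, String> codes) {
        for (String a : codes.values()) {
            for (String b : codes.values()) {
                if (!a.equals(b) && b.startsWith(a)) {
                    return false; // 'a' is a prefix of 'b' -> decoding would be ambiguous
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<Character, String> codes =
            Map.of('A', "0", 'R', "11", 'B', "100", 'C', "1010", 'D', "1011");
        System.out.println(isPrefixFree(codes)); // prints: true
    }
}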
Where is Huffman Coding Used?
Huffman coding, either in its pure form or as a component of other algorithms, is used in a wide variety of applications. Here are some prominent examples:
File Compression:
ZIP files: Classic .zip files often use a method called “Deflate,” which combines LZ77 (another compression algorithm) with Huffman coding.
GZIP files: Used extensively on Unix-like systems and for web content, GZIP also uses Deflate.
Image Compression:
JPEG: While JPEG primarily uses a “lossy” compression (Discrete Cosine Transform), Huffman coding is often used as one of the final, lossless steps to compress the resulting data coefficients.
PNG: This is a lossless image format, and it uses the Deflate algorithm (which includes Huffman coding) for compression.
Multimedia Formats:
MP3 (Audio): Huffman coding is used as part of the MP3 compression process to reduce the size of audio data.
MPEG (Video): Similar to JPEGs and MP3s, various MPEG video formats (like MPEG-2 used in DVDs, MPEG-4 used in many online videos) incorporate Huffman coding for entropy encoding parts of the video and audio data.
Communication Protocols:
HTTP/2 and HTTP/3 (HPACK and QPACK): When your browser communicates with a web server using newer HTTP versions, header compression techniques like HPACK (for HTTP/2) and QPACK (for HTTP/3) use Huffman coding to reduce the size of HTTP headers, making web pages load faster.
Fax machines also historically used forms of Huffman coding.
Text and Data Archiving: Anytime there’s a need to store or transmit text or other data more efficiently without losing information, Huffman coding or algorithms incorporating it can be considered.
In summary, the Huffman Algorithm is a foundational technique in data compression. By assigning shorter codes to more frequent symbols and longer codes to less frequent ones, it efficiently reduces the overall size of data for storage or transmission, all while ensuring the original data can be perfectly reconstructed.
Example Code
package huffman_code;

import java.util.*;

// Node class for Huffman tree
class Node {
    char ch;
    int freq;
    Node left, right;

    Node(char ch, int freq)
    {
        this.ch = ch;
        this.freq = freq;
    }

    Node(int freq, Node left, Node right)
    {
        this.ch = '\0'; // internal node
        this.freq = freq;
        this.left = left;
        this.right = right;
    }

    boolean isLeaf()
    {
        return left == null && right == null;
    }
}

class HuffmanCode {
    private Node root;
    private Map<Character, String> huffmanCode;

    public HuffmanCode()
    {
    }

    public void runHuffman(String text)
    {
        Map<Character, Integer> freqMap = buildFrequencyMap(text);
        this.root = buildTree(freqMap);
        this.huffmanCode = new HashMap<>();
        buildCodes(root, "", huffmanCode);
    }

    private Map<Character, Integer> buildFrequencyMap(String text)
    {
        Map<Character, Integer> freqMap = new HashMap<>();
        for (char ch : text.toCharArray()) {
            freqMap.put(ch, freqMap.getOrDefault(ch, 0) + 1);
        }
        return freqMap;
    }

    private Node buildTree(Map<Character, Integer> freqMap)
    {
        PriorityQueue<Node> pq = new PriorityQueue<>(Comparator.comparingInt(n -> n.freq));
        for (Map.Entry<Character, Integer> entry : freqMap.entrySet()) {
            pq.add(new Node(entry.getKey(), entry.getValue()));
        }
        while (pq.size() > 1) {
            Node left = pq.poll();
            Node right = pq.poll();
            pq.add(new Node(left.freq + right.freq, left, right));
        }
        return pq.poll(); // root
    }

    private void buildCodes(Node node, String code, Map<Character, String> map)
    {
        if (node == null) {
            return;
        }
        if (node.isLeaf()) {
            map.put(node.ch, code);
        }
        buildCodes(node.left, code + "0", map);
        buildCodes(node.right, code + "1", map);
    }

    public String encode(String text)
    {
        StringBuilder encoded = new StringBuilder();
        for (char ch : text.toCharArray()) {
            encoded.append(huffmanCode.get(ch));
        }
        return encoded.toString();
    }

    public String decode(String encodedText)
    {
        StringBuilder decoded = new StringBuilder();
        Node current = root;
        for (char bit : encodedText.toCharArray()) {
            current = (bit == '0') ? current.left : current.right;
            if (current.isLeaf()) {
                decoded.append(current.ch);
                current = root;
            }
        }
        return decoded.toString();
    }

    public Map<Character, String> getHuffmanCodes()
    {
        return this.huffmanCode;
    }

    public String stringToBinary(String text)
    {
        String newString = "";
        StringBuilder sb;
        for (int i = 0; i < text.length(); i++) {
            String myBinary = "";
            int asciVal = Integer.valueOf(text.charAt(i));
            while (asciVal > 0) {
                myBinary = myBinary + (asciVal % 2 == 1 ? "1" : "0");
                asciVal /= 2;
            }
            sb = new StringBuilder(myBinary).reverse();
            newString += sb + "";
        }
        return newString;
    }
}

public class Huffman_Code {
    public static void main(String[] args)
    {
        String quit = "n";
        Scanner myScan = new Scanner(System.in);
        String text = "huffman";
        HuffmanCode huffman = new HuffmanCode();
        System.out.println("Huffman Code");
        while (quit.equals("n")) {
            System.out.print("Please enter a word or string: ");
            text = myScan.nextLine();
            huffman.runHuffman(text);
            // Print codes
            System.out.println(text);
            for (var entry : huffman.getHuffmanCodes().entrySet()) {
                System.out.println(entry.getKey() + ": " + entry.getValue());
            }
            // Encode
            String encoded = huffman.encode(text);
            System.out.println("\nEncoded Text : " + encoded);
            System.out.println("Binary : " + huffman.stringToBinary(text));
            // Decode
            String decoded = huffman.decode(encoded);
            System.out.println("\nDecoded Text:\n" + decoded);
            System.out.print("Would you like to quit: y/n: ");
            quit = myScan.nextLine();
        }
        myScan.close();
        System.out.println("\nbye\n");
    }
}
/*
Please enter a word or string: Mississippi
Mississippi
p: 101
s: 0
i: 11
M: 100
Encoded Text : 100110011001110110111
Binary : 10011011101001111001111100111101001111001111100111101001111000011100001101001
Decoded Text:
Mississippi
Would you like to quit: y/n: n
Please enter a word or string: Bubba blows bubbles
Bubba blows bubbles
: 010
a: 0000
B: 10111
b: 11
s: 011
u: 100
e: 1010
w: 0001
l: 001
o: 10110
Encoded Text : 1011110011110000010110011011000010110101110011110011010011
Binary : 10000101110101110001011000101100001100000110001011011001101111111011111100111000001100010111010111000101100010110110011001011110011
Decoded Text:
Bubba blows bubbles
Would you like to quit: y/n: y
bye
*/
Code Explanation
We have two main pieces here: a Node class and a HuffmanCode class.
1. The Node Class: The Building Blocks of Our Tree
Java
// Node class for Huffman tree
class Node {
    char ch;          // The character this node might represent (if it's a leaf)
    int freq;         // How often this character appears, or sum of frequencies of children
    Node left, right; // The children of this node (null if it's a leaf)

    // Constructor for LEAF nodes (nodes that represent actual characters)
    Node(char ch, int freq)
    {
        this.ch = ch;
        this.freq = freq;
        // left and right will be null by default for a new leaf node
    }

    // Constructor for INTERNAL nodes (nodes that are created by combining two other nodes)
    Node(int freq, Node left, Node right)
    {
        this.ch = '\0'; // A null character, indicating it's not a character leaf
        this.freq = freq;
        this.left = left;
        this.right = right;
    }

    // Helper method to check if this node is a leaf node
    boolean isLeaf()
    {
        return left == null && right == null; // A leaf has no children
    }
}
Think of the Node class as the blueprint for every single point or junction in our Huffman Tree.
char ch: This variable holds the actual character (like ‘A’, ‘B’, ‘C’) if this node is a leaf node – meaning it’s at the very bottom of a branch and represents a character from our input text.
int freq: This stores the frequency. If it’s a leaf node, freq is how many times its character (ch) appears in the text. If it’s an internal node (a connection point within the tree, not a character itself), freq is the sum of the frequencies of all the leaves below it.
Node left, right: These are references to other Node objects. They are like the branches of the tree. left points to the child node on the left, and right points to the child node on the right. If a node is a leaf, both left and right will be null (meaning they don’t point to anything).
Constructors (the Node(…) methods): These are special methods used to create new Node objects.
Node(char ch, int freq): This is used to create a leaf node. You give it a character and its frequency.
Node(int freq, Node left, Node right): This is used to create an internal node. You give it the combined frequency of its children, and references to what its left child and right child should be. We use a special character '\0' (the null character) for ch just to signify that this internal node doesn’t represent a specific character itself.
isLeaf() method: boolean isLeaf() is a handy little helper. It returns true if the node has no left or right children (meaning it’s a leaf), and false otherwise.
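As a quick illustration of how the two constructors fit together, here is a small sketch (hypothetical demo class; because ch and freq are package-private, it would need to live in the same huffman_code package as Node):
Java
package huffman_code;

public class NodeDemo {
    public static void main(String[] args) {
        Node c = new Node('C', 1);                 // leaf for 'C'
        Node d = new Node('D', 1);                 // leaf for 'D'
        Node cd = new Node(c.freq + d.freq, c, d); // internal node with frequency 2
        System.out.println(c.isLeaf());            // prints: true
        System.out.println(cd.isLeaf());           // prints: false
    }
}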
2. The HuffmanCode Class: The Brains of the Operation
Java
class HuffmanCode {
    private Node root;                          // This will store the very top node (the root) of our Huffman tree
    private Map<Character, String> huffmanCode; // This will store the generated codes (e.g., 'A' -> "01")

    // Constructor (currently empty, could be used for setup if needed)
    public HuffmanCode()
    {
    }

    // Main method to orchestrate the Huffman encoding process
    public void runHuffman(String text)
    {
        // Step 1: Count character frequencies
        Map<Character, Integer> freqMap = buildFrequencyMap(text);
        // Step 2: Build the Huffman Tree using these frequencies
        this.root = buildTree(freqMap);
        // Step 3: Generate the codes (like "0" or "101") for each character from the tree
        this.huffmanCode = new HashMap<>(); // Initialize the map to store codes
        buildCodes(root, "", huffmanCode);  // Start building codes from the root
    }
The HuffmanCode class contains all the logic to take a piece of text, build the tree, generate the codes, and then encode or decode text.
private Node root: This variable will eventually hold the single Node that sits at the very top of our completed Huffman tree. It’s our entry point into the tree.
private Map<Character, String> huffmanCode: A Map is like a dictionary. This one stores each character from our text as a “key” and its corresponding Huffman code (a string of ‘0’s and ‘1’s) as its “value”. For example, {'A': "0", 'B': "100", ...}.
Let’s look at its key methods:
public void runHuffman(String text): This is the main manager for the Huffman process.
It first calls buildFrequencyMap(text) to count how many times each character appears in the input text.
Then, it calls buildTree(freqMap) using those frequencies to construct the actual Huffman tree. The result (the root of the tree) is stored in this.root.
Finally, it initializes this.huffmanCode as a new empty map and calls buildCodes(root, "", huffmanCode) to traverse the tree and fill the huffmanCode map with the binary codes for each character.
private Map<Character, Integer> buildFrequencyMap(String text):
Java
private Map<Character, Integer> buildFrequencyMap(String text)
{
    Map<Character, Integer> freqMap = new HashMap<>(); // Create an empty dictionary
    for (char ch : text.toCharArray()) {               // Loop through each character in the input text
        // Get the current count of 'ch', or 0 if not seen yet, then add 1
        freqMap.put(ch, freqMap.getOrDefault(ch, 0) + 1);
    }
    return freqMap; // Return the map with all character counts
}
This method does exactly what we described as the first step of the Huffman algorithm:
It creates an empty HashMap called freqMap.
It then loops through every single character (ch) in the input text.
For each character, it updates its count in freqMap. The getOrDefault(ch, 0) part is clever: it tries to get the current count of ch; if ch hasn’t been seen before, it defaults to 0, and then 1 is added.
It returns this populated freqMap. If text was “ABCA”, freqMap would look like {'A': 2, 'B': 1, 'C': 1}.
private Node buildTree(Map<Character, Integer> freqMap):
Java
private Node buildTree(Map<Character, Integer> freqMap)
{
    // Create a priority queue. Nodes with lower frequency have higher priority.
    PriorityQueue<Node> pq = new PriorityQueue<>(Comparator.comparingInt(n -> n.freq));
    // Create a leaf node for each character and add it to the priority queue.
    for (Map.Entry<Character, Integer> entry : freqMap.entrySet()) {
        pq.add(new Node(entry.getKey(), entry.getValue()));
    }
    // Loop as long as there is more than one node in the queue
    while (pq.size() > 1) {
        // Remove the two nodes with the smallest frequency
        Node left = pq.poll();  // Smallest
        Node right = pq.poll(); // Second smallest
        // Create a new internal node with these two nodes as children
        // and frequency equal to the sum of the two nodes' frequencies.
        // Add the new node to the priority queue.
        pq.add(new Node(left.freq + right.freq, left, right));
    }
    // The remaining node is the root of the Huffman Tree
    return pq.poll();
}
This is where the Huffman tree is actually built, following the “greedy” approach:
PriorityQueue<Node> pq = new PriorityQueue<>(Comparator.comparingInt(n -> n.freq)): A PriorityQueue is a special kind of list where items are automatically kept in order based on their “priority.” Here, we’re telling it to prioritize Node objects based on their freq attribute – specifically, nodes with lower frequencies are considered higher priority (meaning they come out first).
for (Map.Entry<Character, Integer> entry : freqMap.entrySet()) { ... }: This loops through our freqMap. For each character and its frequency, it creates a new leaf Node (using the first Node constructor) and adds it to the pq. So now, pq is full of leaf nodes, ordered by their frequency.
while (pq.size() > 1) { ... }: This loop is the heart of the tree construction. It keeps running as long as there’s more than one node in the pq.
Node left = pq.poll(); and Node right = pq.poll();: These take out the two nodes with the lowest frequencies from the pq (remember, lowest frequency = highest priority).
pq.add(new Node(left.freq + right.freq, left, right));: This creates a new internal Node. Its frequency is the sum of the frequencies of left and right; left becomes its left child, and right becomes its right child. This new internal node is then added back into the pq, where it finds its correct place in the priority order based on its new (combined) frequency.
return pq.poll();: When the loop finishes, there’s only one node left in the pq. This single node is the root of the entire Huffman Tree.
private void buildCodes(Node node, String code, Map<Character, String> map):
Java
private void buildCodes(Node node, String code, Map<Character, String> map)
{
    if (node == null) { // Base case: if we've gone past a leaf
        return;
    }
    // If this is a leaf node, we've found a character!
    // Store the character and its accumulated code in our map.
    if (node.isLeaf()) {
        map.put(node.ch, code);
    }
    // Recursively call for the left child, appending "0" to the code
    buildCodes(node.left, code + "0", map);
    // Recursively call for the right child, appending "1" to the code
    buildCodes(node.right, code + "1", map);
}
This method figures out the ‘0’s and ‘1’s for each character. It’s a recursive method, meaning it calls itself to solve smaller parts of the problem.
if (node == null) { return; }: This is a safety check, or “base case”, for the recursion. If we try to go to a child that doesn’t exist, we just stop that path.
if (node.isLeaf()) { map.put(node.ch, code); }: If the current node we’re looking at is a leaf node (it has a character!), we’ve found the end of a path from the root. The code string that we’ve built up by travelling down the tree is the Huffman code for node.ch. So we store this pair in our map (which is huffmanCode from the runHuffman method).
buildCodes(node.left, code + "0", map);: This is the recursive step. It calls buildCodes again, but this time for the left child of the current node. Crucially, it appends a "0" to the code string because we’re taking a left branch.
buildCodes(node.right, code + "1", map);: Similarly, it calls buildCodes for the right child, appending a "1" to the code string for taking a right branch.
Imagine starting at the root with an empty code string (""). If you go left, the code becomes “0”. If you go left again, it’s “00”. If you then reach a leaf for character ‘X’, ‘X’ gets the code “00”. The method backtracks and explores all paths.
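For the ABRACADABRA tree from earlier, the recursion visits the nodes in this order (a hand trace assuming the same left/right arrangement as before; the map argument is omitted for brevity):
buildCodes(root, "")      -> internal node, keep recursing
buildCodes(A(5), "0")     -> leaf: map 'A' -> "0"
buildCodes((6), "1")      -> internal node
buildCodes((4), "10")     -> internal node
buildCodes(B(2), "100")   -> leaf: map 'B' -> "100"
buildCodes((2), "101")    -> internal node
buildCodes(C(1), "1010")  -> leaf: map 'C' -> "1010"
buildCodes(D(1), "1011")  -> leaf: map 'D' -> "1011"
buildCodes(R(2), "11")    -> leaf: map 'R' -> "11"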
public String encode(String text):
Java
public String encode(String text)
{
    StringBuilder encoded = new StringBuilder();
    for (char ch : text.toCharArray()) {     // Loop through each character of the input text
        encoded.append(huffmanCode.get(ch)); // Look up its Huffman code and add it
    }
    return encoded.toString(); // Return the full string of 0s and 1s
}
Once runHuffman has been called (so the huffmanCode map is filled), this method takes the original text and converts it into the compressed string of 0s and 1s.
It iterates through each character (ch) of the input text.
For each ch, it looks up its corresponding Huffman code in the huffmanCode map (e.g., if ch is ‘A’, it might get “0”).
It appends this code to a StringBuilder (which is an efficient way to build strings).
Finally, it returns the complete encoded string.
public String decode(String encodedText):
Java
public String decode(String encodedText)
{
    StringBuilder decoded = new StringBuilder();
    Node current = root; // Start at the root of our Huffman tree
    for (char bit : encodedText.toCharArray()) { // Loop through each '0' or '1' in the encoded text
        // If the bit is '0', move to the left child. Otherwise, move to the right child.
        current = (bit == '0') ? current.left : current.right;
        // If we've reached a leaf node, we've decoded a character!
        if (current.isLeaf()) {
            decoded.append(current.ch); // Add the character to our result
            current = root;             // Go back to the root for the next character
        }
    }
    return decoded.toString(); // Return the fully decoded text
}
This method takes a compressed string of 0s and 1s (encodedText) and converts it back into the original text.
It starts with current pointing to the root of our Huffman tree.
It reads the encodedText one bit (‘0’ or ‘1’) at a time. If the bit is ‘0’, it moves current to its left child; if the bit is ‘1’, it moves current to its right child.
if (current.isLeaf()) { ... }: After moving, it checks whether current is now a leaf node. If it is, we’ve successfully traced a complete Huffman code: the character at this leaf (current.ch) is appended to our decoded result, and current is reset back to root to start searching for the next character from the current position in the encodedText.
It returns the fully decoded string.
public Map<Character, String> getHuffmanCodes():
Java
public Map<Character, String> getHuffmanCodes()
{
    return this.huffmanCode; // Simply returns the map of characters to their codes
}
This is a simple “getter” method. It just returns the huffmanCode map that was generated, in case you want to inspect the codes outside of this class.
public String stringToBinary(String text):
Java
public String stringToBinary(String text)
{
    String newString = "";
    StringBuilder sb;
    for (int i = 0; i < text.length(); i++) {
        String myBinary = "";
        int asciVal = Integer.valueOf(text.charAt(i)); // Get ASCII value of character
        // Convert ASCII value to its binary representation
        while (asciVal > 0) {
            myBinary = myBinary + (asciVal % 2 == 1 ? "1" : "0");
            asciVal /= 2;
        }
        // The above loop generates binary in reverse, so reverse it
        sb = new StringBuilder(myBinary).reverse();
        newString += sb + ""; // Append to the result
    }
    return newString;
}
} // end of HuffmanCode class
This method is a bit different from the core Huffman logic. It is a utility function that converts a given string into its standard positional binary representation: each character’s ASCII value written in binary, using only as many bits as needed (7 bits for ordinary letters; note that it does not pad to a fixed 8-bit width).
It iterates through each character of the input text.
int asciVal = Integer.valueOf(text.charAt(i)); gets the numerical ASCII value of the character (e.g., ‘A’ is 65).
The while (asciVal > 0) loop converts this decimal ASCII value into a binary string by repeatedly taking the number modulo 2 (to get the lowest bit) and then dividing by 2.
sb = new StringBuilder(myBinary).reverse(); is needed because the loop builds the binary string in reverse order.
It concatenates these binary representations for all characters.
Important note on stringToBinary: this method is not producing Huffman codes. Huffman codes are variable-length and based on frequency; stringToBinary produces the ordinary per-character binary of the text (essentially unpadded ASCII). It is included for comparison purposes – to show how much space Huffman coding saves over a standard binary representation. In the sample run above, for instance, “Mississippi” takes 21 bits with Huffman coding but 77 bits via stringToBinary.
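A quick way to see that saving yourself is to compare the two lengths directly. Here is a small sketch using the classes above (a hypothetical demo class placed in the same huffman_code package; the exact codes vary with tie-breaking, but the total bit count does not):
Java
package huffman_code;

public class SizeComparison {
    public static void main(String[] args) {
        HuffmanCode huffman = new HuffmanCode();
        String text = "Mississippi";
        huffman.runHuffman(text);
        String encoded = huffman.encode(text);       // variable-length Huffman bits
        String plain = huffman.stringToBinary(text); // unpadded 7-bit binary for these characters
        System.out.println("Huffman bits: " + encoded.length()); // 21 for this input
        System.out.println("Plain bits  : " + plain.length());   // 77 (11 characters x 7 bits each)
    }
}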
So, the main parts of the Huffman algorithm itself are runHuffman (and the methods it calls: buildFrequencyMap, buildTree, buildCodes), encode, and decode. The Node class is the essential data structure it uses.
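Putting it all together, a minimal non-interactive use of these classes might look like this (an illustrative sketch of typical usage, again assuming it lives in the huffman_code package; the printed code table may differ slightly from run to run because of ties in the priority queue):
Java
package huffman_code;

public class Demo {
    public static void main(String[] args) {
        HuffmanCode huffman = new HuffmanCode();
        String text = "ABRACADABRA";
        huffman.runHuffman(text);                      // build frequencies, tree, and codes
        System.out.println(huffman.getHuffmanCodes()); // e.g. {A=0, R=11, B=100, C=1010, D=1011}
        String encoded = huffman.encode(text);         // string of '0'/'1' characters
        String decoded = huffman.decode(encoded);      // walk the tree bit by bit
        System.out.println(encoded);
        System.out.println(decoded);                   // prints ABRACADABRA, a lossless round trip
    }
}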
Hopefully, this breaks down the code in a way that makes sense! The key is to see how these different code pieces work together to implement the steps of the Huffman algorithm we discussed earlier.