
Video transcript

When we represent information, such as an image, digitally, we must slice it up into tiny chunks. This allows us to send an image as a sequence of color symbols, and these colors can be represented as unique numbers using some code.

Consider the following challenge. Alice and Bob can transmit and receive messages in binary, and they charge their customers one penny per bit to use their system. A regular customer arrives who wants to send a message. Their messages are 1,000 symbols long, and the meaning of the messages is completely unknown. Normally this would be sent with a standard two-bit code, which results in charging for 2,000 bits. However, Alice and Bob have already done some analysis on this customer and determined that the probability of each symbol in the message is different. Can they use these known probabilities to compress the transmission and increase their profits? What's the optimal coding strategy?

David Huffman famously provided the optimal strategy, which he published in 1952. It is based on building a binary tree from the bottom up. To begin, we list all symbols at the bottom; we can call these nodes. Then we find the two least probable nodes, in this case B and C, merge them into one, and add their probabilities together. Then we repeat with the next two least likely nodes, and continue merging until we have a single node at the top. Finally, we label the edges in this tree with 0 or 1, in any order. Now the code for each letter is just the path from the top of the tree down to that letter. For A, it's just one edge, or 1.

This is now known as a Huffman coding, and for examples of the following type you cannot beat it. Go ahead and try. For example, if you shorten the code for D to just 0, then a message 011 could perhaps mean "D, A, A", or maybe just "B". For this to work, you'd need to introduce letter spaces, which cancel out any savings during transmission.

Now, how far does this compress the message compared with the original 2,000 bits? We just need to calculate the average number of bits per letter: multiply the length of each code by its probability of occurrence and add them together, which results in an average length of 1.75 bits per symbol. That means with this Huffman coding we can expect to compress the messages from 2,000 bits down to 1,750.

Claude Shannon was the first to claim that the limit of compression will always be the entropy of the message source. As the entropy, or uncertainty, of our source decreases due to known statistical structure, the ability to compress increases; if the entropy increases due to unpredictability, our ability to compress decreases. If we want to compress beyond entropy, we must necessarily throw away information in our messages.
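The whole procedure above can be sketched in a few lines of Python. The video doesn't state the four symbol probabilities explicitly, so the ones below are an assumption chosen to be consistent with its 1.75-bit average; the 0/1 edge labels are arbitrary, so the particular codewords may differ from the video's even though the code lengths are the same.

```python
import heapq
import math

def huffman_codes(probs):
    """Build a Huffman code for a {symbol: probability} map by repeatedly
    merging the two least probable nodes into one, bottom up."""
    # Heap entries: (probability, tiebreaker, {symbol: code-so-far}).
    # The integer tiebreaker keeps heapq from trying to compare dicts.
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(sorted(probs.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)    # least probable node
        p2, _, right = heapq.heappop(heap)   # second least probable node
        # Merging two nodes prepends one more edge label (0 or 1) to every
        # codeword beneath them; which side gets 0 and which gets 1 is arbitrary.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (p1 + p2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# Probabilities consistent with the video's 1.75-bit average (an assumption):
probs = {"A": 0.5, "B": 0.125, "C": 0.125, "D": 0.25}
codes = huffman_codes(probs)
print(codes)

# Average bits per symbol: sum of (code length x probability of occurrence).
avg_bits = sum(len(codes[s]) * p for s, p in probs.items())
print(avg_bits)    # 1.75, i.e. 1,750 bits for a 1,000-symbol message

# Shannon's limit on compression: the entropy of the source.
entropy = -sum(p * math.log2(p) for p in probs.values())
print(entropy)     # also 1.75 here, since every probability is a power of 1/2
```

Because these assumed probabilities happen to be powers of 1/2, the Huffman code's average length hits the entropy exactly; for other distributions the average length lands somewhere between the entropy and the entropy plus one bit.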