If we stretch the term ‘hash’ thin enough, we might, just might, be able to explain a litttttle bit of relevance between hash algorithms and hashtags, but the bottom line is that they are different things. So, in this post, whenever the term ‘hash’ comes up, it simply means hash, hashing, or hash algorithms. So, what is a hash and why do we need it?
We all know the origin of the word “Eureka”. Archimedes was asked to prove whether the King’s goldsmith had made a crown with the full amount of gold he was given, without skimming any. Because the crown looked like it was made of 100% gold, it was difficult for him to prove its genuineness until the legendary idea came up. After he jumped out of a bathtub shouting “Eureka” and ran down the street (which is a punishable act nowadays according to California Penal Code 314...), he created another golden crown and put each crown in a tub to check whether the density and purity of the two crowns matched, because a different density and purity will yield a different amount of water spill. This is a good example to explain the meaning of hashing and data integrity.
Hash algorithms are widely used to verify the integrity of data. There are many possible threats during data transmission: MITM (man-in-the-middle) attack attempts or data loss caused by unstable internet connections, to name a few. Either way, when I send data, I expect the recipient to receive the same data I sent, and I would love to confirm that somehow. And that’s when hashing kicks in.
A hash value is a loooooong string consisting of random-looking but formulated characters. For instance, the SHA256 hash value for the string “themitigators.com” is a string of 64 hexadecimal characters.
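If you want to see the value for yourself, here is a minimal sketch in Python, assuming the string is encoded as UTF-8 before hashing (the post doesn’t say which encoding was used). The helper name sha256_hex is made up for this example; hashlib is part of Python’s standard library.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as hexadecimal characters.

    Assumption: the text is encoded as UTF-8 before hashing.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


digest = sha256_hex("themitigators.com")
print(digest)       # the hash value, written in hexadecimal
print(len(digest))  # 64 characters, i.e. 256 bits
```

No matter how long or short the input is, the output always has the same length.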
And the beauty of the hash value is that if one bit changes from the original data, the hash value changes completely. As you can see in the following table, if I place a space between ‘the’ and ‘mitigators.com’, or remove the ‘.’ in front of ‘com’, the entire hash value changes, and each hash value bears no visible relation to the others at all. Hashing is not an encryption process; therefore, hashing a file doesn’t mean the data is hidden. The hash value can be displayed in plaintext, but because the algorithm only works one way and you can’t rebuild the original data from the hash value, it’s known as irreversible.
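To try the same experiment yourself, here is a rough sketch in Python that hashes the original string and the two variations described above, then counts how many of the 256 bits ended up different. The helper names sha256_hex and differing_bits are made up for this example, and UTF-8 encoding is an assumption.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """SHA-256 digest of a string (assumed UTF-8) as hexadecimal."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


original = "themitigators.com"
variants = ["the mitigators.com", "themitigatorscom"]

base = sha256_hex(original)
print(f"{original!r:24} {base}")
for variant in variants:
    value = sha256_hex(variant)
    changed = differing_bits(base, value)
    print(f"{variant!r:24} {value}  ({changed} of 256 bits differ)")
```

With a well-designed hash algorithm, you would expect roughly half of the bits to flip even for a tiny change in the input.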
That’s why a lot of freeware is distributed with its own hash value. If you are trying to download, say, Kali Linux from a third party’s web storage for some legitimate reason, then once the download is completed, before you take any further action, you will need to check its hash value, because maybe some malware is embedded in the installation package and waiting for you to initiate the installation process. Thanks to hash algorithms, by simply comparing the file’s hash value with the one from the official website, you will know if even a bit of the installation package (e.g. an ISO file) was compromised.
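Here is one way that check could look in Python. This is only a sketch: it assumes the publisher lists a SHA-256 value in hexadecimal, the function names file_sha256 and matches_published_hash are made up for this example, and a small temporary file stands in for the real ISO so the example can run anywhere.

```python
import hashlib
import hmac
import os
import tempfile


def file_sha256(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so a large ISO never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the file's hash with the value copied from the official site."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(file_sha256(path), expected)


# Stand-in for a downloaded installation package (hypothetical content).
content = b"pretend this is an ISO image"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(content)
    demo_path = tmp.name

# Stand-in for the hash value published on the official website.
published = hashlib.sha256(content).hexdigest()
print(matches_published_hash(demo_path, published))  # True: file is intact

# Simulate tampering by appending a single byte to the file.
with open(demo_path, "ab") as f:
    f.write(b"!")
print(matches_published_hash(demo_path, published))  # False: file changed

os.remove(demo_path)
```

For a real download, you would pass the path of the ISO file and paste in the hash value shown on the official website.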
Let’s review what we just read by going back to the Archimedes story above. The crown that the goldsmith was supposed to deliver (“data transmission”) without skimming is the “original data”. The actual crown the goldsmith delivered after he skimmed is the “altered, tampered data”, and when Archimedes sank each crown into the tub, the amount of water spill each crown yielded is the “hash value”. Since those two crowns had two different amounts of water spill (“two different hash values”), Archimedes (“System”) and the King (“Recipient”) now know the first crown (“original data”) has been compromised. The goldsmith’s greed? Maybe it was a “Man-in-the-middle attack” or “Unstable connections”.
Hope this makes sense to you.