Hashing… First Step to Secure Software Programming

Cartoon by Sergey Gordeev from Gordeev Animation Graphics, Prague.

Where did it all start?

Authentication

Authentication methodologies

1. Username Password authentication

Fig 1 — Traditional Username, Password authentication method
Fig 2 — Db table to store username and password for traditional authentication methodology

2. Password Hashing

Fig 3 — Hashing the string “logIn123” using SHA-256 hash function

Regardless of the input length, the length of the output is fixed by the hash function used (SHA-1, SHA-256, MD2, MD5, etc.). Fig 4 shows that the input strings “a” and “abc” produce different hash values under SHA-256, yet both values have the same length: 64 hexadecimal characters (256 bits). Hashing the same two strings with SHA-1 again gives two different hash values, and both are 40 hexadecimal characters (160 bits) long.

Now compare the string “abc” hashed with SHA-256 and with SHA-1: the two hash values differ, and so do their lengths. In short, different hash functions give digests of different lengths for the same input, while a single hash function gives different digests of the same length for different inputs.

Fig 4 — Fixed length output for different hash functions
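The fixed-length property is easy to check yourself. Here is a minimal sketch using Java's standard `MessageDigest` class; the class name `DigestLengthDemo` and the helper `hexDigest` are names I made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DigestLengthDemo {
    // Hashes the input with the named algorithm and returns the digest as lowercase hex.
    static String hexDigest(String algorithm, String input) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Algorithm not available: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        for (String algorithm : new String[] {"SHA-256", "SHA-1"}) {
            for (String input : new String[] {"a", "abc"}) {
                String hex = hexDigest(algorithm, input);
                // The hex length depends only on the algorithm, not on the input.
                System.out.println(algorithm + "(\"" + input + "\") = " + hex
                        + " [" + hex.length() + " hex characters]");
            }
        }
    }
}
```

Running `main` should print 64 hexadecimal characters for both SHA-256 lines and 40 for both SHA-1 lines, matching Fig 4.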

As a general rule of thumb, the greater the bit length of the hash value, the greater the protection, because the cryptanalysis work factor grows significantly. When two different input values produce the same hash value, it is called a hash collision. Practical collisions have been demonstrated for MD5 and for SHA-1, so the SHA-2 family (such as SHA-256) is generally preferred over the MD series and SHA-1 when stronger protection is needed.
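To compare bit lengths across hash functions, you can ask Java for the digest size directly. This is a small sketch; the class name `DigestBitLengths` and the method `bitLength` are my own illustrative names. I am also assuming that your Java runtime ships MD5 and SHA-512, which common JDKs do.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DigestBitLengths {
    // Returns the digest size of the named algorithm in bits.
    static int bitLength(String algorithm) {
        try {
            // getDigestLength() reports the size in bytes, so multiply by 8.
            return MessageDigest.getInstance(algorithm).getDigestLength() * 8;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Algorithm not available: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        for (String algorithm : new String[] {"MD5", "SHA-1", "SHA-256", "SHA-512"}) {
            System.out.println(algorithm + ": " + bitLength(algorithm) + " bits");
        }
    }
}
```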

Hashing is “one-way”: it is irreversible by design. Because of this property, hashing cannot be used where the original value must be recovered for later use.
E.g.: Hashing credit card numbers is not advisable when the system needs the numbers again. Once a value is hashed, it cannot be converted back to its original form, so the original data is effectively lost.

How does hashing authentication work?

Fig 5 — User authentication flow with hashing

The flow goes as follows:

1. Signup: The user enrolls with the system by providing credentials (username and password).

2. The password is hashed using a hash function (SHA-1 is used here).

3. The hashed password is sent to the database along with the username for storage.

4. Login: Once enrolled, the user provides the same credentials whenever he/she wants to access the system.

5. The entered password is hashed using the same hash function that was used at signup.

6. The hash value generated for the entered password is sent to the server.

7. The server requests the stored hashed password for that username from the database.

8. If the user exists, the stored hashed password is returned to the server.

9. The server compares the two hash values to decide whether the passwords match.

10. If the hashes match, a success response is sent to the client.
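The signup and login steps above can be sketched in a few lines of Java. This is only an illustration under some assumptions: an in-memory `HashMap` stands in for the database, the client and server are folded into one class, and SHA-256 is used instead of the SHA-1 shown in Fig 5. The names `HashedLoginDemo`, `signup` and `login` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

final class HashedLoginDemo {
    // Stand-in for the database table: username -> hashed password.
    private final Map<String, byte[]> users = new HashMap<>();

    private static byte[] hash(String password) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return md.digest(password.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Steps 1-3: hash the password and store it with the username.
    boolean signup(String username, String password) {
        if (users.containsKey(username)) {
            return false; // usernames must be unique
        }
        users.put(username, hash(password));
        return true;
    }

    // Steps 4-10: hash the entered password and compare with the stored hash.
    boolean login(String username, String password) {
        byte[] stored = users.get(username);
        if (stored == null) {
            return false; // no such user
        }
        // MessageDigest.isEqual compares the two digests byte by byte.
        return MessageDigest.isEqual(stored, hash(password));
    }

    public static void main(String[] args) {
        HashedLoginDemo demo = new HashedLoginDemo();
        demo.signup("alice", "logIn123");
        System.out.println("Correct password: " + demo.login("alice", "logIn123"));
        System.out.println("Wrong password:   " + demo.login("alice", "letMeIn"));
    }
}
```

Note that the plain password is never stored; only its digest is kept and compared.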

3. Salted Hashing

Fig 6 — Same hashed value password for two different users in database

This is where “salted hashing” comes into play. Before looking at salted hashing, let’s look at the term “salt”. A salt is simply a randomly generated string. In salted hashing, this random string is appended to the password, and a hash function converts the combined string to a digest. Because each user gets a different salt, two users who choose the same password no longer end up with the same hashed password, which makes authentication more secure. Some also prefer to combine the username with the password and salt before hashing; it all depends on the requirements. The following are some common ways of doing salted hashing (there is still debate about double hashing and wacky hash functions, with some claiming they are not the correct way to implement hashing).

Fig 7 — Common ways of implementing salted hash
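Here is a minimal Java sketch of the simplest variant, hash(password + salt). The class name `SaltedHashDemo`, the 16-byte salt size and the Base64 encoding are my own choices for illustration, not necessarily what Fig 7 shows.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Base64;

final class SaltedHashDemo {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a random 16-byte salt, encoded as text so it is easy to store.
    static String newSalt() {
        byte[] salt = new byte[16];
        RANDOM.nextBytes(salt);
        return Base64.getEncoder().encodeToString(salt);
    }

    // Appends the salt to the password and hashes the combined string with SHA-256.
    static String saltedHash(String password, String salt) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest((password + salt).getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String password = "logIn123"; // both users pick the same password
        String saltA = newSalt();
        String saltB = newSalt();
        System.out.println("User A: " + saltedHash(password, saltA));
        System.out.println("User B: " + saltedHash(password, saltB));
    }
}
```

The two printed digests should differ even though the password is identical, which is exactly the problem in Fig 6 that salting addresses.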

Let’s try out salted hashing with a simple Java application. First, the user registers with the system by providing user details (username and password). Since this is for demonstration purposes, I have also included a choice of hash function. In a real scenario, the developers decide which hash function to use based on the requirements. Of the functions discussed here, SHA-256 is currently the recommended one; for production password storage, deliberately slow algorithms such as PBKDF2, bcrypt, scrypt, or Argon2 are generally advised.

Fig 8 — User signup page

Once the user has registered with the system, the username, the hashed password, the randomly generated salt, and the type of hash function are stored in the database.

Fig 9 — Database table containing username, hashed password, salt and type of hash function

When the user logs in, the corresponding salt, hashed password, and hash function type are retrieved from the database based on the username (the username is unique). The password entered by the user is then combined with the retrieved salt and hashed with the corresponding hash function. If the resulting hash and the hashed password retrieved from the database are the same, the user can enter the system as an authorized user. This is how salted hashing works.
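The registration and login flow just described could look roughly like this in Java. It is a sketch, not the code of the demo application: the in-memory map replaces the database table of Fig 9, and the names `SaltedLoginDemo`, `UserRecord`, `register` and `login` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

final class SaltedLoginDemo {
    // One row of the table: salt, hashed password and hash function type.
    private static final class UserRecord {
        final byte[] salt;
        final byte[] hash;
        final String algorithm;

        UserRecord(byte[] salt, byte[] hash, String algorithm) {
            this.salt = salt;
            this.hash = hash;
            this.algorithm = algorithm;
        }
    }

    private final Map<String, UserRecord> table = new HashMap<>();
    private final SecureRandom random = new SecureRandom();

    // Hashes the password bytes followed by the salt bytes.
    private static byte[] hash(String algorithm, String password, byte[] salt) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            md.update(password.getBytes(StandardCharsets.UTF_8));
            md.update(salt);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown hash function: " + algorithm, e);
        }
    }

    boolean register(String username, String password, String algorithm) {
        if (table.containsKey(username)) {
            return false; // the username is unique
        }
        byte[] salt = new byte[16];
        random.nextBytes(salt);
        table.put(username, new UserRecord(salt, hash(algorithm, password, salt), algorithm));
        return true;
    }

    boolean login(String username, String password) {
        UserRecord record = table.get(username);
        if (record == null) {
            return false;
        }
        // Re-hash the entered password with the stored salt and hash function type.
        byte[] candidate = hash(record.algorithm, password, record.salt);
        return MessageDigest.isEqual(record.hash, candidate);
    }
}
```

For example, after `register("alice", "logIn123", "SHA-256")`, the call `login("alice", "logIn123")` should return `true` and any other password should return `false`.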

The original password still cannot be recovered from the salted hash, and in addition the salt protects users against precomputed dictionary attacks (such as rainbow tables), because an attacker can no longer reuse one table of hashes across all users. So far we have seen hashing used for authentication. Is hashing limited to authentication, or is it used elsewhere? Yes, there are some other significant places where hashing is used.

Uses of hashing in real-world scenarios

1. Ensuring the integrity of messages during communication
Fig 10 — Message sent from A to B

The need for integrity verification of a message during communication:

  1. To verify that the message is from A (verifying the sender).
  2. To verify that the message received by B was not altered by any “man in the middle” and is the same message sent by A.

How does this verification take place in a communication?

Fig 11 — Hashing for integrity verification

In the above scenario, A sends a message along with a hash value H1, which A generated by passing the message through a hash function. When B receives the message and H1, B hashes the received message with the same hash function used by A and gets the hash value H2.

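If H2 == H1,

  • The received message is the same as the original sent by A (the message was not modified).

If not,

  • The received message differs from the one A hashed (the message was modified or corrupted on the way).

This is how message integrity is handled by hashing.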
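The H1/H2 comparison can be sketched in Java as follows. I am assuming SHA-256 as the shared hash function; the class name `IntegrityCheckDemo` and the method `isUnmodified` are illustrative names only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class IntegrityCheckDemo {
    // The hash function that A and B have agreed on (SHA-256 in this sketch).
    static byte[] hash(String message) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return md.digest(message.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // B's side: recompute H2 from the received message and compare it with H1.
    static boolean isUnmodified(String receivedMessage, byte[] h1) {
        byte[] h2 = hash(receivedMessage);
        return MessageDigest.isEqual(h1, h2);
    }

    public static void main(String[] args) {
        String sent = "Pay 100 to B";
        byte[] h1 = hash(sent); // A's side: H1 travels with the message

        System.out.println("Intact message:   " + isUnmodified(sent, h1));
        System.out.println("Tampered message: " + isUnmodified("Pay 900 to B", h1));
    }
}
```

Keep in mind that a plain hash only helps if an attacker cannot replace H1 as well. Verifying the sender usually requires a keyed construction such as an HMAC or a digital signature on top of the hash.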

2. Hashing for indexing in a database

Fig 12 — Hashing for indexing in database

During a search, the input value is hashed using a hash function, and the resulting hash value identifies the bucket to look in. Only that bucket has to be searched instead of the whole table, which speeds up the search.
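A toy version of such a bucket lookup is sketched below. It uses Java's built-in `String.hashCode()`, which is a fast non-cryptographic hash, rather than SHA-256; real database indexes are more elaborate, and the names `HashIndexDemo`, `bucketFor`, `insert` and `contains` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class HashIndexDemo {
    private final List<List<String>> buckets = new ArrayList<>();

    HashIndexDemo(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Maps a key to a bucket number; floorMod keeps the result non-negative.
    int bucketFor(String key) {
        return Math.floorMod(key.hashCode(), buckets.size());
    }

    void insert(String key) {
        buckets.get(bucketFor(key)).add(key);
    }

    // Only the single bucket selected by the hash is scanned.
    boolean contains(String key) {
        return buckets.get(bucketFor(key)).contains(key);
    }

    public static void main(String[] args) {
        HashIndexDemo index = new HashIndexDemo(8);
        index.insert("alice");
        index.insert("bob");
        System.out.println("alice found: " + index.contains("alice"));
        System.out.println("carol found: " + index.contains("carol"));
    }
}
```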

I hope this basic introduction to hashing will be helpful for beginners. Try out these concepts and ideas by implementing some simple programs to gain practical experience as well.

Originally published at saratechnobytes.blogspot.com on August 21, 2018.

Written by

Inquisitive
