If there’s one thing that really annoys me, it’s when I forget my password at some random website, ask for a password reminder and get my old password sent back to me in clear text in an email. This shows so many levels of ignorance in the people who developed the system that I immediately feel like deleting my profile and never coming back.
I started thinking about this again today when I read Sony’s announcement about the PlayStation Network compromise, and was a bit surprised when they also listed passwords as one of the pieces of information that might have been stolen. Surely Sony can’t be that unprofessional, storing passwords in clear text!?
I’m not sure we’ll ever find out, but anyway, if you ever find yourself developing a system that needs to store passwords, please continue reading.
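So what’s the deal with storing passwords as clear text? Well, first of all, it’s highly insecure: anyone who gets hold of a copy of your database, whether an attacker or a curious employee, can read every user’s password straight away. And there’s generally no need to store a password as clear text.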
Wondering how you’re supposed to check if the user enters the right password if you can’t store it as clear text? That’s when hashing enters the picture.
Simply put, a hash function takes input data and outputs a new value based on this input data. It’s deterministic, so every time you give it the same input data, you get the same output data. Typically the output data is of fixed length, no matter how much or how little input data you give it. A good cryptographic hash function is also one-way: given only the output, there is no practical way to calculate the input. One typical example usage is to test the integrity of a file you downloaded. Say the file was downloaded from a mirror site; you can’t be 100% sure no one has changed this file while it was stored at the mirror site. If you have the hash of the file content, provided that this hash value is given to you by someone you trust (like the source site), you can compare it with the hash value of your local file. If the two match, you can be confident the file hasn’t been changed.
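To make the file example a bit more concrete, here is a minimal sketch in Python using the standard hashlib module. The function names sha1_hex and file_matches are my own, picked for illustration, and SHA-1 is used only because it is the function mentioned in this post.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a 40-character hex string."""
    return hashlib.sha1(data).hexdigest()


def file_matches(path: str, expected_hex: str) -> bool:
    """Hash a file in chunks and compare it with a digest from a trusted source."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in 64 KiB chunks so large downloads don't have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

Calling sha1_hex twice with the same bytes always gives the same 40 characters, and changing a single byte of the input gives a completely different value, which is exactly what makes the comparison useful.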
So how does this relate to passwords? Well, what you should do, instead of storing passwords as clear text, is to hash them using something like SHA-1. Then the next time your user logs in, you rehash the newly entered password and check if it’s equal to the stored hash.
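As a rough sketch, the store-and-compare flow could look something like this in Python. The dictionary stands in for your user table, and register and check_login are hypothetical names of my own. SHA-1 is used because it is the example above; a purpose-built password hashing function from a well-tested library is the safer choice in a real system.

```python
import hashlib
import hmac

# Hypothetical stand-in for a user table: login name -> stored hash.
stored_hashes = {}


def hash_password(password: str) -> str:
    """Hash the password; only this value is ever stored."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def register(login: str, password: str) -> None:
    """Store the hash of the password, never the password itself."""
    stored_hashes[login] = hash_password(password)


def check_login(login: str, password: str) -> bool:
    """Rehash the entered password and compare it with the stored hash."""
    stored = stored_hashes.get(login)
    if stored is None:
        return False
    # compare_digest takes the same time wherever the two strings differ.
    return hmac.compare_digest(stored, hash_password(password))
```

Note that there is nothing here that could send the old password back in an email. A "forgot password" feature has to let the user set a new one instead.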
Put some salt on it!
But, please, don’t stop here. Hashing alone isn’t enough! There’s something called rainbow tables. A rainbow table is basically a huge precomputed table that lets you look up the clear text behind a hash value, for a very large set of possible passwords. As computing power and storage space continue to fall in price, it’s become trivial to gain access to these types of utilities for cracking hashed passwords.
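To see why a plain hash is weak, here is a deliberately simplified stand-in for such a table. A real rainbow table uses a more compact chain structure and covers vastly more candidates, so treat this plain dictionary and its four-word list as assumptions made for illustration only.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of text as a hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical word list; a real attacker would use millions of entries.
COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein"]

# Computed once, then reusable against every stolen unsalted hash.
LOOKUP = {sha1_hex(word): word for word in COMMON_PASSWORDS}


def crack(stolen_hash: str):
    """Return the clear text for a known hash, or None if it is not in the table."""
    return LOOKUP.get(stolen_hash)
```

The expensive part, building the table, happens only once. After that, every user with a password on the list is found with a single lookup.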
So how do you fix that? It’s all about increasing the cost of cracking your users’ passwords. As we can’t trust users to pick a good password, the first thing you should do is to enforce some minimum password standard, say, lower and upper case letters, at least one number, a minimum length, etc.
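A minimum standard like that could be checked with something as small as the following sketch. The function name meets_policy and the default minimum length of 10 are my own assumptions, so pick limits that fit your system.

```python
def meets_policy(password: str, min_length: int = 10) -> bool:
    """Check length plus at least one lower case letter, upper case letter and digit."""
    return (
        len(password) >= min_length
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
    )
```

You would run this when the user picks or changes a password, before anything gets hashed and stored.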
Secondly, you should salt your input. That means that, in addition to the password, you feed something else into the hash function. There are many ways to pick this additional part, but it needs to be something you know the next time you need to hash the same password. It could be a fixed value you store in your code, a configuration file, or a database. The downside of this is that if someone steals your password hashes, they might also get the salt value. Still, this makes the hashes a bit more costly to crack, as the attacker would basically need to create a new rainbow table to look for weak passwords. However, in addition to this salt value, you could further increase the cost of cracking a password by adding something unique to each user’s hashed password. That could be their user ID, login name or email.
Then you would, say, have the following input to your hashing function: userID + email + secret salt value + user password => hashed password, which might look like: 33cde15ec0621256153199ccab601e7d320195bf
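Here is one way that recipe might be written in Python. SECRET_SALT, salted_hash and verify are hypothetical names, and the separator between the parts is my own addition so that, for example, user ID 1 with email "2x" cannot produce the same input as user ID 12 with email "x".

```python
import hashlib
import hmac

# Hypothetical site-wide secret; in practice load it from configuration.
SECRET_SALT = "replace-with-a-long-random-value"


def salted_hash(user_id: int, email: str, password: str) -> str:
    """Hash userID + email + secret salt value + user password."""
    # Join with a separator so the boundaries between parts stay unambiguous.
    material = "\x00".join([str(user_id), email, SECRET_SALT, password])
    return hashlib.sha1(material.encode("utf-8")).hexdigest()


def verify(user_id: int, email: str, password: str, stored_hash: str) -> bool:
    """Rebuild the salted hash from the login attempt and compare."""
    candidate = salted_hash(user_id, email, password)
    return hmac.compare_digest(stored_hash, candidate)
```

Two users with the same password now get different hashes, so one precomputed table no longer works against everyone. One thing to keep in mind with this design: if a user changes their email, the stored hash has to be recalculated, which you can only do while you have their password at hand, such as right after a successful login.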
You could take it further, but doing this is way better than storing passwords in clear text. And it’s trivial to accomplish in all major programming languages.
PS: And to make it clear, as some of the comments point out: Whenever you deal with cryptography and random numbers it’s usually better to rely on ready-made libraries of good quality.