Paper Review: An Administrator's Guide to Internet Password Research

A few weeks ago I had the pleasure of reading An Administrator's Guide to Internet Password Research (Dinei Florencio, Cormac Herley, and Paul C. van Oorschot, Microsoft Research, November 2014) in detail. It took many hours because I wanted to understand everything.

The paper is of high quality, well written, and thoroughly researched. The main takeaway is that attacks on passwords fall into two very distinct categories: online and offline attacks. The demands on password strength are much lower if the password only has to withstand online attacks, because the adversary's guessing rate is relatively low, especially if the administrator employs rate limiting, lockout policies, or CAPTCHAs. The authors estimate 10^6 guesses against a single account over a period of 4 months. However, if the adversary gains access to the password database and the passwords are properly hashed and salted (and maybe peppered; I was very surprised that the word "pepper" never came up in the paper, although the authors do mention a site-wide (global) salt, which is another name for it), then they can launch an offline brute-force attack that, depending on their resources, can reach up to 10^20 guesses over a 4-month period using roughly a thousand modern GPUs. I'd say that 10^20 is quite large. If an adversary is willing and able to run such powerful hardware for 4 months, your account is so valuable that you should have better security measures in place than just hashed and salted passwords.
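To get a feel for these two numbers, here is a small back-of-the-envelope sketch of my own (not from the paper). It converts the guess budgets quoted above into guesses per second. The 120-day length of "4 months" and the function name are my assumptions.

```python
SECONDS_PER_DAY = 86_400
PERIOD_DAYS = 120  # assumption: "4 months" taken as roughly 120 days


def guesses_per_second(total_guesses: float, days: int = PERIOD_DAYS) -> float:
    """Average guessing rate needed to spend `total_guesses` within `days`."""
    return total_guesses / (days * SECONDS_PER_DAY)


online_rate = guesses_per_second(1e6)    # online budget quoted above
offline_rate = guesses_per_second(1e20)  # offline upper bound quoted above
per_gpu_rate = offline_rate / 1000       # spread over roughly a thousand GPUs

print(f"online:  {online_rate:.3f} guesses/s")
print(f"offline: {offline_rate:.2e} guesses/s in total")
print(f"offline: {per_gpu_rate:.2e} guesses/s per GPU")
```

Under these assumptions the online attacker averages about one guess every ten seconds, while the offline attacker needs on the order of 10^10 guesses per second from each GPU. That is a gap of 14 orders of magnitude between the two scenarios.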

The authors tried to describe rainbow tables in two sentences. Needless to say, that was not enough for me to understand them, even though I understand hashing and its properties. So I had to search for and read other resources until I gained a proper understanding of them. I would have preferred it if the paper had not tried to explain the concept at all, rather than introducing the term 'chains' and never using it again.
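For readers who stumble over the same passage, here is a toy sketch of what a 'chain' is, based on my own understanding and not on the paper's text. It assumes unsalted SHA-256, 4-letter lowercase passwords, and a made-up reduction function; all names and parameters are hypothetical and chosen only for illustration.

```python
import hashlib

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
PW_LEN = 4       # toy password space: 26^4 candidates
CHAIN_LEN = 100  # number of hash/reduce steps per chain


def hash_password(pw: str) -> bytes:
    # Unsalted hash; precomputed tables only work against unsalted hashes.
    return hashlib.sha256(pw.encode()).digest()


def reduce_to_password(digest: bytes, step: int) -> str:
    # Map a digest back into the password space. Mixing in the step number
    # gives every chain position its own reduction function.
    n = int.from_bytes(digest, "big") + step
    chars = []
    for _ in range(PW_LEN):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)


def build_chain(start: str) -> str:
    # Walk start -> hash -> reduce -> hash -> ... and return only the endpoint.
    pw = start
    for step in range(CHAIN_LEN):
        pw = reduce_to_password(hash_password(pw), step)
    return pw


def build_table(starts):
    # The table stores just (endpoint -> start) for each chain.
    return {build_chain(s): s for s in starts}


def lookup(table, target: bytes):
    # Guess the chain position of the target hash, walk to the endpoint,
    # and on a table hit replay the chain from its start to verify.
    for pos in range(CHAIN_LEN - 1, -1, -1):
        pw = reduce_to_password(target, pos)
        for step in range(pos + 1, CHAIN_LEN):
            pw = reduce_to_password(hash_password(pw), step)
        if pw in table:
            candidate = table[pw]
            for step in range(pos):
                candidate = reduce_to_password(hash_password(candidate), step)
            if hash_password(candidate) == target:
                return candidate
    return None


# Usage: recover a password that sits ten steps into a stored chain.
table = build_table(["aaaa"])
secret = "aaaa"
for step in range(10):
    secret = reduce_to_password(hash_password(secret), step)
recovered = lookup(table, hash_password(secret))
print(recovered == secret)
```

The point of the chain is the trade-off: the table stores two passwords per chain, yet it covers every password along that chain at the cost of recomputing part of it during lookup.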

In the section about phishing, the paper mentions that the March 2011 breach of RSA Security's SecurID was reportedly a spear-phishing attack. What happened was that the attackers sent e-mails with an attached spreadsheet called "2011 Recruitment plan.xls". The spreadsheet contained a zero-day exploit that installed a backdoor through an Adobe Flash vulnerability (CVE-2011-0609) [0]. I was extremely surprised that this attack is labelled as phishing in the media: at no point was the user tricked into giving up sensitive information or credentials. The user simply opened tabular data attached to an e-mail. The idea that Microsoft Excel embeds Adobe Flash content, which in turn carries a zero-day exploit, is mind-boggling. I have never seen a proper spreadsheet make use of embedded Flash content. This is probably not how phishing usually happens.

There's a typo on page 15: "It can facilitate forms of phishing if users become habituated to entering their email passwords at low-value sites that users email addresses as usernames." (The second "users" should presumably read "use".)

On page 16: "While naive "password strength" measures are widely used, simple to calculate, and have formed the basis for much of the analysis around passwords, simplistic metrics [13] based on Shannon entropy are poor measures of guessing-resistance (recall Section 3.1)."
I think it is worth pointing out more clearly here that Shannon entropy itself is not the problem; it is completely fine and the right tool for the job. The problem is the (completely false and naive) assumption that every possible password in the password space is equally likely to be chosen. Breached databases show this plainly. As is evident from the formula for Shannon entropy, we have to know the probability distribution to calculate the entropy. Without saying so in those words, the authors also used Shannon entropy in the paper, but they relied on data from leaked password databases to estimate the probability distribution and thus the entropy, which is the proper way to do it. That is, they took into account that the password 'password' has a much higher probability than 'FnhDgnWm'. Determining the distribution of passwords is actually a very complex subject, because it has to capture how the human mind makes up and associates things to form a password. Suddenly, things like cultural background play a role. Before Justin Bieber was famous, it is unlikely that anyone used 'justinbieber' as their password.
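The difference between the two assumptions can be illustrated with a few lines of code. This is my own minimal sketch, not the authors' method, and the password counts are made up purely for illustration (they are not taken from any real leak).

```python
import math
from collections import Counter


def shannon_entropy_bits(counts: Counter) -> float:
    """Shannon entropy H = -sum(p * log2(p)) of an empirical distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


# Hypothetical, made-up sample of 100 chosen passwords.
sample = Counter({"password": 50, "123456": 30, "qwerty": 15, "FnhDgnWm": 5})

empirical_bits = shannon_entropy_bits(sample)  # uses the observed frequencies
uniform_bits = math.log2(len(sample))          # assumes all four are equally likely

print(f"empirical distribution: {empirical_bits:.2f} bits")
print(f"uniform assumption:     {uniform_bits:.2f} bits")
```

With this toy sample the empirical entropy comes out at roughly 1.65 bits, compared with 2 bits under the uniform assumption. The formula is the same in both cases; only the probability distribution that goes into it differs.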

All in all, this paper is a must-read for administrators and for anyone who manages a portfolio of passwords, and it is also worth reading as an ordinary user of password-protected services.


