Security consultant Mark Burnett has just published 10 million passwords along with their corresponding usernames. It’s a thoughtful offering to other researchers, but a risky move given the current legal climate surrounding hacking.
Usually, passwords are released to researchers on their own, which prevents them from analysing how usernames and passwords go together. Burnett explains that he has for quite some time “wanted to provide a clean set of data to share with the world”, and that he is releasing the two together because doing so gives “great insight into user behaviour and is valuable for furthering password security.”
So, he has. Taking a random sample of passwords from dumps already dotted around elsewhere on the internet — on sites like haveibeenpwned and pwnedlist — he’s combined them into a single, handy package.
But he’s done so with some trepidation and much justification (and, before you panic too much, he believes most of them are now dead). “I think this is completely absurd that I have to write an entire article justifying the release of this data out of fear of prosecution or legal harassment,” he writes on his blog. “I had wanted to write an article about the data itself but I will have to do that later because I had to write this lame thing trying to convince the FBI not to raid me.”
“Although researchers typically only release passwords, I am releasing usernames with the passwords. Analysis of usernames with passwords is an area that has been greatly neglected and can provide as much insight as studying passwords alone. Most researchers are afraid to publish usernames and passwords together because combined they become an authentication feature. If simply linking to already released authentication features in a private IRC channel was considered trafficking, surely the FBI would consider releasing the actual data to the public a crime…
“In the case of me releasing usernames and passwords, the intent here is certainly not to defraud, facilitate unauthorised access to a computer system, steal the identity of others, to aid any crime or to harm any individual or entity. The sole intent is to further research with the goal of making authentication more secure and therefore protect from fraud and unauthorised access…
“Furthermore, I believe these are primarily dead passwords, which cannot be defined as authentication features because dead passwords will not allow you to authenticate. The likelihood of any authentication information included still being valid is low and therefore this data is largely useless for illegal purposes…
“Ultimately, to the best of my knowledge these passwords are no longer valid and I have taken extraordinary measures to make this data ineffective in targeting particular users or organisations. This data is extremely valuable for academic and research purposes and for furthering authentication security and this is why I have released it to the public domain.”
With all that in mind, Burnett has taken a random sample of 10 million passwords, gathered from “thousands of dumps consisting of upwards to a billion passwords.” If your password isn’t on the list, that doesn’t mean it’s not floating around on the internet somewhere; merely that it’s not on this list. We’re not linking to the download, but it shouldn’t be too hard for you to work out where to look… [Mark Burnett]