Here's a password data set (20 MB) with 2 million entries, from dazzlepond.com. I discovered this Malaysian website while investigating new subscriber email addresses on Analyticbridge (to decide whether they were associated with spam or other malicious activity). The site also claims to have the full list of 450,000 Yahoo email accounts that were recently hijacked. You can indeed download all these email addresses from the site, and possibly check whether the hijacked addresses share patterns that make them vulnerable.
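If you want to try the pattern check on the email addresses, here is a minimal Python sketch. It assumes the addresses sit one per line in a plain-text file; the file name `addresses.txt`, the function name `summarize_addresses`, and the three traits it counts are illustrative assumptions, not something taken from the site or its actual file format.

```python
from collections import Counter

def summarize_addresses(lines):
    """Count domains and a few simple local-part traits for email addresses.

    `lines` is any iterable of strings, one address per item (assumed format).
    """
    domains = Counter()
    traits = Counter()
    for raw in lines:
        addr = raw.strip().lower()
        if addr.count("@") != 1:
            continue  # skip blank or malformed lines
        local, domain = addr.split("@")
        if not local or not domain:
            continue
        domains[domain] += 1
        # Illustrative traits only; pick whatever patterns you want to test.
        if local.isalpha():
            traits["letters_only"] += 1
        if local[-1].isdigit():
            traits["ends_with_digit"] += 1
        if len(local) <= 6:
            traits["short_local_part"] += 1
    return domains, traits

# Hypothetical usage; "addresses.txt" is a placeholder file name.
# with open("addresses.txt", encoding="utf-8", errors="replace") as f:
#     domains, traits = summarize_addresses(f)
#     print(domains.most_common(10))
#     print(traits)
```

Counts like these only become meaningful when compared against a baseline of addresses that were not hijacked, so treat them as a starting point rather than evidence of vulnerability.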
Anyway, I'm sharing the password data set so you can test your data science skills. Try to answer the following questions:
Another data set of interest: official salaries of 30,000 University of Washington employees
Thank you for sharing!
Here's what I found: http://parasdoshi.com/2012/08/14/what-can-a-dataset-of-hacked-passw...