Posted by Jianhua Li on GitHub. The project was proposed on Data Science Central as a challenge to test your data science skills on a real data set. Below is an overview.
The goal is to answer the following three questions:
Data is available here
Step 1. Load Packages
Step 2. Load Data
Server: nginx
Date: Tue, 22 Nov 2016 00:39:04 GMT
Content-Type: text/plain; charset=utf-8
Last-Modified: Sun, 27 Mar 2016 05:04:06 GMT
Expires: Tue, 29 Nov 2016 00:39:04 GMT
# This is a list of 2,151,220 unique ASCII passwords in sorted order according
# to their native byte values using UNIX sort command.
# This list (also known as wordlist, password dictionary or password list)
# is useful for password recovery tools such as John the Ripper, oclHashcat
# and Aircrack-ng. To use this file, be sure to first remove these comment
# lines, i.e. the lines starting with # character.
# If you are looking for a better password dictionary,
# see http://dazzlepod.com/uniqpass/
# $DateTime: 2016/03/27 16:04:06 $
# Comments/Questions? Send to [email protected]
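The file header above asks readers to remove the lines starting with # before using the list. A minimal Python sketch of that step follows; it is not the original author's code, the function name load_passwords is hypothetical, and the sample entries are made up for illustration rather than taken from the data set.

```python
from typing import Iterable, List


def load_passwords(lines: Iterable[str]) -> List[str]:
    """Return password entries, skipping comment lines and blank lines."""
    passwords = []
    for line in lines:
        # Remove only the line terminator so spaces inside a password survive.
        entry = line.rstrip("\r\n")
        # Follow the file's own instruction: lines starting with '#' are comments.
        if not entry or entry.startswith("#"):
            continue
        passwords.append(entry)
    return passwords


# Illustrative input only: two header-style comment lines plus made-up entries.
sample = [
    "# This is a list of unique ASCII passwords\n",
    "# $DateTime: 2016/03/27 16:04:06 $\n",
    "123456\n",
    "password\n",
    "\n",
    "qwerty\n",
]
print(load_passwords(sample))
```

The same function accepts an open file handle, since iterating over a text file yields lines. One caveat of this simple rule: any real password that itself begins with # would also be dropped, so a stricter variant might skip only the leading comment block.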
What you will find in this article, besides the first two steps:
Step 3. Explore the Data
Step 4. Preprocess the Data
Step 5. Analyze the Data
Step 6. Classification
The picture below is from the original (long) article.
To read the original article with source code, analysis and conclusions, click here.