In this post, I'll explore the new AWS Machine Learning service.
The problem we are trying to solve is classifying auto accident severity from a set of features. I won't go into further detail about the data set or the classification algorithms here, since the goal of this post is to explore the new AWS Machine Learning service step by step.
In the next blog post, I'll explore another service: Microsoft Azure Machine Learning.
Let's get started by logging into the AWS Console.
Now select the Machine Learning service:
The opening screen comes up. Select "Get Started".
Let's click on Explore model performance to see the details. It looks too good to be true.
Oh, wow! Wait a minute... Something is amiss!
The model has 100% classification accuracy across all three accident severity types?! Something is wrong. For more details on how to read and interpret the matrix above, check out the documentation here.
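For readers who want to check a result like this outside the console, here is a minimal sketch in plain Python of how a confusion matrix and the accuracy figures derived from it are computed. The class labels and the toy predictions are hypothetical and are not the accident data set used in this post; the function names are mine, not part of any AWS API.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

def accuracy(matrix):
    """Share of all rows that fall on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

def per_class_recall(matrix):
    """For each actual class, the share of its rows predicted correctly."""
    recalls = []
    for i, row in enumerate(matrix):
        row_total = sum(row)
        recalls.append(row[i] / row_total if row_total else 0.0)
    return recalls

# Hypothetical labels and toy data, for illustration only.
labels = ["minor", "moderate", "severe"]
actual = ["minor", "minor", "moderate", "severe", "severe", "moderate"]
predicted = ["minor", "moderate", "moderate", "severe", "severe", "moderate"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy(matrix))
print("recall per class:", per_class_recall(matrix))
```

A matrix with every off-diagonal cell at zero gives an accuracy of exactly 1.0, which is the pattern shown in the console. On real data that is rare enough to justify a second look at the inputs.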
It was fun to experiment with the new wave of Machine Learning services. As a data scientist, I still prefer the powerful R language, where I know exactly what I put into the models, can tune them, and understand their outputs. Yes, these GUI-based machine learning services can be easier for novices, but it's not obvious whether they do exactly what one wants or whether they are flexible enough for fine-tuning. Perhaps I need to spend more time with the documentation. These are just first impressions. I'm sure these things will improve over time.
Additionally, it takes what seems like a VERY LONG time to process a relatively small data file. We are talking about 43K rows of data. R can rip through that very quickly, but I waited about 15-20 minutes for the entire sequence to process on AWS Machine Learning.
So, AWS Machine Learning ONLY makes sense as a use case if one has REALLY large-scale data that needs cloud computing infrastructure. Otherwise, it's really slow. It's like using Hadoop to process 1MB of data. Not a good use case. :)
As a professional data scientist, I find this canned service rather limiting; it does not offer the full flexibility of a true data science computing environment. To be fair, I'm sure it will improve.
Well, that's all for now folks.
Next time, I'll explore another Machine Learning service: the new Microsoft Azure Machine Learning.
Originally posted here.
Comment
I felt the same - it's a good start, but I'd like to see ensemble method support as well as a number of other algorithms (decision trees, NB, SVMs, etc.).
However, if you delve into the developer guide you'll find quite a bit of support for feature transformations, including n-gram generation. You can do quite a bit with your own recipes.
I'm sure a lot of updates are coming though - talking to the AWS guys I know, this is only the start of a big play in the ML space.
Most enlightening, thanks Peter.
I agree that for me support for reproducible research is pretty critical. Difficult with point and click tools.
Best wishes, Ron
We welcome you to take a look at the ForecastThis machine learning platform. We're not one of the 'big boys'. The platform is fully agnostic and independent, and it accesses a library of hundreds of algorithms, performing thousands of model tests in minutes.
© 2019 Data Science Central ®