Raise your hand if your company is making more than 15!
1. Day-dreaming that analytics is a plug & play magic wand that will bring very short term ROI. Well…
Now published. Enterprise AI: An applications perspective takes a use case driven approach to understand the deployment of AI in the Enterprise. Designed for strategists and developers, the book provides a practical and straightforward roadmap based on application use cases for AI in Enterprises. The authors (Ajit Jaokar and Cheuk Ting Ho) are data scientists and AI researchers who have deployed AI applications for Enterprise domains. The book is used as a reference for Ajit and Cheuk's new…Continue
The K-means algorithm is a popular and efficient approach for clustering and classification of data. My first introduction to the K-means algorithm was when I was conducting research on image compression. In this application, the purpose of clustering was to provide the ability to represent a group of objects or vectors by only one object/vector with an acceptable loss of information. More specifically, a clustering process in which the centroid of the cluster was optimum for the cluster and the…Continue
Added by Faramarz Azadegan on October 31, 2018 at 7:06am — No Comments
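To make the idea in the K-means excerpt above concrete, here is a minimal sketch of representing a group of vectors by a single centroid, as in image compression. It is not the author's implementation: it uses plain Lloyd iterations in standard-library Python, and the function names (`sqdist`, `assign`, `kmeans`, `quantize`) and the sample pixel values are assumptions chosen for illustration.

```python
import random


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Return, for every point, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sqdist(p, centroids[j]))
            for p in points]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd iterations; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k input vectors chosen at random
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assign(points, centroids)


def quantize(points, centroids, labels):
    """Replace each vector by the centroid of its cluster (lossy)."""
    return [centroids[lab] for lab in labels]


# Hypothetical RGB pixels: three dark and three bright ones.
pixels = [(10.0, 12.0, 11.0), (12.0, 10.0, 13.0), (11.0, 11.0, 10.0),
          (240.0, 238.0, 241.0), (242.0, 240.0, 239.0), (239.0, 241.0, 240.0)]
codebook, labels = kmeans(pixels, k=2)
compressed = quantize(pixels, codebook, labels)
```

After this runs, only the two codebook entries and one small label per pixel need to be stored, which is where the compression comes from; the loss of information is the distance between each pixel and its centroid.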
Systems integration is an increasingly common process whose value more and more companies are realising. The process involves taking disparate systems and making them all work together as a whole.
Suppose, for instance, that your company has a sales arm. All the sales contact data will lie within a CRM, let’s say Salesforce, since it’s the major player in the CRM game. What if you wanted to grab all…Continue
Added by Graham Church on October 31, 2018 at 4:36am — No Comments
An article by A. H. Abdulrahman, J. M. Luna, M. A. Vallejo and S. Ventura with the title "Evaluation and comparison of open source software suites for data mining and knowledge discovery" (published by Wiley in "Data Mining and Knowledge Discovery", Vol. 7, Issue 3, 2017; see this link) provides the research community with an extensive study on different features included in any data mining tool. The final score for…Continue
Added by jwork.ORG on October 30, 2018 at 3:18pm — No Comments
This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, Python, R, TensorFlow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more. To keep receiving these articles, sign up on…Continue
For those who follow the stock market, October's been a pretty rough month, with overall market levels, as measured by major indexes such as the Russell 3000 and the Wilshire 5000, now down into correction territory of 10 percent declines. The falls, unfortunately, closely follow a …Continue
Added by steve miller on October 30, 2018 at 8:35am — No Comments
Summary: Digital Decisioning Platforms is a new segment identified by Forrester that marries Business Process Automation, Business Rules Management, and Advanced Analytics. For platform developers it’s a new way to slice the market. For users it eases integration of predictive models into the production environment.
Added by William Vorhies on October 30, 2018 at 8:30am — No Comments
Added by Packt Publishing on October 30, 2018 at 1:30am — No Comments
What do you do before purchasing something that costs more than a pack of gum? Whether you want to treat yourself to new sneakers, a laptop, or an overseas tour, placing an order without checking out similar products or offers and reading reviews doesn’t make much sense anymore. Thanks to comment sections on eCommerce sites, social networks, review platforms, or dedicated forums, you can learn a ton about a product or service and evaluate whether it’s good value for money. Other customers,…Continue
Added by Kateryna Lytvynova on October 30, 2018 at 12:45am — No Comments
Here are a few off-the-beaten-path problems at the intersection of computer science (algorithms), probability, statistical science, set theory, and number theory. While they can easily be understood by beginners, finding a full solution to some of them is not easy, and some of the simple but deep questions below won't be answered for a long time, if ever, even by the best mathematicians living today. In some sense, this is the opposite of classroom exercises, as there is no sure path that…Continue
There is an important distinction in data mining: first, mining the data to find patterns and build models; second, using the results of that mining. The results, in turn, inform the data mining process itself.
Cross-industry standard process…Continue
It’s no secret that container technology (popularly known as Docker) is becoming one of the top choices in software projects, but what about data projects and clusters? Many companies and projects intend to take advantage of it. Some examples are Cloudera and the apache-spark-on-k8s project. Personally, if you want more information on what exactly is called “Big Data as a Service”, I suggest you check the last Strata Data…Continue
Added by Antonio Cachuan on October 28, 2018 at 4:59pm — No Comments
Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.
Featured Resources and Technical Contributions…Continue
Added by Vincent Granville on October 28, 2018 at 9:00am — No Comments
Something that has always troubled me with statistics is the pretense of certainty. The conclusions – being closely associated with calculations – tend to be reached rapidly. I might only be starting to give a problem some thought – although a statistician has already drawn conclusions. Over time, this can make a person feel insecure about his intellectual capacity – and perhaps cause him to write a blog on the subject. Consider the simulated data below: a special program was…Continue
Added by Don Philip Faithful on October 28, 2018 at 8:05am — No Comments
Added by Pedro URIA RECIO on October 27, 2018 at 8:01pm — No Comments
You will find here a few tables of random digits, used for simulation purposes and/or testing or integration in statistical, mathematical, and machine learning algorithms. These tables are particularly useful if you want to share your algorithms or simulations, and make them replicable. We also provide techniques to use in applications where secrecy is critical, such as cryptography, bitcoin or lotteries: in this case, you don't want to share your table of random numbers; to the contrary you…Continue
Added by Vincent Granville on October 27, 2018 at 9:00am — No Comments
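As a rough illustration of the two uses mentioned in the random-digits excerpt above, the sketch below builds a replicable table of digits from a fixed seed and, separately, draws digits that should not be shared. This is not the technique from the article (which is not shown in the excerpt); it simply uses Python's standard `random` and `secrets` modules, and the function names `digit_table` and `secret_digits` are assumptions for illustration.

```python
import random
import secrets


def digit_table(seed, rows, cols):
    """Replicable table of decimal digits: the same seed gives the same table
    (assuming the same Python version and generator)."""
    rng = random.Random(seed)
    return [[rng.randrange(10) for _ in range(cols)] for _ in range(rows)]


def secret_digits(n):
    """Digits drawn from the operating system's secure source; not replicable by design."""
    return [secrets.randbelow(10) for _ in range(n)]


# Anyone who knows the seed can regenerate this table to replicate a simulation.
shared = digit_table(seed=2018, rows=4, cols=10)
for row in shared:
    print("".join(str(d) for d in row))

# For lottery- or key-like uses, there is no seed to publish.
private = secret_digits(10)
```

Publishing the seed alongside an algorithm is enough for others to rebuild the table; for the secrecy case the opposite holds, so the seeded generator should not be used there.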
This article was written by Tirthajyoti Sarkar. Below is a summary. The full article (accessible from link at the bottom) also features courses that you could attend to learn the topics listed below, as well as numerous comments. We also added a few topics that we think are important and missing in the original article.…Continue
Added by Andrea Manero-Bastin on October 26, 2018 at 5:00pm — No Comments
With the nascent stage of the data revolution past us, organisations are entering a new level of proficiency in handling data expertly. Gone are the days when organisations…Continue
Added by Ronald van Loon on October 26, 2018 at 3:35am — No Comments
Added by Benjamin Waxer on October 26, 2018 at 2:00am — No Comments
Below is my contrarian answer to one question recently posted on Quora.
It depends on what you mean by “no experience”. A NASA scientist who has processed petabytes of data and found great insights, for example by discovering exoplanets, is de facto a data scientist and may have no interest in having his job title changed.
Then there is a bunch of people who call themselves “data science enthusiasts” and know nothing other than what they learned in a two-hour…Continue