As the world grows more tech savvy, advances in information technology, especially in the healthcare industry, have opened up new areas in data mining and machine learning. Within data mining, one technique that has gained a lot of popularity, as well as skepticism, among auditors and fraud detectives is Benford’s Law, or “The Law of First Digit.”
In the past some researchers in Canada used the Benford’s Law distribution to detect anomalies within the claims…Continue
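For readers unfamiliar with it, Benford’s Law predicts that the leading digit d of many naturally occurring numbers appears with probability log10(1 + 1/d). The following is a minimal Python sketch of comparing observed leading digits against that distribution. The function names and the sample claim amounts are invented for illustration and are not taken from the Canadian study mentioned above.

```python
import math
from collections import Counter

def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """Leading non-zero digit of a non-zero number."""
    # Scientific notation puts the leading digit first, e.g. '4.2...e-03'.
    return int(f"{abs(x):.15e}"[0])

def observed_frequencies(values):
    """Observed share of each leading digit, ignoring zeros."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}

# Hypothetical claim amounts, invented for this example.
claims = [123.40, 1500.00, 0.19, 245.00, 9.75]
expected = benford_expected()
observed = observed_frequencies(claims)
for d in range(1, 10):
    print(d, round(expected[d], 3), round(observed[d], 3))
```

A large gap between the observed and expected shares is usually treated as a reason to look more closely at the data, not as proof of fraud on its own.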
If you haven’t started using artificial intelligence in your business, you’re falling behind the curve. Many business owners today are leveraging AI, whether they are aware of it or not. This is done through everyday business software suites that integrate machine learning and automation to carry out such functions as email communications, voice recognition and response, and predictive analysis.
The extent to which businesses employ AI solutions needs to be increased if they are to…Continue
Added by Derek Iwasiuk on January 2, 2017 at 2:00am — No Comments
The Best Subset Regression method can be used to create a best-fitting regression model. This model-building technique helps to identify which predictor (independent) variables should be included in a multiple linear regression (MLR) model.
This method consists of examining all of the models created from every possible combination of predictor variables. This technique uses the R-squared value to check for the best model. Considering the level of complexity involved in creating…Continue
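The exhaustive search described above can be sketched in Python as follows. This is an illustration, not the post's own code: it assumes NumPy, ordinary least squares with an intercept, and synthetic data invented for the example. Because R-squared never decreases when a predictor is added, the sketch reports the best subset for each subset size and does not pick a single overall winner.

```python
from itertools import combinations

import numpy as np

def r_squared(X, y):
    """R-squared of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def best_subsets(X, y, names):
    """For each subset size, return the predictor subset with the highest R-squared."""
    n_predictors = X.shape[1]
    best = {}
    for k in range(1, n_predictors + 1):
        scored = [
            (r_squared(X[:, list(cols)], y), cols)
            for cols in combinations(range(n_predictors), k)
        ]
        r2, cols = max(scored)
        best[k] = (tuple(names[c] for c in cols), r2)
    return best

# Synthetic data (made up for illustration): y depends on x1 and x3 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - 3.0 * X[:, 2] + rng.normal(scale=0.1, size=200)

result = best_subsets(X, y, ["x1", "x2", "x3", "x4"])
for k, (subset, r2) in result.items():
    print(k, subset, round(r2, 4))
```

With p predictors the search fits 2^p - 1 models, so it is only practical for a modest number of candidate variables. Comparing models of different sizes needs a criterion that penalizes extra predictors, such as adjusted R-squared, which is not shown here.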
I had some magical moments in my life. Perhaps the most magical was the summer I went to work wearing shorts and running shoes. Captain C. said to me, as I sat in our shared office, “If you want to work during the weekend, you can borrow the keys to the building.” It didn’t seem like a big deal at the time. I had already borrowed a military vehicle that, as per Captain C., I could park anywhere in Canada without receiving a parking ticket. So I borrowed the keys to the building. I went…
This post covers the following tasks using R programming:
There’s a lot of buzz around the term “Sentiment Analysis” and the various ways of doing it. Great! So you report, with reasonable accuracy, what the sentiment about a particular brand or product is.
After publishing this report, your client comes back to you and…Continue
Added by Vivek Kalyanarangan on November 4, 2016 at 5:00am — No Comments
This is an article which attempts to detect dependent variables with a non-linear method.
I'm going to apply a method for checking variable dependency which was introduced in my previous post. Because the "dependency" I get with this rule is not true dependency as defined in probability, I will call variables practically dependent at a confidence level…Continue
Added by Maiia Bakhova on November 2, 2016 at 11:30am — No Comments
Imagine I show you a book review, on amazon.com, say. Imagine I hide the number of stars; all you get to see is the review text. And now I’m asking you, that review, is it good or bad?…Continue
The roles that data scientists may hold are very diverse, but no matter what niche a data scientist fills, specializing in certain technical areas of software development is extremely important. The following skills are some of the most important that a data scientist will need in order to perform software development properly.
Before anything else, a data…
Added by Jennifer Livingston on August 29, 2016 at 8:00pm — No Comments
This is the most comprehensive guide to Ratio Analysis / Financial Statement Analysis.
This expert-written guide goes beyond the usual gibberish and explores practical Financial Statement Analysis as used by investment bankers and equity research analysts.
Table of Contents:…Continue
Added by rajesh dhnashire on August 3, 2016 at 12:30am — No Comments
More and more organizations today are moving to unified communications (UC) platforms for better communications within their organization, with their customers and with their partners. These platforms combine voice, email, chat and web into a seamless omni-channel experience for their users. Today they boast a number of features, but most of them provide either static or rule-based experiences. Given that these platforms generate tons of data, can this data be used to improve user…Continue
Added by Kumaran Ponnambalam on March 30, 2016 at 4:30pm — No Comments
How many times have you heard managers and colleagues complain about the quality of the data in a particular report, system or database? People often describe poor quality data as unreliable or not trustworthy. Defining exactly what high or low quality data is, why it is a certain quality level and how to manage and improve it is often a trickier…Continue
Added by Zygimantas Jacikevicius on February 2, 2016 at 2:00am — No Comments
By Matt Holzapfel, procurement/sourcing product lead at Tamr
Sourcing managers often pride themselves on their deep understanding of their suppliers, but as supplier count grows, it becomes almost impossible for managers to stay up-to-date on the health and activities of all their suppliers. Compounding the problem is the constantly growing amount of information being pushed to managers. A new approach is needed that allows managers to focus their energy on…Continue
Added by Jason Bailey on January 18, 2016 at 2:30pm — No Comments
By Matt Holzapfel, procurement/sourcing product lead at Tamr
The business case for almost every merger and acquisition includes an assumption of significant cost savings. Unfortunately, achieving these cost savings is often harder than anticipated, which is one reason why 70-90% of mergers and acquisitions fail.
These numbers highlight the challenge of merging the…Continue
Added by Jason Bailey on January 18, 2016 at 2:29pm — No Comments
If you want to get started quickly with data analysis, here is my advice on free software programs that I use every day for data analysis, statistics and data mining.
R-package - a software package for statistical computing written in C. Script oriented.
Added by jwork.ORG on December 23, 2015 at 3:30pm — No Comments
Which database is best? The answer, obviously, depends on what you want to use it for.
I, like most analysts, want to use a database to warehouse, process, and manipulate data—and there’s no shortage of thoughtful commentary outlining the types of databases I should prefer. But these evaluations, which typically discuss databases in terms of …Continue
Added by Benn Stancil on December 9, 2015 at 9:00am — No Comments
Learning any new skill is hard. There are too many possibilities, and the goal seems massive and intimidating.
Enter the Pareto Principle.
The Pareto Principle, also known as the 80/20 rule, suggests that 80 percent of results come from 20 percent of efforts. It can be applied to everything from business to language, even learning how to use R.
With just a…Continue
Added by Divya Parmar on December 3, 2015 at 8:42am — No Comments
Life scientists collect similar types of data on a daily basis. Statistical analysis of this data is often performed using SAS programming techniques. Programming for each dataset is a time-consuming job. The objective of this paper is to show how SAS programs are created for systematic analysis of raw data to develop a linear regression model for prediction. It then shows how PROC SQL can be used to replace several data steps in the code. Finally, it shows how SAS macros are created on these…Continue
Added by Venu Perla PhD on October 10, 2015 at 9:00am — No Comments
Most data scientists and statisticians agree that predictive modeling is both art and science, yet relatively little air time is given to describing the art. This post describes one piece of the art of modeling called feature engineering, which expands the number of variables you have to build a model. I offer six ways to implement feature engineering and provide…Continue
I have written about R in the past, and it is one of the hottest tools for data analysis today. To further demonstrate the power of R, I found click-through rate data…Continue