
Fraud Detection Using Deep Learning ML Techniques at PayPal

This topic combines two of the most difficult and least understood ML techniques and challenges. Fraud detection invariably falls short of fully automatic detection because of the false positive rate and the need for at least some human intervention, typically on a case-by-case basis. Deep Learning is one of the most far-flung frontiers of ML research, using neural net architectures, often with unsupervised model development.
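To see why the false positive rate keeps humans in the loop, here is a minimal sketch of the base-rate arithmetic. The fraud rate, recall, and false positive rate used below are hypothetical illustration values, not figures from PayPal or from the video.

```python
def alert_precision(fraud_rate: float, recall: float, false_positive_rate: float) -> float:
    """Fraction of raised alerts that are actually fraud.

    fraud_rate: share of all transactions that are fraudulent.
    recall: share of fraudulent transactions the model flags.
    false_positive_rate: share of legitimate transactions the model flags.
    """
    true_alerts = fraud_rate * recall
    false_alerts = (1.0 - fraud_rate) * false_positive_rate
    total_alerts = true_alerts + false_alerts
    if total_alerts == 0:
        return 0.0  # the model never raises an alert
    return true_alerts / total_alerts


# Hypothetical numbers: 0.1% of transactions are fraud, the model catches
# 90% of them and wrongly flags 1% of legitimate transactions.
precision = alert_precision(fraud_rate=0.001, recall=0.9, false_positive_rate=0.01)
print(f"Share of alerts that are real fraud: {precision:.1%}")
```

Under these assumed numbers only about 8% of alerts are real fraud, so most flagged transactions would still need a person to review them.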

Our friends at H2O University referred me to this very interesting 15-minute video by Venkatesh Ramanathan, who shows a real-world, very big data application of Deep Learning to fraud detection at PayPal.

See it here:

Fraud Detection with Deep Learning at Paypal 


Tags: deep learning, fraud detection



Comment by Vasily Nekrasov on November 2, 2015 at 1:24pm

I worked in fraud management at a company similar to PayPal when I was a student.

And I am quite skeptical about what this guy tells us. Indeed, he discloses very little info (OK, I can understand why), but look at 8:42: he states he has 1500 features. That is highly implausible, but even if it is true, how many of them are really significant?! For a transaction you have IP, timestamp, referral, amount, frequency of transactions... probably a dozen other important parameters... but 1500...
Secondly, he does not compare his model with any simple, tractable setup like logistic regression or an SVM. The company I worked for did have complicated models but refrained from using them, because the performance improvement was marginal compared to the communication barriers (imagine you need to explain the basics of deep learning to a call center operator, who in turn needs to explain to an angry customer why his account was blocked on suspicion of fraud).
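
The explainability point can be illustrated with a minimal sketch: in a logistic regression, each feature's contribution to the log-odds is just weight times value, so the top contributions can be read out as plain-language reasons. The feature names, weights, and bias below are hypothetical, chosen only for illustration, and are not taken from any real fraud model.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical coefficients of an already-fitted logistic regression.
WEIGHTS: Dict[str, float] = {
    "amount_zscore": 0.8,  # how unusual the amount is for this account
    "new_ip": 1.5,         # 1.0 if the IP address was never seen before
    "tx_last_hour": 0.4,   # number of transactions in the last hour
}
BIAS = -4.0


def fraud_probability(features: Dict[str, float]) -> float:
    """Logistic regression score: sigmoid of the weighted feature sum."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def top_reasons(features: Dict[str, float], k: int = 2) -> List[Tuple[str, float]]:
    """Return the k features that push the log-odds toward fraud the most."""
    contributions = [(name, WEIGHTS[name] * features[name]) for name in WEIGHTS]
    contributions.sort(key=lambda item: item[1], reverse=True)
    return contributions[:k]


tx = {"amount_zscore": 3.0, "new_ip": 1.0, "tx_last_hour": 5.0}
print(f"fraud probability: {fraud_probability(tx):.2f}")
for name, contribution in top_reasons(tx):
    print(f"{name}: +{contribution:.1f} log-odds")
```

A deep network offers no such direct per-feature readout, which is the kind of communication barrier described above.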

Comment by Sione Palu on November 1, 2015 at 12:47pm

Fraud detection is a big challenge in modern automated data analytics. My preferred solution is to deploy an ensemble of various models, then aggregate their majority votes into a single binary decision (fraud/no-fraud). Even a state-of-the-art random forest or convolutional deep neural network on its own can still produce decisions that under-perform those of an ensemble. A lot of new work has come out in recent years on dynamic matrix decomposition for outlier/anomaly detection in an unsupervised setting, so IMO it makes sense to use multiple models rather than a single one, even when that single model is superior in a one-to-one comparison with each of the models that make up the ensemble.
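
The majority-vote aggregation can be sketched as follows. The three "models" here are hypothetical stand-in rules with made-up thresholds. In practice each would be a trained classifier, but the voting logic is the same.

```python
from typing import Callable, Dict, Sequence

Transaction = Dict[str, float]
Model = Callable[[Transaction], int]  # returns 1 for fraud, 0 for no-fraud


def majority_vote(models: Sequence[Model], tx: Transaction) -> int:
    """Aggregate binary votes; a strict majority is needed to flag fraud."""
    votes = [model(tx) for model in models]
    return 1 if 2 * sum(votes) > len(votes) else 0


# Hypothetical stand-ins for independently trained models.
def amount_model(tx: Transaction) -> int:
    return 1 if tx["amount"] > 1000 else 0


def velocity_model(tx: Transaction) -> int:
    return 1 if tx["tx_last_hour"] > 5 else 0


def account_age_model(tx: Transaction) -> int:
    return 1 if tx["account_age_days"] < 7 else 0


MODELS = [amount_model, velocity_model, account_age_model]

suspicious = {"amount": 2500.0, "tx_last_hour": 8.0, "account_age_days": 400.0}
ordinary = {"amount": 20.0, "tx_last_hour": 1.0, "account_age_days": 2.0}
print(majority_vote(MODELS, suspicious))  # two of three models vote fraud
print(majority_vote(MODELS, ordinary))    # only one model votes fraud
```

Using an odd number of models avoids ties. With an even number, this sketch resolves a tie to no-fraud, which is a design choice rather than a requirement.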

© 2019 Data Science Central ®
