In digital analytics, Internet traffic scoring is used to detect click fraud and to find the types of search keywords that convert well (that is, lead to a sale). For large ad networks, conversion data is often poor or hard to work with: some clicks have a 0.2% conversion rate and others 30%, depending on the type of website, price, product, conversion type and other factors (even the hour of the day has an impact).
One way to create a generic scoring system that predicts whether a click is genuine is to rely on IP flags rather than conversion metrics. By IP flags, I mean IP blacklists such as Spamhaus, Barracuda or Adometry user IP and referral (web domain) blacklists, with various reason codes indicating why the IPs in question are blacklisted.
Since these third-party blacklists are themselves the result of scoring systems used by the vendors in question (Spamhaus, etc.), our generic score would be a score based on third-party scores, that is, a meta-score blending multiple scores - even blending in scores that predict conversions, if possible.
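To make the meta-score idea concrete, here is a minimal sketch, assuming each blacklist reduces to a yes/no flag per IP and that the blend is a simple weighted average. The weights, the dictionary keys and the function name `meta_score` are hypothetical, chosen for illustration only; a real system would presumably fit the weights to data and use the vendors' reason codes as well.

```python
from typing import Dict

# Hypothetical trust weights per third-party list (illustrative assumptions,
# not values published by any vendor).
WEIGHTS: Dict[str, float] = {"spamhaus": 0.5, "barracuda": 0.3, "adometry": 0.2}


def meta_score(flags: Dict[str, bool], weights: Dict[str, float] = WEIGHTS) -> float:
    """Return a score in [0, 1]: the weighted share of lists that flag the IP.

    A list missing from `flags` is treated as "not flagged".
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Add up the weight of every list on which this IP appears.
    flagged = sum(w for name, w in weights.items() if flags.get(name, False))
    return flagged / total
```

With these made-up weights, an IP listed by Spamhaus and Adometry but not by Barracuda gets a score of about 0.7, and an IP on no list gets 0. A score that predicts conversions could be blended in the same way, as one more weighted input.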
In practice, I've found that
Has such a strategy been used in other industries, such as finance or marketing? Of course, the challenge is to identify data buckets with either a very high or a very low concentration of blacklisted IP addresses, using decision trees, feature selection and Internet topology mappings, and to attach confidence intervals to the scores.
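As a rough sketch of the last step, the code below computes, for each data bucket, the proportion of blacklisted clicks together with a Wilson score confidence interval. It assumes the buckets have already been defined (for instance by a decision tree or by hand) and that each click carries a single blacklisted / not blacklisted flag. The Wilson interval is my choice here, not something prescribed above, and the names `wilson_interval` and `bucket_rates` are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def wilson_interval(hits: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion hits/n (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def bucket_rates(
    clicks: Iterable[Tuple[Hashable, bool]]
) -> Dict[Hashable, Tuple[int, int, float, float]]:
    """Map each bucket to (blacklisted count, total clicks, CI low, CI high)."""
    counts = defaultdict(lambda: [0, 0])
    for bucket, is_blacklisted in clicks:
        counts[bucket][0] += int(is_blacklisted)
        counts[bucket][1] += 1
    return {b: (h, n, *wilson_interval(h, n)) for b, (h, n) in counts.items()}
```

A bucket is only interesting if its whole interval sits well above or well below the overall blacklist rate. A small bucket with a 100% blacklisted rate but a very wide interval tells you little.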