Subscribe to DSC Newsletter

‘Rogue Algorithms’ and the Dark Side of Big Data

Most of us, unless we’re insurance actuaries or Wall Street quantitative analysts, have only a vague notion of algorithms and how they work. But they actually affect our daily lives by a considerable amount. Algorithms are a set of instructions followed by computers to solve problems. The hidden algorithms of Big Data might connect you with a great music suggestion on Pandora, a job lead on LinkedIn or the love of your life on Match.com.

These mathematical models are supposed to be neutral. But former Wall Street quant Cathy O’Neil, who had an insider’s view of algorithms for years, believes that they are quite the opposite. In her book, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, O’Neil says these WMDs are ticking time-bombs that are well-intended but ultimately reinforce harmful stereotypes, especially of the poor and minorities, and become “secret models wielding arbitrary punishments.”

Models and Hunches

Algorithms are not the exclusive focus of Weapons of Math Destruction. The focus is more broadly on mathematical models of the world — and on why some are healthy and useful while others grow toxic. Any model of the world, mathematical or otherwise, begins with a hunch, an instinct about a deeper logic beneath the surface of things. Here is where the human element, and our potential for bias and faulty assumptions, creeps in. To be sure, a hunch or working thesis is part of the scientific method. In this phase of inquiry, human intuition can be fruitful, provided there is a mechanism by which those initial hunches can be tested and, if necessary, corrected.

O’Neil cites the new generation of baseball metrics (a story told in Michael Lewis’s Moneyball) as a healthy example of this process. Moneyball began with Oakland A’s General Manager Billy Beane’s hunch that using performance metrics such as runs batted in (RBIs) were overrated, while other more obscure measures (like on base percentage) were better predictors of overall success. Statistician Bill James began crunching the numbers and putting together models that Beane could use in his decisions about which players to acquire and hold onto, and which to let go.

While sports enthusiasts love to debate the issue, this method of evaluating talent is now widely embraced across baseball, and gaining traction in other sports as well. The Moneyball model works, O’Neil says, for a few simple reasons. First, it is relatively transparent: Anyone with basic math skills can grasp the inputs and outputs. Second, its objectives (more wins) are clear, and appropriately quantifiable. Third, there is a self-correcting feedback mechanism: a constant stream of new inputs and outputs by which the model can be honed and refined.

Read more, here

DSC Resources

Additional Reading

Follow us on Twitter: @DataScienceCtrl | @AnalyticBridge

Views: 2697

Comment

You need to be a member of Data Science Central to add comments!

Join Data Science Central

Follow Us

Videos

  • Add Videos
  • View All

Resources

© 2017   Data Science Central   Powered by

Badges  |  Report an Issue  |  Privacy Policy  |  Terms of Service