Roadblocks to Using Big Data: “First Mile” and “Last Mile” Challenges

LSST (Large Synoptic Survey Telescope)

In Part III of this Data Alchemy blog post, Kirk Borne, Ph.D., talks about the two biggest roadblocks to using big data in business. He also shares his views on big data as a critical asset to endeavors as diverse as exploring the vast reaches of space and helping businesses improve relationships with customers. Here he is in conversation with Anametrix CEO Pelin Thorogood.

Pelin: So what do you think is the biggest roadblock for big data analytics in marketing? Talk about the opportunities and areas of focus. What are the roadblocks?

Kirk: I call the two biggest roadblocks the “first-mile challenge” and the “last-mile challenge.” The first-mile challenge relates to the fact that we have large quantities of distributed data, so it’s the data integration challenge. In cross-channel discovery and customer engagement, how do we acquire, integrate and make use of that data from many, many different sources? Ideally, you can identify the same customer on different sites. But that’s actually one of the biggest difficulties, and it’s called entity disambiguation. Big word there. How do you know that this user or this customer is the same one you see on other sites? You want to be sure you market to the right person, and you want to make sure you are marketing correctly to their preferences, not someone else’s. That’s the first-mile challenge. The second roadblock I mentioned is the last-mile challenge. I’ll just say it’s how you get actionable intelligence from all of that data.
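To make the entity disambiguation question concrete, here is a minimal sketch in Python of deciding whether two customer records from different sites describe the same person. Nothing in it comes from the interview: the record fields (`name`, `email`, `postal_code`), the matching rule and the 0.85 threshold are all assumptions chosen for illustration, and a production system would use far more evidence than this.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(text.lower().split())


def name_similarity(a, b):
    """Similarity ratio in [0, 1] from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def same_customer(rec_a, rec_b, threshold=0.85):
    """Guess whether two records describe one person.

    Illustrative rule (an assumption, not a standard): an exact match on the
    normalized email is decisive; otherwise require a similar name AND the
    same postal code.
    """
    email_a = normalize(rec_a["email"])
    email_b = normalize(rec_b["email"])
    if email_a and email_a == email_b:
        return True
    return (
        name_similarity(rec_a["name"], rec_b["name"]) >= threshold
        and rec_a["postal_code"] == rec_b["postal_code"]
    )


# Hypothetical records as they might arrive from three different sites.
site_a = {"name": "Jane Q. Smith", "email": "Jane.Smith@example.com", "postal_code": "92101"}
site_b = {"name": "jane q smith", "email": "jane.smith@example.com ", "postal_code": "92101"}
site_c = {"name": "John Smith", "email": "jsmith@example.org", "postal_code": "92101"}

print(same_customer(site_a, site_b))  # same normalized email
print(same_customer(site_a, site_c))  # different email, dissimilar name
```

The first comparison matches on the normalized email; the second does not match, because the emails differ and the names are not similar enough. The hard cases Kirk describes sit between those two: records that share no identifier at all, or that share a name and location but belong to different people.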

Pelin: That is absolutely the biggest challenge. What does it mean? Do the people getting the data know how to interpret it? It’s why we’re in business. I think the problem with data analytics in general is that you can do a lot of stuff with it. You can create pretty charts and dashboards and KPIs. In the end, you have to ask: “Now what do I do with all this?” Unfortunately, it’s so much easier to just create a slew of reports than to create the two that would actually mean something for someone to take action on.

But let’s switch gears for a moment to talk about privacy, one of the biggest concerns in the use of data by government and business. What do you think needs to happen to balance the benefits of collecting and analyzing data with the importance of preserving privacy? 

Kirk: We’re trying to bring balance, and that means we need more visibility into how business and government are using people’s data. We all get these mailings from companies that say, “Here’s our annual privacy statement.” What we really need from a business, for example, is how the data are used, a message that says: “We sent you 27 emails and recommended these products to you. We sold your data to the following six companies because we think they sell products you might be interested in.” That’s the kind of insight that would allow me to determine whether to opt out. A couple of years ago, Facebook opened up what it called its social graph. The open graph protocol basically showed the connections, the likes, the links among the billion-plus people on Facebook. People were up in arms. All of a sudden, people thought: “Everything I’ve ever liked or posted is now going to be visible to everyone else in the world.”

On a practical level, no one in the world has time to look at all that anyway. But even on the more technical level, it shouldn’t be a concern because what Facebook is really sharing is the graphs. The nodes of the graph might be individual people but the names don’t appear. But I think it’s important to help people understand there’s a benefit here. They’re going to be offered more relevant products, discounts and opportunities, as a result of businesses using data. That’s one thing we’re missing in the discussion. There’s value, as well as concerns, although we clearly need to be aware of the potential for misuse. 

Pelin: Let me ask you a final question. Scientists recently announced the new “inflationary” theory of the origins of the universe that’s different from the standard big bang models. Where do you think big data fits in helping us understand the vastness of space, as well as other areas of discovery in which data sets are so extraordinarily large?

Kirk: That’s a big question. I’m actually working on a project involving a new large telescope, the LSST (Large Synoptic Survey Telescope), scheduled to start construction on July 1st. It’s a billion-dollar project to be built on a mountaintop in the Andes Mountains of Chile. It will basically capture a movie of the night sky. That’s the simplest way to say it, although it’s more technical than that. It’s going to take very deep images of everything that’s visible from its perch in the Southern Hemisphere. It’s going to take three days to do a complete map of the sky. Then it will repeat that again and again over a period of ten years. So during that period, it will record about 50 billion sources in the sky, measured 800 to 900 times over. That’s nearly 50 trillion entries in the database, each one containing about 200 different scientific parameters. That’s a lot of data.
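The scale Kirk describes can be checked with quick arithmetic. The sketch below uses only the figures quoted in the interview (about 50 billion sources, 800 to 900 measurements each, about 200 parameters per entry, a sky map every three days for ten years); the function and variable names are made up for illustration, and the 365-day year is a simplifying assumption.

```python
def catalog_size(sources, visits, params_per_entry):
    """Return (database entries, individual parameter values) for a survey."""
    entries = sources * visits
    values = entries * params_per_entry
    return entries, values


SOURCES = 50_000_000_000   # "about 50 billion sources"
PARAMS_PER_ENTRY = 200     # "about 200 different scientific parameters"

for visits in (800, 900):  # "measured 800 to 900 times over"
    entries, values = catalog_size(SOURCES, visits, PARAMS_PER_ENTRY)
    print(f"{visits} visits: {entries / 1e12:.0f} trillion entries, "
          f"{values / 1e15:.0f} quadrillion parameter values")

# Upper bound on full-sky passes: one map every 3 days for 10 years,
# assuming 365-day years and no downtime.
max_passes = (10 * 365) // 3
print(f"at most {max_passes} full-sky passes in ten years")
```

With these inputs the entry count comes out at 40 to 45 trillion, which is the order of magnitude behind the rounded figure of nearly 50 trillion, and the 800 to 900 measurements per source fit under the ceiling of roughly 1,200 full-sky passes.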

The new telescope is just an example of what we can see all around us in big data, whether it’s business or government or science. One of the most important values of big data is that we can now have a complete model, so to speak, of whatever we’re collecting the data on. For example, you can do a complete DNA sequence of a person’s genome. You now have access to the information you need to know whether that person will get grey hair at an early age. Whether they will lose their teeth. Whether they’ll get Alzheimer’s disease. Whether they’re susceptible to certain cancers. The problem is decoding it. The data is in ones and zeroes. It’s all bytes. How do you go from that data representation of knowledge to the actual knowledge itself?

Whether it’s our customers, a scientific problem or an investigation of “big bang” universes, we need all of the power of data science now to ask the right questions and analyze findings. That’s done just as any scientist would: propose a hypothesis, test that hypothesis and see what works. What explains what you see in a piece of scientific research? What makes the customer buy this product or another? What makes our universe tick the way it does?

Big data is changing everything. We live in a knowledge economy. We want to use data for discovery, for change where it is needed. The questions we can look at are just as relevant in business as in the exploration of space, improvements in healthcare or helping third-world economies to grow. Big data is the critical asset.

Pelin: Thank you, Dr. Borne. This has been an extraordinary opportunity to explore the breadth and depth of what data can mean to our work and success in almost every domain. We wish you well in all your many projects and interests.


Tags: Kirk Borne, analytics, anametrix, big, big data, data, predictive analytics, science

Comment by Kumar Chinnakali on June 2, 2014 at 4:45am

Hi Montano, thanks for sharing the conversation. The LSST is very interesting. If possible, could you please share blogs and information from Dr. Borne?