Author: Marcos Sponton
A few comments for those who are about to invest in a Machine Learning-intensive project
During a conversation I had with Peter Norvig, we discussed the kind of projects that we do at Machinalis and how strange it feels to say that "we are a Machine Learning company": in many projects, the effort spent on Machine Learning R&D is a small fraction of the total effort, or it's not even there because we plan it for a future phase, after building the application first. At that point, Norvig quoted a friend of his who said:
"Machine Learning development is like the raisins in a raisin bread: 1. You need the bread first 2. It's just a few tiny raisins, but without them you would just have plain bread."
So how do companies purchase this raisin bread?
Typically, two things can happen:
A large company, at a higher level, decides to move away from standard tools, given that "our business and our data are different/peculiar", incorporating Machine Learning or Data Science into their processes. In these situations they call us just to get some raisins.
A startup, after a few funding rounds, decides to turn the prototype they built into a production-ready, scalable MVP. In these situations they call us because they want the whole raisin bread, or perhaps some other kind of pudding which will need raisins at some point in the future.
Even if eating this is trendy, many people are not used to the new flavour, and it might not sit well with them.
The trick is not buying "the Machine Learning thingy..."
- … realize that by doing so you're not getting into a typical software development project, but a *research* and development one. Even for the better-known problems, which do not strictly follow this rule, the typical level of uncertainty is higher than in traditional development projects, and much higher if the expectations about the deliverable involve identifying, designing and/or achieving specific values for metrics like precision or accuracy.
- … understand that there can be a large gap between the research paper that says "it's possible" and the production-ready software, sometimes so large that it can't be bridged today.
- … view the predictive models created in the process as an asset. Your competitors don't have access to this model: it is a piece of productive and potentially scalable knowledge which emerged from your very own combination of data and constraints. The team working on this is not wasting time as a "necessary evil", but investing effort in the main asset of the process.
- … do not expect that Machine Learning will be a black box that will produce magical results. A central principle of the process is "garbage in, garbage out" and if your goals are not clear, Machine Learning won't make them any clearer.
- … if you have a business background, you're willing to learn some Machine Learning concepts that allow you to communicate to the team a detailed vision of the problem you want to tackle from the business perspective. The technical team will respond in kind by learning about the business side. You don’t want to work as separate silos and, at the end of the project, get a result that’s attractive from the scientific standpoint but insignificant to your business (which means that you spent thousands and wasted time that your competitors have used better).
- … consider that perhaps with 20% of the effort using standard and out-of-the-box algorithms you may get 80% of the expected results for the set metrics. Improving that result can be the remaining 80% of the effort... or even more.
- … you’re aware that your Machine Learning team is usually pretty expensive, and you don’t want them having to deal with integrating their work with the existing production pipeline.
- … your team is ready to start working with probabilistic results.
- … do not buy it just because it’s popular: startups in their early stages often do not need a full raisin bread, but just a bread prototype and perhaps a sample raisin, to check whether there's interest on their street for a bakery.
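To make the points about metrics like precision and about "probabilistic results" a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made up for illustration: the scores, the labels and the `precision_recall` helper do not come from any real project or library. It only shows that a model typically hands you a probability, and that someone on the business side has to decide where to draw the line.

```python
# Hypothetical data: (model's estimated probability of "positive", true label).
# These numbers are invented for illustration only.
SCORED_EXAMPLES = [
    (0.95, 1), (0.90, 1), (0.80, 0), (0.70, 1),
    (0.55, 0), (0.40, 1), (0.30, 0), (0.10, 0),
]


def precision_recall(scored, threshold):
    """Return (precision, recall) when scores >= threshold count as positive."""
    tp = sum(1 for p, y in scored if p >= threshold and y == 1)  # true positives
    fp = sum(1 for p, y in scored if p >= threshold and y == 0)  # false positives
    fn = sum(1 for p, y in scored if p < threshold and y == 1)   # missed positives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# The same model gives different "results" depending on the chosen threshold.
for threshold in (0.2, 0.5, 0.85):
    precision, recall = precision_recall(SCORED_EXAMPLES, threshold)
    print(f"threshold={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
```

On this made-up data, raising the threshold improves precision and lowers recall. Which trade-off is acceptable is a business decision, not something the model can settle on its own.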
Do you think you can overcome all these obstacles? Congratulations, you have a good chance of enjoying a nice raisin bread in your business.
(*) At this point, it makes no difference whether you’re about to start a ‘Data Science’, ‘Big Data’ or ‘Artificial Intelligence’ project instead of a ‘Machine Learning’ one, if you don’t see the difference among those terms.