The nature of machine learning projects


An article by Michael Ohlsson about building a data-driven product. Building a data-driven product differs in many ways from creating a more conventional software product. A machine learning system is still a software system, but the process of developing it is different.

These differences are important for all stakeholders to understand, and a shared view of them is key to a project's success. In this post, the author briefly explains the machine learning process and why it calls for a different approach and mindset. To adopt this mindset, the most important thing to understand is that developing a machine learning system resembles a scientific process more than a traditional software development process. However, the complete solution still requires plenty of software engineering practice.

In traditional programming, you write down all the rules the program needs to accomplish a specific task and produce the desired result. The program takes some data as input, processes it according to those rules, and, hopefully, returns the correct result. By contrast, a machine learning system is trained rather than explicitly programmed. The input to training is not just the data but also the expected result for that data, and the output is a set of rules (called a model in machine learning vocabulary).
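The contrast can be made concrete with a toy sketch. This is not code from the article: the temperature data, the function names `is_hot_rule` and `train_threshold`, and the single-threshold "model" are all assumptions chosen for illustration, and real projects would typically use a machine learning library instead.

```python
# Traditional programming: the programmer writes the rule by hand.
def is_hot_rule(temp_c: float) -> bool:
    return temp_c >= 25.0  # threshold chosen by the programmer


# Machine learning (toy version): the rule is derived from data
# plus the expected result for that data.
def train_threshold(temps, labels):
    """Return the threshold that misclassifies the fewest training examples."""
    candidates = sorted(set(temps))
    best_threshold, best_errors = candidates[0], len(temps) + 1
    for t in candidates:
        # Count examples where predicting "hot if temp >= t" disagrees with the label.
        errors = sum((x >= t) != y for x, y in zip(temps, labels))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


# Hypothetical training data: inputs and their expected results.
temps = [10.0, 15.0, 20.0, 24.0, 26.0, 30.0, 35.0]
labels = [False, False, False, False, True, True, True]

# The "model" here is just the learned threshold.
learned = train_threshold(temps, labels)


def is_hot_model(temp_c: float) -> bool:
    return temp_c >= learned
```

In the first function the threshold is an input supplied by a person. In the second, the threshold is an output of training, so changing the labeled examples changes the rule without anyone editing the code.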

Machine learning is not like any other technology. Boiled down to its core components, the approach comes to a few important rules:

  • Create a common ground of understanding; this will ensure the right mindset
  • State early how progress should be measured
  • Communicate clearly how different machine learning concepts work
  • Acknowledge and consider the inherent uncertainty; it is part of the process
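As one possible reading of "state early how progress should be measured", a team might agree on a metric and a trivial baseline before any modelling starts. The sketch below is an assumption for illustration, not something taken from the article: the labels, the hypothetical model predictions, and the choice of accuracy against a majority-class baseline are all made up.

```python
from collections import Counter


def accuracy(predictions, labels):
    """Fraction of predictions that match the expected labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def majority_baseline(train_labels, n):
    """Predict the most common training label for every one of n examples."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n


# Hypothetical data, used only to show the comparison.
train_labels = [0, 0, 0, 1, 1]
test_labels = [0, 1, 0, 0, 1]
model_predictions = [0, 1, 0, 0, 0]  # hypothetical output of some trained model

baseline_acc = accuracy(majority_baseline(train_labels, len(test_labels)), test_labels)
model_acc = accuracy(model_predictions, test_labels)

print(f"baseline accuracy: {baseline_acc:.2f}")
print(f"model accuracy:    {model_acc:.2f}")
```

Reporting the model's score next to the baseline gives stakeholders a shared reference point, and the gap between the two also makes the remaining uncertainty visible.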

The article also provides resources for further reading. Great read!


Tags machine-learning big-data data-science