
Why do some traditional engineers not trust Data Science?

Introduction

Some time ago, I had a conversation with an engineer who came from a traditional background.

By that I mean he had worked in the same industry (heavy engineering) for 30 years.

Of these, he had spent 25 years at the same company (and this was his second job).

 

After I explained data science to him, he said that, as an engineer, he did not trust data science.

I was curious and tried to understand more about what he meant.

He elaborated that he did not trust anything that was not based on empirical experiments.

 

In his world, to understand the behaviour of a phenomenon, you needed to carry out a physical experiment to model that phenomenon and only then could you predict it.

 

An example of this approach is the wind tunnel.

Image source - wind tunnel at NASA

So, as someone who teaches AI / machine learning at the University of Oxford, it made me think that this approach has some advantages for introducing people to AI/ML.

 

Specifically:

  1. We could use physics-based modelling to lead to mathematical modelling
  2. From mathematical modelling, we could introduce machine learning
  3. Thus, we could introduce machine learning and deep learning to engineers based on knowledge that they already have

I elaborate on this idea below.

This approach forms a part of my forthcoming book. If you are interested in knowing more, please connect with me.

What is mathematical modelling?

A mathematical model describes a system in terms of mathematical constructs.

The process of developing a mathematical model is termed mathematical modelling.

Mathematical models are widely used in the natural sciences, engineering and social sciences.

Once a system is modelled, its behaviour can be explained by the model and we can make predictions about its future behaviour.

Mathematical models can be implemented using many mathematical constructs such as statistical models, differential equations, game-theoretic models, etc.

The model itself could be made up of variables, control variables, governing equations, assumptions, constraints, initial and boundary conditions.

Mathematical models are usually composed of relationships and variables.

Relationships can be described by operators such as algebraic operators, functions, differential operators, etc.

The variables represent some properties of the system.  

The actual model is a set of functions that describe the relations between the different variables.

Models describe our beliefs about how the world functions.

Since prehistoric times, simple models such as maps and diagrams have been used to capture the state of a system.

When models are built using maths, the structure of mathematics introduces a formal rigour.

However, any modelling, including mathematical modelling, involves making compromises.

These compromises can arise from two main areas:

  1. The scope of the system – The system comprises the area of interest for our study. Real-world phenomena are often too complicated to model in their entirety. Hence, the definition of a system involves the scope of the features being selected for the study and the boundary conditions. These involve a compromise in building the model, depending on which factors we choose to include or ignore.
  2. Mathematical and computational feasibility – Since the objective of the model is to produce a result, the model must be computationally feasible. This could lead to some compromises in modelling.

Similarities and differences between machine learning and mathematical modelling

The steps for mathematical modelling typically include:

  • Specify the problem statement in terms of what you want to study (optimise / predict / determine / classify / understand).
  • Set up a mathematical model for the problem based on variables, assumptions and constraints. Formulate an equation to relate the variables.
  • Solve the equation which represents the model.
  • Improve the model by incorporating additional considerations and constraints to the baseline model.

These steps are similar to those followed in machine learning.

Consider the problem of predicting the trajectory of a ball.

Image source - lumen learning

Knowing the initial position, the initial velocity and the launch angle, it is easy to determine the position of the ball at any point in time.

This problem can be solved using basic high school equations by modelling the physics.
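For reference, a minimal sketch of those equations, assuming no air resistance and a constant gravitational acceleration g. The symbols x_0 and y_0 (initial position), v_0 (initial speed) and \theta (launch angle) are notation chosen here for illustration; they are not taken from the original figure.

```latex
\begin{aligned}
x(t) &= x_0 + v_0 \cos(\theta)\, t \\
y(t) &= y_0 + v_0 \sin(\theta)\, t - \tfrac{1}{2} g t^2
\end{aligned}
```

Under these assumptions, the horizontal position grows linearly with time while the height follows a parabola, so the whole trajectory is fixed once the initial conditions are known.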

However, the same problem (determining the trajectory of a ball) can be formulated in two ways:

1) As a deterministic equation that models the problem based on a theoretical understanding, as we have seen above, or

2) By learning from many past examples of trajectories and using them to predict a new one.

In both cases, we are trying to model the problem as an equation and solve the equation.

In the first case, we are modelling the problem as a physical / empirical problem and determining the equation through an experiment.

In the second case, we formulate the problem based on historical understanding (past examples).

 

If we have a direct theoretical understanding of the problem, the mathematical modelling approach works best (assuming that the equations are solvable in a reasonable time and at a reasonable cost).

However, if you have no direct knowledge about the behaviour of a system, you cannot formulate a mathematical model to describe it and make accurate predictions.

In the latter case, you could still model the behaviour of the system as long as you have a large number of examples of outcomes.

Given enough examples of outcomes in the training data, you can learn the underlying pattern of the system, from which you can infer future behaviour.

This idea is, of course, very familiar to us in machine learning, for example in predicting house prices from historical data.
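Returning to the ball example, the contrast between the two formulations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "historical" observations are simulated from a drag-free physics model with added noise (a stand-in for real measurements), the learner is a plain least-squares quadratic fit rather than any particular machine learning library, and all function names and numeric values are hypothetical.

```python
import math
import random

G = 9.81  # gravitational acceleration in m/s^2 (assumed constant)


def physics_height(t, y0, v0, angle_deg):
    """Deterministic model: height of a drag-free projectile at time t."""
    vy = v0 * math.sin(math.radians(angle_deg))
    return y0 + vy * t - 0.5 * G * t * t


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_quadratic(ts, ys):
    """Least-squares fit of y = c0 + c1*t + c2*t^2 via the normal equations."""
    powers = [sum(t ** k for t in ts) for k in range(5)]
    a = [[powers[i + j] for j in range(3)] for i in range(3)]
    b = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(3)]
    return solve_linear(a, b)


def predict(coeffs, t):
    """Evaluate the learned quadratic at time t."""
    c0, c1, c2 = coeffs
    return c0 + c1 * t + c2 * t * t


# Simulated "historical" observations: noisy heights of one throw.
# The learner only sees (time, height) pairs, never the physics.
rng = random.Random(0)
ts = [0.05 * i for i in range(41)]  # 0 s to 2 s
ys = [physics_height(t, 1.0, 20.0, 45.0) + rng.gauss(0.0, 0.05) for t in ts]

coeffs = fit_quadratic(ts, ys)
learned = predict(coeffs, 1.5)
exact = physics_height(1.5, 1.0, 20.0, 45.0)
print(f"learned: {learned:.2f} m, physics: {exact:.2f} m")
print(f"g implied by the fit: {-2 * coeffs[2]:.2f} m/s^2")
```

The fitted curve is chosen to be quadratic, which is itself a modelling assumption; with no prior knowledge of the system, a more flexible learner and more data would typically be needed.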

Conclusions

The two approaches, i.e. mathematical modelling based on physics and learning from past observations (machine learning), complement each other.

In both cases, you are formulating a model based on an equation.

If we have a direct theoretical understanding of the problem, the mathematical modelling approach works.

However, if you have no direct knowledge about the behaviour of a system, you cannot formulate any mathematical model to describe it and make accurate predictions.

In the latter case, you could still model the behaviour of the system as long as you have a large number of examples of outcomes.

Given enough examples of outcomes in the training data, you can learn the underlying pattern of the system, from which you can infer future behaviour.

In both cases, you need the equations to be solvable in a reasonable time and at a reasonable cost.

From a learning standpoint, we could use this approach to introduce more engineers to machine learning and deep learning based on knowledge that they already have.

 


 

Notes:

I use physics-based modelling and mathematical modelling interchangeably.

In both cases, I use the terms to mean that we are modelling a phenomenon empirically.


 

 

 
