# Data Science: Do We Really Need Math?

It is sometimes said that you don't need to know math to be a data scientist. Sometimes the opposite is said; after all, data science is supposed to be a science! Regardless, below are a few of my articles showing how data science and math can benefit from each other: not just math to solve data science problems, but also data science to solve math problems.

Articles

Comment by APTRON Noida on June 15, 2021 at 3:29am

It’s very helpful for data scientists to have a solid understanding of the math and statistics behind those algorithms so they can choose the best algorithm for their problems and datasets and thus make more accurate predictions. So embrace the pain, and dive into the math! It’s not as tough as you think.

Thanks for Sharing.

Comment by Michael Morgan on January 22, 2020 at 4:51am

I can offer a couple of observations on this question that may help data scientists see the most immediate need: understanding just two basic mathematical concepts. These are not difficult to understand intuitively, but in data science, and in neural networks specifically, we have changed the statistical names to terms that data scientists, and computer scientists in particular, can use as handles for problems we run into. In a new blog, I’ll trace these in recent data science work, but here is a summary:

1.  Vanishing gradient.  This is the main reason for ongoing improvements and tweaks to NNs, especially the problems that emerge when trying to backpropagate from a near-vanishing gradient.  The true cause of this persistent problem is multicollinearity among parameters.  It is a statistical problem, but not hard to understand for those not trained in multivariate statistics.  Knowing this can help the NN analyst choose better mini-interventions than those currently popular.

2.  Too many parameters.  Even if multicollinearity is controlled through the right preprocessing of the data, or perhaps some sort of masking that stops the model from reusing data points that have already served up their information, the number of parameters can mushroom, precluding sensible inference and offering little help with new NN innovations.  The capability to interpret parameters may be secondary to predictive power, but the most successful NNs these days appear to incorporate some kind of generative component, such as adversarial tests of intermediate portions of the NN (as in GANs).
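
For readers who want to see the two points above in numbers, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the commenter's own analysis: the toy network is a chain of scalar sigmoid layers, and the helper names `chain_gradient` and `dense_param_count` are hypothetical. It only shows that gradients can shrink quickly with depth and that parameter counts grow quickly with layer width; it does not test the claim that multicollinearity is the underlying cause.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def chain_gradient(x: float, weights: list[float]) -> float:
    """Gradient of the output with respect to the input x for a toy chain
    of scalar layers a_k = sigmoid(w_k * a_{k-1}), with a_0 = x.

    By the chain rule the gradient is the product of the per-layer factors
    w_k * a_k * (1 - a_k). Each sigmoid derivative a_k * (1 - a_k) is at
    most 0.25, so the product tends to shrink as layers are added.
    """
    a = x
    grad = 1.0
    for w in weights:
        a = sigmoid(w * a)          # forward pass through one layer
        grad *= w * a * (1.0 - a)   # local derivative of that layer
    return grad


def dense_param_count(layer_sizes: list[int]) -> int:
    """Weights plus biases of a fully connected network with these layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


if __name__ == "__main__":
    # Assumed toy setting: input 0.5 and every weight equal to 1.0.
    for depth in (1, 5, 20):
        print(depth, chain_gradient(0.5, [1.0] * depth))
    # Assumed example architecture: 784 inputs, 128 hidden units, 10 outputs.
    print(dense_param_count([784, 128, 10]))
```

With all weights set to 1.0, the gradient after 20 layers is bounded above by 0.25 to the power 20, which is below 1e-12. The small 784-128-10 network already has 101,770 parameters, which gives a sense of how quickly the count grows as layers widen.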
