Comments - 10 Modern Statistical Concepts Discovered by Data Scientists - Data Science Central2021-07-25T09:55:52Zhttps://www.datasciencecentral.com/profiles/comment/feed?attachedTo=6448529%3ABlogPost%3A251563&xn_auth=noYou say " I am not aware of a…tag:www.datasciencecentral.com,2019-02-06:6448529:Comment:7998672019-02-06T07:07:45.457ZKlaus Wassermannhttps://www.datasciencecentral.com/profile/KlausWassermamm
<p>You say " <span>I am not aware of any statistical science contribution to data science, but if you know one, you are welcome to share".</span></p>
<p><span>I cannot imagine a more arrogant statement. It looks like the statement of a 10-year-old after his third "math" class, who can do basic calculations and calls it "math". The same kind of behavior abounded 20 years ago when machine learning was hyped. Today, checking what is called "data science", it is not an iota further.</span></p>
<p><span>Boy, you are embedded in a culture and a cultural field of transmitting ideas. Denying this is simply stupid.</span></p> Thank you very much, really g…tag:www.datasciencecentral.com,2016-06-20:6448529:Comment:4378842016-06-20T17:22:44.028ZSander Stepanovhttps://www.datasciencecentral.com/profile/SanderStepanov
<p>Thank you very much, really great material!!</p>
<p>Could you please share the material from #2, <a href="http://www.datasciencecentral.com/page/search?q=bucketization" target="_blank">Bucketization</a><span>? The link seems to be broken.</span></p> Well and diplomatically said …tag:www.datasciencecentral.com,2015-07-31:6448529:Comment:3062272015-07-31T06:51:31.102ZDr Vincent Micalihttps://www.datasciencecentral.com/profile/DrVincentMicali
<p>Well and diplomatically said Prof Hart</p>
<p>Dear Vincent,</p>
<p>Having served as a Statistician analysing Data for 40 years, I find a touch of arrogance in the statement that Statisticians "... they know everything". For the lucid Data Scientists (if you permit, I safely consider myself one), it is indeed a contradiction, since Statisticians (whether Bayesians or non-Bayesians) know very well that the probability of "knowing everything" is zero. So, perhaps I should take the prior of Carlos to inform my predictive probability that you made those statements "tongue in cheek" to spur a discussion. If the conditional is true, then well done: you succeeded; if it's false, then you should seriously revisit your statements and back them up with scientific evidence; perhaps a good starting point is the works of Sir Harold Jeffreys.</p>
<p>Cheers and take care</p>
<p>Dr Vincent Micali</p>
<p>MSc (Warwick), PhD (UFS)</p> Dear Vincent,
"unaware of an…tag:www.datasciencecentral.com,2015-03-06:6448529:Comment:2556522015-03-06T23:10:20.022ZCarlos Ayahttps://www.datasciencecentral.com/profile/CarlosAya
<p>Dear Vincent,</p>
<p></p>
<p>"<span>unaware of any statistical science contribution to data science</span>" -> Tongue in cheek, right :) ?</p>
<p></p>
<p>I do acknowledge that data abundance and (more importantly) the universal availability of computers have posed a tremendous challenge to mathematics and statistics. Anyone can modify an existing algorithm that fails, and "make it work" for his/her particular case - or even invent "new" ones.</p>
<p></p>
<p>But this does not mean it is science... in the sense that showing that something works, even in many situations, does not explain why it works or how it could fail (i.e. what underlying assumptions are required for it to work).</p>
<p></p>
<p>Yes, it is published as research - but, believe me, it is more an _unsolved problem_ for a mathematician or statistician than well founded finished research.</p>
<p></p>
<p>If you allow me the analogy, it is like ancient Babylon, where it was common knowledge among builders that certain Pythagorean triangles existed, but it required formal geometry to explain what was really going on.</p>
<p></p>
<p>So, you want recent contributions? Here is one: search for "functional data analysis" in Google Scholar ... enjoy :)</p>
<p></p>
<p>Anyway, data scientists and everybody ... yes, keep using computers, your challenges are welcome (but even better if you join the "theoretical" camp and help to explain why...)</p>
<p></p>
<p>Kind regards</p>
<p>Carlos</p>
<p></p>
<p></p> G'day Vincent:
as always I f…tag:www.datasciencecentral.com,2015-03-02:6448529:Comment:2539432015-03-02T17:52:24.160ZGeorge F. Harthttps://www.datasciencecentral.com/profile/GeorgeFHart
<p style="margin-bottom: 0in; line-height: 100%;">G'day Vincent:</p>
<p style="margin-bottom: 0in; line-height: 100%;"></p>
<p style="margin-bottom: 0in; line-height: 100%;">As always, I find your comment both interesting and insightful. I have always regarded myself as a data analyst – even before the modern idea was invented [I'm 80]. I have also applied statistical reasoning to data analysis for the past 60 years. I do agree that data scientists have contributed significantly to statistics in the sense I understand the field. However, to be 'unaware of any statistical science contribution to data science' must have been ghost-written. It is not you as I have read you over the past few years! I assume you are being provocative simply to get something going – which is my own method of teaching.</p>
<p style="margin-bottom: 0in; line-height: 100%;"></p>
<p style="margin-bottom: 0in; line-height: 100%;">Mathematical methods have nothing to do with it. The underlying concepts of statistics are what statistical science has contributed to data analysis.</p>
<p style="margin-bottom: 0in; line-height: 100%;">Start with:</p>
<p style="margin-bottom: 0in; line-height: 100%;"><a name="magicparlabel-7721" id="magicparlabel-7721"></a> “ <em>what is the chance of a random sample taken from a location being simply a variant within the population of interest, or alternatively, that it is from a totally different population”.</em></p>
<p style="margin-bottom: 0in; line-height: 100%;"></p>
<p style="margin-bottom: 0in; line-height: 100%;"><em><span style="font-style: normal;">Or the application of basic functions:</span></em></p>
<p style="margin-bottom: 0in; line-height: 100%;"><em>“mean(), median(), sd(), var(), min(), max(), range(), summary() , sort(), order(), rank() , exp(), log(), sin(), cos(), tan() [radians] , length() , rev() , sum(), cumsum(), prod(), cumprod(), round(), ceil(), floor(), signif() , which(), which.max() , any(), all(),</em> and <em>mode()”</em></p>
<p style="margin-bottom: 0in; line-height: 100%;"></p>
<p style="margin-bottom: 0in; line-height: 100%;">Or the basic model:</p>
<p style="margin-bottom: 0in; line-height: 100%;"><a name="magicparlabel-12088" id="magicparlabel-12088"></a> <em>“<b>Y = (something) + (error of measurement)</b></em>, where <b>Y</b> is said to be the <em>dependent variable</em> that is being measured, and (something) is some relationship among the so-called <em>independent variables</em> that control or predict <b>Y”</b></p>
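<p style="margin-bottom: 0in; line-height: 100%;">As an illustration of the basic model quoted above, here is a minimal Python sketch. It assumes the simplest possible "(something)": a straight line in one independent variable, with Gaussian measurement error. The function names <code>simulate</code> and <code>fit_line</code>, and the chosen intercept, slope and noise level, are hypothetical and not from the original comment.</p>

```python
import random


def simulate(n=50, intercept=2.0, slope=3.0, sigma=0.5, seed=0):
    """Generate data from Y = (intercept + slope * x) + (error of measurement)."""
    rng = random.Random(seed)
    xs = [float(i) for i in range(n)]
    # The error term is assumed Gaussian with standard deviation sigma.
    ys = [intercept + slope * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares estimate of (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


xs, ys = simulate()
est_intercept, est_slope = fit_line(xs, ys)
print(f"estimated intercept={est_intercept:.3f}, slope={est_slope:.3f}")
```

<p style="margin-bottom: 0in; line-height: 100%;">The estimates should land close to, but not exactly on, the values used to generate the data; the gap is the footprint of the error term.</p>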
<p style="margin-bottom: 0in; line-height: 100%;"></p>
<p style="margin-bottom: 0in; font-weight: normal; line-height: 100%;">Or</p>
<p style="margin-bottom: 0in; line-height: 100%;"><a name="magicparlabel-12096" id="magicparlabel-12096"></a> “ rejecting the hypothesis vs failing to reject the hypothesis”.</p>
<p style="margin-bottom: 0in; line-height: 100%;">“To judge the reliability of any experimental result it must be compared with an estimate of its error, i.e. a test of significance. The test of significance separates the subjective guess from fact [more correctly the failure to reject a hypothesis pertaining to a fact].”</p>
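<p style="margin-bottom: 0in; line-height: 100%;">One way to make "rejecting vs failing to reject" concrete is a permutation test, sketched below in Python. This is only one of many possible significance tests and is not necessarily the one the quoted text has in mind; the function names, the sample values and the 0.05 threshold are assumptions chosen for illustration.</p>

```python
import random
from statistics import mean


def permutation_test(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def decide(p_value, alpha=0.05):
    """Note the wording: we never 'accept' the null hypothesis."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


site_a = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]   # made-up measurements
site_b = [12.0, 11.8, 12.3, 12.1, 11.9, 12.2]  # made-up measurements
print(decide(permutation_test(site_a, site_b)))
```

<p style="margin-bottom: 0in; line-height: 100%;">The null hypothesis here is that both samples come from the same population, which also matches the "variant within the population vs a totally different population" question quoted earlier.</p>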
<p style="margin-bottom: 0in; line-height: 100%;"></p>
<p style="margin-bottom: 0in; line-height: 100%;">Or:</p>
<p style="margin-bottom: 0in; line-height: 100%;"><a name="magicparlabel-16455" id="magicparlabel-16455"></a> “The innate control of error by multiple replication”. This provides a major advantage to, and is a principal reason for, the success of modern 'big data' analysis. It leads to the idea that in data analysis we are dealing with the total population, not a statistical sample [we both know that is not true, but it suffices to justify what is done].</p>
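<p style="margin-bottom: 0in; line-height: 100%;">A small simulation can show how replication controls error. The Python sketch below assumes independent Gaussian measurements, in which case the spread of a sample mean shrinks roughly as sigma / sqrt(n); the function name <code>spread_of_sample_means</code> and all parameter values are hypothetical.</p>

```python
import random
from statistics import fmean, pstdev


def spread_of_sample_means(n, trials=300, sigma=1.0, seed=0):
    """Standard deviation of the mean of n replicates, estimated by simulation."""
    rng = random.Random(seed)
    means = [
        fmean([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    return pstdev(means)


for n in (4, 400):
    # Theory under these assumptions predicts about sigma / sqrt(n).
    print(n, round(spread_of_sample_means(n), 3))
```

<p style="margin-bottom: 0in; line-height: 100%;">Going from 4 to 400 replicates should cut the spread by about a factor of ten. Note that this only addresses random error: replication does nothing for a biased sample, which is why "the total population" remains a convenient fiction.</p>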
<p style="margin-bottom: 0in; line-height: 100%;"></p>
<p style="margin-bottom: 0in; line-height: 100%;"><a name="magicparlabel-25174" id="magicparlabel-25174"></a> I could go on, but you and most of your readers know this stuff already. Data analysis has grown, but it still has the underpinnings of statistical analysis. For the future of statistical analysis I advise keeping a close eye on deep learning methods.</p>
<p style="margin-bottom: 0in; line-height: 100%;"></p>
<p style="margin-bottom: 0in; line-height: 100%;">Luv and kisses as always,</p>
<p style="margin-bottom: 0in; line-height: 100%;"></p>
<p style="margin-bottom: 0in; line-height: 100%;">George Hart,</p>
<p style="margin-bottom: 0in; line-height: 100%;">Professor emeritus,</p>
<p style="margin-bottom: 0in; line-height: 100%;">LSU.</p> There's a good article on ran…tag:www.datasciencecentral.com,2015-02-23:6448529:Comment:2521742015-02-23T16:45:42.565ZSione Paluhttps://www.datasciencecentral.com/profile/SioneKPalu
<p>There's a good article on random number generation by Prof. Cleve Moler from MathWorks here:</p>
<p></p>
<p><a href="http://www.mathworks.com/tagteam/9674_randomthoughts.pdf" target="_blank">http://www.mathworks.com/tagteam/9674_randomthoughts.pdf</a></p>