Data Science Central, forum topics posted by Donna Perygin (feed retrieved 2019-10-18)

Data science degree (2019-10-16)
<p>Dear forum members,</p>
<p></p>
<p>I have started working as a customer data insight analyst after working as a consultant in a different domain for 14 years.</p>
<p>I got this job because I know general SQL and Python and am formally educated in mathematics and computer applications.</p>
<p></p>
<p>My job involves customer churn analysis, and my company mostly uses Excel and Tableau. I am exploring a few Python libraries such as pandas, but because of the pressure to produce outputs I am not able to apply data science concepts like predictive analysis, and I end up working in Excel.</p>
<p></p>
<p>There is no data scientist in my company, and people are inclined to use Excel. I aspire to become a data scientist, but I am not formally educated in data science.</p>
<p></p>
<p>Can anyone tell me whether taking a data science degree would speed up my ability to apply data science techniques in my company?</p>
<p></p>
<p>Regards,</p>
<p>Lucky</p>
Diminishing returns in econometrics (2019-10-15)
<p>I was wondering if anyone here has much experience in building econometric models, specifically in calculating diminishing returns, as there are tonnes of different ways to go about this. For simplicity, I have previously used <span>an exponential decay, exp(-a*x), where a is the rate of diminishing returns and x is the rate of media spend. But there are many other ways to model this (e.g. linear-log models, Multiplicative Competitive Interaction), and I'd be interested to hear about people's experiences as to which of these have worked well.</span></p>
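As a minimal sketch of the exponential-decay form described above: a saturating response curve can be fitted to spend/sales pairs with `scipy.optimize.curve_fit`. The data here is synthetic and the parameter values (saturation level 200, decay rate 0.05) are arbitrary assumptions for illustration, not anyone's real media data.

```python
import numpy as np
from scipy.optimize import curve_fit

# Saturating response: sales approach b as spend grows,
# with a controlling how quickly returns diminish.
def response(x, b, a):
    return b * (1 - np.exp(-a * x))

rng = np.random.default_rng(0)
spend = np.linspace(0, 100, 50)
sales = response(spend, 200.0, 0.05) + rng.normal(0, 5, spend.size)

(b_hat, a_hat), _ = curve_fit(response, spend, sales, p0=[100.0, 0.01])
print(f"saturation level ~ {b_hat:.1f}, decay rate ~ {a_hat:.3f}")
```

The fitted a then plays the role of the diminishing-returns rate; linear-log and MCI forms would swap out the `response` function while keeping the same fitting step.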
Recommendation on a data visualization book (2019-09-30)
<p>I was looking for the best data visualization book that I should have. Any recommendations? Thanks in advance.</p>
Optimization algo (2019-09-26)
<p>Hi all.<br/>Reading an article about Elo ratings, I have a question. The probability that team A wins is a sigmoid function like 1 / (1 + exp(RankB - RankA)),<br/>and after the game we update the ratings as Rank_new = Rank_old + K*(outcome - probability), where outcome is 1 for a win and 0 for a loss.<br/><br/>So the main question is how I can use, for example, a neural network (or another algorithm) to find the "K" parameter that makes the binary cross-entropy minimal. And I hope it need not be constant (I want to make it depend on the initial player rating).<br/><br/>My main problem is that after updating the ratings we need to use the new ratings as input for calculating the next probability, so every epoch the input has to be updated.</p>
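One hedged way around the "ratings change as you replay" problem is to treat the whole replay as a function of K and search over K directly, rather than backpropagating through the history. The sketch below uses the conventional /400 logistic scaling (an assumption; the post's formula omits it) and a simple 1-D grid search instead of a neural network; the toy game history and player names are invented for illustration.

```python
import math, random

def simulate_loss(K, games, init_rating=1500.0):
    """Replay a game history with update constant K and return the
    mean binary cross-entropy of the pre-game win predictions."""
    ratings = {}
    total = 0.0
    for a, b, outcome in games:          # outcome: 1 if a wins, else 0
        ra = ratings.get(a, init_rating)
        rb = ratings.get(b, init_rating)
        p = 1.0 / (1.0 + math.exp((rb - ra) / 400.0))
        total += -(outcome * math.log(p) + (1 - outcome) * math.log(1 - p))
        ratings[a] = ra + K * (outcome - p)   # winner/loser move symmetrically
        ratings[b] = rb - K * (outcome - p)
    return total / len(games)

# Toy history: "strong" beats "weak" about 80% of the time.
random.seed(1)
games = [("strong", "weak", 1 if random.random() < 0.8 else 0)
         for _ in range(500)]

# 1-D search over K; a K(initial_rating) function could be searched
# the same way, just over more parameters.
best_K = min(range(1, 101), key=lambda K: simulate_loss(float(K), games))
print("best K:", best_K)
```

Every candidate K gets its own full replay, which is exactly the "update the input every epoch" loop described in the post, just made explicit.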
Insight in data (2019-09-18)
<p>I have a situation with a client. They have 4 sources of data and they want to create a single metric out of these four values to gain a generalised insight into how the company is going overall.</p>
<p></p>
<p>The problem is that each source has a completely different scale, and they are not really comparable.</p>
<p></p>
<p>Source A has a scale in the millions whereas Source B's scale is in the hundreds.</p>
<p></p>
<p>Further to this, we wanted to weight each source, as some provide more value than others.</p>
<p></p>
<p>We decided to scale all four between 0 and 1 using this formula:</p>
<p>z<sub>i</sub> = (x<sub>i</sub> − min(x)) / (max(x) − min(x))</p>
<p>and while it works, I am confused as to what insight I can get out of the numbers.</p>
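The min-max formula above can be written in a few lines; the source values below are made up to mirror the "millions vs hundreds" situation described.

```python
import numpy as np

def min_max_scale(x):
    """Map values linearly onto [0, 1]: z_i = (x_i - min) / (max - min)."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

source_a = [1_200_000, 3_500_000, 2_000_000]   # scale in the millions
source_b = [150, 900, 400]                      # scale in the hundreds
print(min_max_scale(source_a))
print(min_max_scale(source_b))
```

One caveat worth keeping in mind: each scaled value is a position relative to that source's own observed min and max, so ratios of scaled scores are not ratios of the underlying quantities.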
<p></p>
<p>Here is the google sheet I am preparing with</p>
<p><a href="https://docs.google.com/spreadsheets/d/1Eua7tmqD3B0l3M04QnXDcU5HCAmFIfP65lsA7l52604/edit?usp=sharing">https://docs.google.com/spreadsheets/d/1Eua7tmqD3B0l3M04QnXDcU5HCAmFIfP65lsA7l52604/edit?usp=sharing</a></p>
<p></p>
<p>If you look at cells H14 and H15, can you say that March was 3 times worse than Feb because the March score was 1.1 and the Feb score was 3.2?</p>
<p></p>
<p>Thanks in advance</p>
Cleaning responses to meet quotas after sampling (2019-09-15)
<p>I know that survey sampling is usually done so that once a quota is reached, the survey is closed to respondents who would meet the criteria for that quota.</p>
<p>However, at the company I work at, the survey is open for everyone until every demographic quota is met; and only after that do we start deleting responses until the quotas are met. So for example if we need 500 cases (250 females and 250 males) and we closed the survey with 532 responses that have 273 females and 259 males, we delete 23 female and 9 male responses. It sounds easy, but most studies have 3-4 demographic quotas (e.g. gender, age group, region, settlement type), and it is really difficult and time-consuming to figure out what cases I have to delete to meet the quotas.</p>
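The trimming step described above can be sketched with pandas: group responses by the quota variables and randomly keep only the target count in each cell. The column name, quota targets, and data here are hypothetical, modelled on the 273/259 example.

```python
import pandas as pd

def trim_to_quotas(df, quotas, cols, seed=0):
    """Randomly drop surplus rows so each cell defined by `cols`
    ends up at its target count in `quotas` (a dict keyed by cell tuple)."""
    kept = []
    for cell, group in df.groupby(cols):
        if not isinstance(cell, tuple):      # older pandas yields scalar keys
            cell = (cell,)
        target = quotas.get(cell, len(group))
        kept.append(group.sample(n=min(target, len(group)), random_state=seed))
    return pd.concat(kept).sort_index()

# Hypothetical single-quota example: 273 F + 259 M trimmed to 250 each.
df = pd.DataFrame({"gender": ["F"] * 273 + ["M"] * 259})
trimmed = trim_to_quotas(df, {("F",): 250, ("M",): 250}, ["gender"])
print(trimmed["gender"].value_counts().to_dict())
```

Note that this handles quotas on full cells; when the targets are separate marginal quotas on 3-4 overlapping demographics, the cells interact and a per-cell trim may not satisfy every margin at once, which is closer to an integer-programming or raking problem.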
<p>Is there any way, or any software, that would automatically calculate which cases should be deleted?</p>

Predictive Analysis (2019-09-08)
<p>Hi Team,</p>
<p>I have started learning and practicing Data Science, and I now feel I am OK up to data cleaning.</p>
<p>Now I want to learn the basics and techniques for predicting based on the data-set we have cleaned so far.</p>
<p>Any lead on this will be very helpful.</p>
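A minimal sketch of predicting from a cleaned data-set, assuming scikit-learn and its bundled iris data: fit a model on one part (the training data) and score it on the held-out part (the test data), which stands in for future, unseen records.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 25% of rows as test data the model never sees during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")
```

The split is the whole point: accuracy on data the model trained on is optimistic, while accuracy on the test set estimates how it will do on new data.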
<p>Also, when I search for Predictive Analysis, I keep coming across Test Data / Train Data... but I am not yet clear on this concept. And which tool can I use to predict the data?</p>

Passing NaN values to ML Algorithm (2019-08-28)
<p>Suppose I have 10 independent variables, and I intentionally didn't remove the NaN values from a few of them, moved the data to NumPy, and then passed it to the ML algorithm. Will some of the algorithms give an error? Can ML algorithms (decision trees, SVMs, etc.) handle NaN values? And if an algorithm doesn't give an error, how will those NaN values be treated/handled internally?</p>

Input to PCA (2019-08-27)
<p>Suppose I have 20 independent variables and I am thinking of going for PCA. Do we need to scale all of these 20 independent variables, or will PCA handle it? And I hope the output of PCA will be scaled features...</p>
Best practice during Data preparation (2019-08-23)
<p>Hello All,</p>
<p>I am new to Data Science, and I wanted to know what the best practices during data preparation are.</p>
<p>For example, converting an integer into a category: is it good practice to categorize the data?</p>
<p>E.g. the data contains an Age column. Is it good practice to club ages into different categories?</p>
<p>Please let me know if there are any articles I can refer to for the same.</p>
<p></p>
<p>Thanks</p>
<p>Nitish</p>
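The age-clubbing idea above can be sketched with pandas; the bin edges and labels here are arbitrary assumptions, and in practice they should come from the problem domain.

```python
import pandas as pd

df = pd.DataFrame({"age": [22, 35, 47, 61, 19, 73]})

# Bucket a continuous age into ordered categories with pd.cut;
# each bin is half-open on the left, e.g. (25, 45].
df["age_group"] = pd.cut(
    df["age"],
    bins=[0, 25, 45, 65, 120],
    labels=["<=25", "26-45", "46-65", "65+"],
)
print(df)
```

Whether to bin at all is a modelling choice: binning loses within-bin information but can help with non-linear effects and with models or reports that expect categories.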