Andrea Manero-Bastin's Posts - Data Science Central (2020-08-14)
https://www.datasciencecentral.com/profile/AndreaManeroBastin

Top 6 Essential Skills for Data Scientists (2020-08-01)
<p><i>This article was written by </i><span><i>Ronald Van Loon.</i></span></p>
<p><a href="https://storage.ning.com/topology/rest/1.0/file/get/7293003671?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/7293003671?profile=RESIZE_710x" class="align-center"/></a></p>
<p>Ronald lists the following 6 skills:</p>
<ul>
<li>Programming - <a href="https://www.datasciencecentral.com/page/search?q=python" target="_blank" rel="noopener">Python</a>, Perl, R, C/C++, SQL, and Java </li>
<li>Knowledge of SAS and Other Analytical Tools - Hadoop, Spark, Hive, Pig, Tableau</li>
<li>Adept at Working with <a href="https://www.datasciencecentral.com/profiles/blogs/5-easy-steps-to-structure-highly-unstructured-big-data" target="_blank" rel="noopener">Unstructured Data</a></li>
<li>A Strong Business Acumen</li>
<li>Strong Communication Skills</li>
<li>Great <a href="https://www.datasciencecentral.com/profiles/blogs/insight-driven-vs-intuition-driven-decision-making" target="_blank" rel="noopener">Data Intuition</a></li>
</ul>
<p><i>To read the rest of the article, click <a href="https://www.simplilearn.com/what-skills-do-i-need-to-become-a-data-scientist-article" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>Milestones of Deep Learning (2020-07-26)
<p><i>This article was written by <a href="https://towardsdatascience.com/@Thimira?source=post_page-----1aaa9aef5b18----------------------" target="_blank" rel="noopener">Thimira Amaratunga</a></i><span><i>.</i></span></p>
<p></p>
<p><span>Deep Learning has been around for about a decade now. Since its inception, Deep Learning has taken the world by storm due to its success (See my article “What is Deep Learning?” on how Deep Learning evolved through Artificial Intelligence, and Machine Learning). Here are some of the more significant achievements of Deep Learning throughout the years.</span></p>
<p></p>
<p><span><i><a href="https://miro.medium.com/max/400/0*Bzf0fWk60FBsxCtf.PNG" target="_blank" rel="noopener"><img src="https://miro.medium.com/max/400/0*Bzf0fWk60FBsxCtf.PNG?profile=RESIZE_710x" class="align-full"/></a></i></span></p>
<p></p>
<p><span>Table of contents:</span></p>
<ul>
<li><span>AlexNet — 2012</span></li>
<li><span>ZF Net — 2013</span></li>
<li><span>VGG Net — 2014</span></li>
<li><span>GoogLeNet — 2014/2015</span></li>
<li><span>Microsoft ResNet — 2015</span></li>
<li><span>Is Deep Learning just CNNs?</span></li>
</ul>
<p></p>
<p><i>To read the whole article, with each achievement of Deep Learning detailed and illustrated, click <a href="https://towardsdatascience.com/milestones-of-deep-learning-1aaa9aef5b18" target="_blank" rel="noopener">here</a>. For other deep learning articles, <a href="https://www.datasciencecentral.com/page/search?q=deep+learning" target="_blank" rel="noopener">follow this link</a>. </i></p>
<p></p>The Data Science Imposter Syndrome (2020-07-10)
<p><i>This article was written by <a href="https://www.linkedin.com/in/marcuos/" target="_blank" rel="noopener">Marcus Oliveira da Silva.</a></i></p>
<p></p>
<p><span><i><a href="https://e2eml.school/images/ewok.jpg" target="_blank" rel="noopener"><img src="https://e2eml.school/images/ewok.jpg?profile=RESIZE_710x" class="align-full"/></a></i></span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>I am not a real data scientist.</b></span></p>
<p><span>I have never used a deep learning framework, like TensorFlow or Keras.</span></p>
<p><span>I have never touched a GPU.</span></p>
<p><span>I don’t have a degree in computer science or statistics. My degree is in mechanical engineering, of all things.</span></p>
<p><span>I don't know R.</span></p>
<p><span>But I haven’t given up hope. After reading a bunch of job postings, I figured out that all it will take to become a real data scientist is five PhD's and 87 years of job experience.</span></p>
<p><span>If this sounds familiar, know that you are not alone. You are not the only one who wonders how much longer they can get away with pretending to be a data scientist. You are not the only one who has nightmares about being laughed out of your next interview.</span></p>
<p><span>Imposter syndrome is feeling like everyone else in your field is more qualified than you are, that you will never get hired or, if you already have been, that you are a mistake of the hiring process. </span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>What a real data scientist looks like</b></span></p>
<p></p>
<p><span>A good generalist</span></p>
<ul>
<li><span>is superficially familiar with every part of data science,</span></li>
<li><span>recognizes all the jargon and technical terms,</span></li>
<li><span>has a good notion of what tools and expertise are needed to solve a given problem, and</span></li>
<li><span>asks insightful questions in technical reviews.</span></li>
</ul>
<p><span>A good specialist</span></p>
<ul>
<li><span>understands one area deeply,</span></li>
<li><span>can explain their area of expertise to non-experts,</span></li>
<li><span>understands the tradeoffs between different approaches,</span></li>
<li><span>is up to date on current research and new tools, and</span></li>
<li><span>can use their tools quickly to produce high-quality results.</span></li>
</ul>
<p><i>To read the rest of the article, click <a href="https://e2eml.school/imposter_syndrome.html" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>My First Neural Network (2020-07-10)
<p><i>This article was written by <a href="https://www.linkedin.com/in/bharat-girdhar-b314b119/?lipi=urn%3Ali%3Apage%3Ad_flagship3_pulse_read%3BeMFGd%2BvUSe6BxS410pvziQ%3D%3D&licu=urn%3Ali%3Acontrol%3Ad_flagship3_pulse_read-read_profile" target="_blank" rel="noopener">Bharat Girdhar</a></i><span><i>.</i></span></p>
<p></p>
<p><span>I was always intrigued by the idea of computers taking decisions on behalf of humans. Though the concept of machine learning has been around for decades, it stayed mostly with researchers and practitioners.</span></p>
<p><span>The ever-evolving IT industry is changing rapidly (at least that’s what I have been reading for the last 6 months), and with automation (especially robotics) spreading its wings, upskilling is a MUST for everyone.</span></p>
<p><span>Automation will soon conquer all segments, and that will increase the need for access to highly skilled talent. With this in mind I started my voyage with machine learning. Searching on Google will give you thousands of links, but what I found interesting was Coursera and its ‘Machine Learning’ course.</span></p>
<p></p>
<p><span><a href="https://media-exp1.licdn.com/dms/image/C5112AQGvWLi4-7nS_A/article-inline_image-shrink_1000_1488/0?e=1599696000&v=beta&t=wMPWb3FXZ5uioGpMhNSBTV32lQwSEf-yF10VkVEz9Tw" target="_blank" rel="noopener"><img src="https://media-exp1.licdn.com/dms/image/C5112AQGvWLi4-7nS_A/article-inline_image-shrink_1000_1488/0?e=1599696000&v=beta&t=wMPWb3FXZ5uioGpMhNSBTV32lQwSEf-yF10VkVEz9Tw&profile=RESIZE_710x" class="align-full"/></a></span></p>
<p></p>
<p><span>There is a lot of good information about ML (machine learning) in the course, but one topic that especially caught my attention was neural networks. The course content compared the workings of a neural network with the human brain, and this popped a lot of questions into my head:</span></p>
<p><span>Is this true? How can a neural network learn and work like our brains? How can computers work like our brains? Don’t they work on commands coded by us?</span></p>
<p><span>My quest led me to read about our brains and neural networks.</span></p>
<p><span>As I understand it, when we are born our brains are like blank slates, and they evolve based on experience and learning. Imagine seeing the digit 1 for the first time: do we know it is 1 and what it means? No, we don't. Over time we are exposed to these numbers again and again, with our teachers and parents helping us understand how to recognize each digit, and whenever we make a mistake they correct us (this part, being corrected every time we make a mistake, is very important).</span></p>
<p><span>The reason I am able to differentiate between numbers, say 1 and 2, is that I have been taught over and over that if there is a straight vertical line without any extension, it is 1. A straight vertical line means a dark or light colored line (as compared to the background) going from top to bottom.</span></p>
<p></p>
<p><i>To read the rest of the article, with a detailed example, click <a href="https://www.linkedin.com/pulse/my-first-neural-network-bharat-girdhar/" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>Adding Uncertainty to Deep Learning (2020-06-30)
<p><i>This article was written by <a href="https://towardsdatascience.com/@plusepsilon?source=post_page-----ecc2401f2013----------------------" target="_blank" rel="noopener">Motoki Wu</a></i><span><i>. Full title: How to construct prediction intervals for deep learning models using Edward and TensorFlow.</i></span></p>
<p><span><i><a href="https://storage.ning.com/topology/rest/1.0/file/get/7134766685?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/7134766685?profile=RESIZE_710x" class="align-center"/></a></i></span></p>
<p style="text-align: center;"><em>Edward and Tensorflow. Source: <a href="https://people.duke.edu/~ccc14/sta-663-2017/21_TensorFlow_Edward.html" target="_blank" rel="noopener">here</a> (Duke University)</em></p>
<p>The difference between statistical modeling and machine learning gets blurry by the day. They both learn from data and predict an outcome. The main distinction seems to come from the existence of uncertainty estimates. Uncertainty estimates allow hypothesis testing, though usually at the expense of scalability.</p>
<p></p>
<p> Machine Learning = Statistical Modeling - Uncertainty + Data</p>
<p></p>
<p><span>Ideally, we mesh the best of both worlds by adding uncertainty to machine learning. Recent developments in variational inference (VI) and deep learning (DL) make this possible (also called Bayesian deep learning). What’s nice about VI is that it scales well with data size and fits nicely with DL frameworks that allow model composition and stochastic optimization.</span></p>
<p><span>An added benefit to adding uncertainty to models is that it promotes model-based machine learning. In machine learning, the results of the predictions are what you base your model on. If the results are not up to par, the strategy is to “throw data at the problem”, or “throw models at the problem”, until satisfactory. In model-based (or Bayesian) machine learning, you are forced to specify the probability distributions for the data and parameters. The idea is to explicitly specify the model first, and then check on the results (a distribution which is richer than a point estimate).</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Bayesian Linear Regression</b></span></p>
<p><span>Here is an example adding uncertainty to a simple linear regression model. A simple linear regression predicts labels Y given data X with weights w.</span></p>
<p></p>
<p><span> Y = w * X</span></p>
<p></p>
<p><span>The goal is to find a value for unknown parameter w by minimizing a loss function.</span></p>
<p></p>
<p><span> (Y - w * X)²</span></p>
<p></p>
<p><span>Let’s flip this into a probability. If you assume that Y is a Gaussian distribution, the above is equivalent to maximizing the following data likelihood with respect to w:</span></p>
<p></p>
<p><span> p(Y | X, w)</span></p>
<p></p>
<p><span>So far this is traditional machine learning. To add uncertainty to your weight estimates and turn it into a Bayesian problem, it’s as simple as attaching a prior distribution to the original model.</span></p>
<p></p>
<p><span> p(Y | X, w) * p(w)</span></p>
<p></p>
<p><span>Notice this is equivalent to inverting the probability of the original machine learning problem via Bayes Rule:</span></p>
<p></p>
<p><span> p(w | X, Y) = p(Y | X, w) * p(w) / CONSTANT</span></p>
<p></p>
<p><span>The probability of the weights (w) given the data is what we need for uncertainty intervals. This is the posterior distribution of weight w.</span></p>
<p><span>Although adding a prior is simple conceptually, the computation is often intractable; namely, the CONSTANT is a big, bad integral.</span></p>
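<p><i>For the one-weight model above, with a Gaussian prior on w and Gaussian noise, the posterior happens to be available in closed form, which makes the "uncertainty interval" concrete. A minimal numpy sketch (the prior scale, noise level, and synthetic data are illustrative assumptions, not from the article):</i></p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for Y = w * X + noise, with true w = 2
X = rng.normal(size=100)
Y = 2.0 * X + rng.normal(scale=0.5, size=100)

# Prior w ~ N(0, tau2); likelihood Y | X, w ~ N(w * X, sigma2)
tau2, sigma2 = 10.0, 0.25

# Conjugate posterior p(w | X, Y) is Gaussian:
post_var = 1.0 / (X @ X / sigma2 + 1.0 / tau2)
post_mean = post_var * (X @ Y) / sigma2

# 95% credible interval: the uncertainty estimate the article is after
lo, hi = post_mean - 1.96 * post_var**0.5, post_mean + 1.96 * post_var**0.5
```

<p><i>For deep networks no such closed form exists, which is why the article turns to sampling and variational inference next.</i></p>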
<p></p>
<p><span style="font-size: 14pt;"><b>Monte Carlo Integration</b></span></p>
<p><span>An approximation of the integral of a probability distribution is usually done by sampling. Sampling the distribution and averaging will get an approximation of the expected value (also called Monte Carlo integration). So let’s reformulate the integral problem into an expectation problem.</span></p>
<p><span>The CONSTANT above integrates out the weights from the joint distribution between the data and weights.</span></p>
<p></p>
<p><span> CONSTANT = ∫ p(x, w) dw</span></p>
<p></p>
<p><span>To reformulate it into an expectation, introduce another distribution, q, and take the expectation according to q.</span></p>
<p></p>
<p><span> ∫ p(x, w) q(w) / q(w) dw = E[ p(x, w) / q(w) ]</span></p>
<p></p>
<p><span>We choose a q distribution so that it’s easy to sample from. Sample a bunch of w from q and take the sample mean to get the expectation.</span></p>
<p></p>
<p><span> E[ p(x, w) / q(w) ] ≈ sample mean[ p(x, w) / q(w) ]</span></p>
<p></p>
<p><span>This idea we’ll use later for variational inference.</span></p>
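<p><i>This identity is easy to check numerically. A toy numpy sketch: estimate a known normalizing constant, ∫ exp(-w²/2) dw = √(2π), by sampling from a wider Gaussian proposal q (the choice of p and q here is purely illustrative):</i></p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Unnormalized density p(w) = exp(-w^2 / 2); its integral over w
# (the CONSTANT) is known analytically: sqrt(2 * pi) ≈ 2.5066
def p(w):
    return np.exp(-0.5 * w**2)

# Proposal q: a wider Gaussian that is easy to sample from
q_scale = 2.0
w = rng.normal(scale=q_scale, size=200_000)
q_density = np.exp(-0.5 * (w / q_scale) ** 2) / (q_scale * np.sqrt(2 * np.pi))

# CONSTANT = E_q[ p(w) / q(w) ] ≈ sample mean of the ratio
estimate = np.mean(p(w) / q_density)
```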
<p></p>
<p><span style="font-size: 14pt;"><b>Variational Inference</b></span></p>
<p><span>The idea of variational inference is that you can introduce a variational distribution, q, with variational parameters, v, and turn it into an optimization problem. The distribution q will approximate the posterior.</span></p>
<p></p>
<p><span> q(w | v) ≈ p(w | X, Y)</span></p>
<p></p>
<p><span>These two distributions need to be close, so a natural approach is to minimize the difference between them. It’s common to use the Kullback-Leibler divergence (KL divergence) as the difference (or variational) function.</span></p>
<p></p>
<p><span> KL[q || p] = E[ log (q / p) ]</span></p>
<p></p>
<p><span>The KL divergence can be decomposed to the data distribution and the evidence lower bound (ELBO).</span></p>
<p></p>
<p><span> KL[q || p] = CONSTANT - ELBO</span></p>
<p></p>
<p><span>The CONSTANT can be ignored since it does not depend on q. Intuitively, the denominators of q and p cancel out and you’re left with the ELBO. Now we only need to optimize over the ELBO.</span></p>
<p><span>The ELBO is just the original model with the variational distribution.</span></p>
<p></p>
<p><span> ELBO = E[ log p(Y | X, w)*p(w) - log q(w | v) ]</span></p>
<p></p>
<p><span>To obtain the expectation over q, Monte Carlo integration is used (sample and take the mean).</span></p>
<p><span>In deep learning, it’s common to use stochastic optimization to estimate the weights. For each minibatch, we take the average of the loss function to obtain the stochastic estimate of the gradient. Similarly, any DL framework that has automatic differentiation can estimate the ELBO as the loss function. The only difference is you sample from q and the average will be a good estimate of the expectation and then the gradient.</span></p>
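<p><i>A minimal numpy sketch of that ELBO estimate for a one-weight regression, with a Gaussian q(w | v) parameterized by v = (μ, σ). The data, prior, and sample sizes are illustrative assumptions, not the article's Edward/TensorFlow code:</i></p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for a one-weight model Y = w * X + noise
X = rng.normal(size=50)
Y = 2.0 * X + rng.normal(scale=0.5, size=50)

def log_joint(w):
    # log p(Y | X, w) + log p(w), up to additive constants:
    # Gaussian likelihood (sigma = 0.5) and a N(0, 10) prior on w
    ll = -0.5 * np.sum(((Y[None, :] - w[:, None] * X[None, :]) / 0.5) ** 2, axis=1)
    return ll - 0.5 * w**2 / 10.0

def elbo(mu, sigma, n_samples=1000):
    # Sample w ~ q(w | v) = N(mu, sigma^2) and average: exactly the
    # "sample and take the mean" Monte Carlo estimate described above
    w = mu + sigma * rng.normal(size=n_samples)
    log_q = -0.5 * ((w - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))
    return np.mean(log_joint(w) - log_q)

# A q centered near the true posterior mean (about 2) scores a higher
# ELBO than one centered far away; an optimizer exploits exactly this.
```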
<p></p>
<p><i>To read the rest of the article with source code and computation of prediction intervals, click <a href="https://towardsdatascience.com/adding-uncertainty-to-deep-learning-ecc2401f2013" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>Deep Learning for Object Detection: A Comprehensive Review (2020-06-15)
<p><i>This article was written by <a href="https://towardsdatascience.com/@joycex99?source=post_page-----73930816d8d9----------------------" target="_blank" rel="noopener">Joyce Xu</a></i><span><i>.</i></span></p>
<p></p>
<p><span>With the rise of autonomous vehicles, smart video surveillance, facial detection and various people counting applications, fast and accurate object detection systems are rising in demand. These systems involve not only recognizing and classifying every object in an image, but localizing each one by drawing the appropriate bounding box around it. This makes object detection a significantly harder task than its traditional computer vision predecessor, image classification.</span></p>
<p></p>
<p><span><a href="https://miro.medium.com/max/2000/1*ftTEVgsx0jfvUSFB6X5mQg.jpeg" target="_blank" rel="noopener"><img src="https://miro.medium.com/max/2000/1*ftTEVgsx0jfvUSFB6X5mQg.jpeg?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p></p>
<p><span>Fortunately, however, the most successful approaches to object detection are currently extensions of image classification models. A few months ago, Google released a new object detection API for Tensorflow. With this release came the pre-built architectures and weights for a few specific models:</span></p>
<ul>
<li><span>Single Shot Multibox Detector (SSD) with MobileNets</span></li>
<li><span>SSD with Inception V2</span></li>
<li><span>Region-Based Fully Convolutional Networks (R-FCN) with Resnet 101</span></li>
<li><span>Faster RCNN with Resnet 101</span></li>
<li><span>Faster RCNN with Inception Resnet v2</span></li>
</ul>
<p><span>In my last blog post, I covered the intuition behind the three base network architectures listed above: MobileNets, Inception, and ResNet. This time around, I want to do the same for Tensorflow’s object detection models: Faster R-CNN, R-FCN, and SSD. By the end of this post, we will hopefully have gained an understanding of how deep learning is applied to object detection, and how these object detection models both inspire and diverge from one another.</span></p>
<p id="ac21" class="kb kc fr bj kd b ke ls kg kh lt kj kk lu km kn lv kp kq lw ks kt gp">Faster R-CNN, R-FCN, and SSD are three of the best and most widely used object detection models out there right now. Other popular models tend to be fairly similar to these three, all relying on deep CNN’s (read: ResNet, Inception, etc.) to do the initial heavy lifting and largely following the same proposal/classification pipeline.</p>
<p id="8dec" class="kb kc fr bj kd b ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt gp">At this point, putting these models to use just requires knowing Tensorflow’s API.<span> </span></p>
<p><i>To read the whole article, with each point detailed and illustrations, click <a href="https://towardsdatascience.com/deep-learning-for-object-detection-a-comprehensive-review-73930816d8d9" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>Logistic Regression with Math (2020-06-15)
<p><i>This article was written by <a href="https://medium.com/@madhusanjeevi.ai?source=post_page-----e9cbb3ec6077----------------------" target="_blank" rel="noopener">Madhu Sanjeevi (Mady)</a></i><span><i>.</i></span></p>
<p></p>
<p><span>In the previous story we talked about linear regression for solving regression problems in machine learning. In this story we will talk about logistic regression for classification problems.</span></p>
<p><span>You may be wondering why the name says regression if it is a classification algorithm. Well, it uses regression internally to build the classification algorithm.</span></p>
<p><span>Classification separates the data of one class from another. This story talks about binary classification (0 or 1): the target variable is either 0 or 1. The goal is to find the straight green line that best separates the data. So we use regression for drawing the line; makes sense, right?</span></p>
<p></p>
<p><span><a href="https://miro.medium.com/max/1390/1*639EfjXxfJqL9wTwIy7aNg.png" target="_blank" rel="noopener"><img src="https://miro.medium.com/max/1390/1*639EfjXxfJqL9wTwIy7aNg.png?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p><span>To make a decision (yes/no), we only accept values between 0 and 1. There is an awesome function, called the sigmoid or logistic function, that we use to get values between 0 and 1: it squashes any value into that range.</span></p>
<p><span>So far we know that we first apply the linear equation and then apply the sigmoid function to its result, so we get a value between 0 and 1. The hypothesis for linear regression is h(X) = θ0 + θ1*X.</span></p>
<p><strong>How does it work?</strong></p>
<ol>
<li><span>First we calculate the Logit function: </span><span>logit = θ0+θ1*X (hypothesis of linear regression)</span></li>
<li><span>We apply the above Sigmoid function (Logistic function) to logit.</span></li>
<li><span>We calculate the error with the cost function (maximum log-likelihood).</span></li>
<li><span>Next step is to apply Gradient descent to change the θ values in our hypothesis.</span></li>
</ol>
<p><span>With that, the logistic regression is ready; we can now predict new data with the model we just built.</span></p>
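<p><i>The four steps above can be sketched in a few lines of numpy (the synthetic data, learning rate, and iteration count are illustrative choices, not from the article):</i></p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D data: class 1 tends to have larger X
X = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])
y = np.concatenate([np.zeros(100), np.ones(100)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta0, theta1, lr = 0.0, 0.0, 0.1
for _ in range(500):
    logit = theta0 + theta1 * X                    # step 1: logit
    h = sigmoid(logit)                             # step 2: squash into (0, 1)
    # step 3: cost = negative mean log-likelihood (cross-entropy)
    cost = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    # step 4: gradient descent on theta0, theta1
    theta0 -= lr * np.mean(h - y)
    theta1 -= lr * np.mean((h - y) * X)

def predict(x):
    # Probability above 0.5 means class 1
    return int(sigmoid(theta0 + theta1 * x) > 0.5)
```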
<p><i>To read the whole article, with examples and illustrations, click <a href="https://medium.com/deep-math-machine-learning-ai/chapter-2-0-logistic-regression-with-math-e9cbb3ec6077" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>Tutorial: Counting Road Traffic Capacity with OpenCV (2020-06-09)
<p><i>This article was written by <a href="https://medium.com/@a.nikishaev?source=post_page-----998580f1fbde----------------------" target="_blank" rel="noopener">Andrey Nikishaev</a></i><span><i>.</i></span></p>
<p></p>
<p><span>Today I will show you a very simple but powerful example of how to count traffic capacity, with an algorithm that you can run on devices.</span></p>
<p></p>
<p><span>So this algorithm works in 4 steps:</span></p>
<p><span>1. Get the frame edges.</span></p>
<p><span>2. Blur them to get a more filled area.</span></p>
<p><span>3. Apply a binary threshold to the blurred image.</span></p>
<p><span>4. Overlap the thresholded image with the ROI (the mask where you count) and compute the ratio of black pixels to white pixels, which gives you the traffic capacity.</span></p>
<p></p>
<p><span><a href="https://miro.medium.com/max/1400/1*_Uoe4paVkOgiy1oytxcotQ.png" target="_blank" rel="noopener"><img src="https://miro.medium.com/max/1400/1*_Uoe4paVkOgiy1oytxcotQ.png?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p></p>
<p><span><b>Edges</b></span></p>
<p><span>Here we use CLAHE equalization to remove noise from the image that can occur on cheap or old cameras at night. It is not the best option, but it gives a better result.</span></p>
<p><span>Then we use the Canny edge detector to get edges from the image. We invert the result to get a white background (just for visual convenience).</span></p>
<p></p>
<p><span><b>Blur</b></span></p>
<p><span>We use a basic blur with bilateral filtering, which removes some color noise and gives better segmentation.</span></p>
<p></p>
<p><span><b>Threshold</b></span></p>
<p><span>The last filter is a binary threshold, which we use to get only white and black pixels; this gives us our car / not-car segmentation.</span></p>
<p></p>
<p><span><b>Counting</b></span></p>
<p><span>And the last, simple step just divides the number of black pixels by the number of white pixels to get the traffic capacity.</span></p>
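<p><i>The counting step can be sketched with numpy alone, given a thresholded binary image and an ROI mask. The synthetic frame below stands in for the output of the edge/blur/threshold steps and is illustrative, not from the article:</i></p>

```python
import numpy as np

def traffic_capacity(binary_img, roi_mask):
    """Ratio of black ('car') pixels to white pixels inside the ROI.

    binary_img: uint8 array of 0 (car, after the inverted Canny/blur/
                threshold steps) and 255 (background).
    roi_mask:   boolean array, True inside the counting region.
    """
    region = binary_img[roi_mask]
    black = np.count_nonzero(region == 0)
    white = region.size - black
    return black / white

# Synthetic 100x100 frame: a 20x20 black "car" blob inside a 60x60 ROI
img = np.full((100, 100), 255, dtype=np.uint8)
img[40:60, 40:60] = 0
roi = np.zeros((100, 100), dtype=bool)
roi[20:80, 20:80] = True

capacity = traffic_capacity(img, roi)  # 400 black / 3200 white = 0.125
```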
<p></p>
<p><span><i>To read the rest of the article, with illustrations,</i></span> <span><i>click <a href="https://medium.com/machine-learning-world/tutorial-counting-road-traffic-capacity-with-opencv-998580f1fbde" target="_blank" rel="noopener">here</a>.</i></span></p>
<p></p>How to remove duplicates in large datasets (2020-05-05)
<p><i>This article was written by <a href="https://clevertap.com/blog/category/data-science/" target="_blank" rel="noopener">Suresh Kondamudi</a></i><span><i>.</i></span></p>
<p></p>
<p><span>Dealing with large datasets is often daunting. With limited computing resources, particularly memory, it can be challenging to perform even basic tasks like counting distinct elements, membership check, filtering duplicate elements, finding minimum, maximum, top-n elements, or set operations like union, intersection, similarity and so on.</span></p>
<p></p>
<p><span><b><a href="https://d35fo82fjcw0y8.cloudfront.net/2016/03/03210600/bloom-filter.jpg" target="_blank" rel="noopener"><img src="https://d35fo82fjcw0y8.cloudfront.net/2016/03/03210600/bloom-filter.jpg?profile=RESIZE_710x" class="align-full"/></a></b></span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Probabilistic data structures to the rescue</b></span></p>
<p><span>Probabilistic data structures can come in pretty handy in these cases, in that they dramatically reduce memory requirements, while still providing acceptable accuracy. Moreover, you get time efficiencies, as lookups (and adds) rely on multiple independent hash functions, which can be parallelized. We use structures like Bloom filters, MinHash, Count-min sketch, HyperLogLog extensively to solve a variety of problems. One fairly straightforward example is presented below.</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>The problem</b></span></p>
<p><span>We manage mobile push notifications for our customers, and one of the things we need to guard against is sending multiple notifications to the same user for the same campaign. Push notifications are routed to individual devices/users based on push notification tokens generated by the mobile platforms. Because of their size (anywhere from 32 bytes to 4 KB), it’s non-performant for us to index push tokens or use them as the primary user key.</span></p>
<p><span>On certain mobile platforms, when a user uninstalls and subsequently re-installs the same app, we lose our primary user key and create a new user profile for that device. Typically, in that case, the mobile platform will generate a new push notification token for that user on the reinstall. However, that is not always guaranteed. So, in a small number of cases we can end up with multiple user records in our system having the same push notification token.</span></p>
<p><span>As a result, to prevent sending multiple notifications to the same user for the same campaign, we need to filter out a relatively small number of duplicate push tokens from a total dataset that runs from hundreds of millions to billions of records. To give you a sense of proportion, the memory required to filter just 100 million push tokens is 100M × 256 bytes ≈ 25 GB!</span></p>
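<p><span>The article's full solution is behind the link below, but the core idea is easy to sketch. Here is a minimal Bloom filter in Python, an illustrative toy under stated assumptions (the class, sizes, and token strings are all invented here, not CleverTap's production code): a small bit array plus a few hash probes answers "have I seen this token?" with no false negatives and a tunable false-positive rate.</span></p>

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k hash probes into one bit array.

    False positives are possible; false negatives are not, which is the
    right trade-off when filtering duplicate push tokens."""

    def __init__(self, num_bits=8_000_000, num_hashes=7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive k pseudo-independent bit positions from salted MD5 digests.
        for salt in range(self.num_hashes):
            digest = hashlib.md5(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

seen = BloomFilter()
seen.add("push-token-abc")
print("push-token-abc" in seen)  # True: an added token is always found
```

<p><span>At roughly 10 bits per element with about 7 hash functions, a Bloom filter operates near a 1% false-positive rate, so screening 100M tokens needs on the order of 125 MB of bits rather than the 25 GB of raw tokens estimated above.</span></p>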
<p></p>
<p><i>To read the rest of the article, with the solution, click <a href="https://clevertap.com/blog/how-to-remove-duplicates-in-large-datasets/" target="_blank" rel="noopener">here</a>. For another approach to this problem, see <a href="https://www.datasciencecentral.com/forum/topics/40-year-old-trick-to-clean-data-efficiently" target="_blank" rel="noopener">here</a>. </i></p>
<p></p>A Data Scientist’s Perspective on Microsoft Rtag:www.datasciencecentral.com,2020-05-05:6448529:BlogPost:9497982020-05-05T19:30:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://www.linkedin.com/in/lixunzhang/" target="_blank" rel="noopener">Lixun Zhang</a></i><span><i>.</i></span></p>
<p></p>
<p><span>As a data scientist, I have experience with</span> <span>R</span><span>. Naturally, when I was first exposed to Microsoft R Open (MRO, formerly Revolution R Open) and Microsoft R Server (MRS, formerly Revolution R Enterprise), I wanted to know the answers to 3 questions:</span></p>
<ul>
<li><span>What do R, MRO, and MRS have in common?</span></li>
<li><span>What’s new in MRO and MRS compared with R?</span></li>
<li><span>Why should I use MRO or MRS instead of R?</span></li>
</ul>
<p><span>The publicly available information on MRS either describes it at a high level or explains specific functions and the underlying algorithms. The materials that do compare R, MRO, and MRS tend to stay high level, without much detail at the level of functions and packages, which is what data scientists are most familiar with. And they don’t answer the above questions in a comprehensive way. So I designed my own tests (the code behind them is available on GitHub). Below are my answers to the three questions. MRO offers an optional MKL math library; unless noted otherwise, the observations hold whether or not MKL is installed on MRO.</span></p>
<p></p>
<p><span><a href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01b8d1d272f8970c-pi" target="_blank" rel="noopener"><img src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01b8d1d272f8970c-pi?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>What do R, MRO, and MRS have in common?</b></span></p>
<p><span>After installing R, MRO, and MRS, you'll notice that everything you can do in R can be done in MRO or MRS. For example, you can use glm() to fit a logistic regression and kmeans() to carry out cluster analysis. As another example, you can install packages from CRAN. In fact, a package installed in R can be used in MRO or MRS and vice versa if the package is installed in a library tree that's shared among them. You can use the command .libPaths() to set and get library trees for R, MRO and MRS. Finally, you can use your favorite IDEs such as RStudio and Visual Studio with RTVS for R, MRO or MRS. In other words, MRO and MRS are 100% compatible with R in terms of functions, packages, and IDEs.</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>What’s new in MRO and MRS compared with R?</b></span></p>
<p><span>While everything you do in R can be done in MRO and MRS, the reverse is not true, due to the additional components in MRO and MRS. MRO allows users to install an optional math library, MKL, for multithreaded performance. This library shows up as a package named "RevoUtilsMath" in MRO.</span></p>
<p><span>MRS comes with more packages and functions than R. From the package perspective, most of the additional ones are not on CRAN and are available only after installing MRS. One such example is the RevoScaleR package. MRS also installs the MKL library by default. As for functions, MRS has High Performance Analysis (HPA) versions of many base R functions, which are included in the RevoScaleR package. For example, the HPA version of glm() is rxGlm() and for kmeans() it is rxKmeans(). These HPA functions can be used in the same way as their base R counterparts, with additional options. In addition, these functions can work with a special data format (XDF) that's customized for MRS.</span></p>
<p></p>
<p><i>To read the rest of the article, click <a href="https://blog.revolutionanalytics.com/2016/04/data-scientist-perspective.html?utm_content=buffer57b80" target="_blank" rel="noopener">here</a>.</i></p>MBA guide: 8 resources to go from the spreadsheet to the command linetag:www.datasciencecentral.com,2020-04-25:6448529:BlogPost:9476342020-04-25T10:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://medium.com/@dmca?source=post_page-----cbb59ea82144----------------------" target="_blank" rel="noopener">Daniel McAuley</a></i><span><i>.</i></span></p>
<p></p>
<p><span>I recently had the pleasure of speaking on a few panels about analytics to my fellow MBA students and alumni, as well as many Penn undergrads. After these talks, I’ve been asked for my advice on what the best resources are for someone coming from the business world (i.e., non-technical) who wants to develop the skills to become an effective data scientist. This post is an attempt to codify the advice I give and general resources I point people towards. Hopefully, this will make what I have learned accessible to more people and provide some guidance for those who realize that the future belongs to the empirically inclined (see below) but don’t know where to start their journey to becoming part of the club.</span></p>
<p><span>However, I would caution the reader that what I propose here is only a starting point on a journey towards really understanding the power of good data science. And, as Sean Taylor once told me, learn only what you need to accomplish your goal; if there are things on this list that you know you don’t need then skip them, you won’t hurt my feelings. At its core, data science is really about curiosity, optimism, and continual learning, all of which are ongoing habits rather than boxes to be checked. Therefore, I expect this list to evolve as the tools themselves change and as I continue to discover more about data science itself.</span></p>
<p></p>
<p><span><a href="https://miro.medium.com/max/1238/1*CfsIkA5XuTdTuSphJn1J6g.png" target="_blank" rel="noopener"><img src="https://miro.medium.com/max/1238/1*CfsIkA5XuTdTuSphJn1J6g.png?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>1. Linear Algebra</b></span></p>
<p><span>Linear algebra is a topic that underlies a lot of the statistical techniques and machine learning algorithms that you will employ as a data scientist. I like to recommend a MOOC I took through Coursera years ago, Coding the Matrix: Linear Algebra through Computer Science Applications. As the name implies, the course teaches linear algebra in the context of computer science (specifically using Python, which lends itself well to data science). There is also an optional companion textbook that makes a great reference manual.</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>2. R</b></span></p>
<p><span>Given that we use R at Wealthfront, I have a few resources that I think are important here. The first, R for Data Science, written by Garrett Grolemund and Hadley Wickham, will be published in physical form in July 2016 but is available for free online now. Rather than explain what the book is about in my own words:</span> <span>if you only read one data science book, it should be this.</span></p>
<p><span>Next up, our friend Hadley has also written Advanced R, which covers functional programming, metaprogramming, and performant code as well as the quirks of R.</span></p>
<p><span>Hadley is also responsible for some of the packages I use every day that make 90% of common data science tasks quicker and less verbose. I recommend checking out the following libraries; they will change the way you write code in R:</span></p>
<ul>
<li><span>ggplot2 — An implementation of the Grammar of Graphics in R</span></li>
<li><span>devtools — Tools to make an R developer’s life easier</span></li>
<li><span>dplyr — Plyr specialized for data frames: faster & with remote data stores</span></li>
<li><span>purrr — Make your pure R function purrr with functional programming</span></li>
<li><span>tidyr — Easily tidy data with spread and gather functions</span></li>
<li><span>lubridate — Make working with dates in R just that little bit easier</span></li>
<li><span>testthat — An R package to make testing fun</span></li>
</ul>
<p><span>For extra credit, check out yet another of Hadley’s books: R Packages. This is a great follow-up resource for those of you that want to write reproducible, well-documented R code that other people can easily use (other people includes your future self!)</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>3. SQL</b></span></p>
<p><span>This is probably the easiest section of the guide as you can teach yourself most of SQL in a few hours. Code School has both introductory and intermediate courses that you can get through in an afternoon.</span></p>
<p><span>The Sequel to SQL covers everything from aggregate functions and joins to normalization and subqueries. And while mastering these skills takes practice, you can still get an idea of what SQL can and cannot do without too much work.</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>4. Bayesian Reasoning</b></span></p>
<p><span>Without wading into the age-old Frequentist vs. Bayesian debate (or non-debate), I think that a solid foundation in Bayesian reasoning and statistics is a crucial part of any data scientist’s repertoire. For example, Bayesian reasoning underpins much of modern A/B testing and Bayesian methods are applied in many other areas of data science (and are generally covered less in introductory statistics courses).</span></p>
<p><span>John K. Kruschke has a great ability to break down complex material and convey it in a way that is intuitive and practical. Along with R for Data Science, this book is probably one of the best all-around resources for learning how to do data science in the R programming language.</span></p>
<p><span>Additionally, Kruschke’s blog makes a great companion resource to the textbook if you’re looking for more examples of problems to solve or answers to questions you still have after reading the book. And if a textbook isn’t exactly what you’re looking for, then Rasmus Bååth’s research blog, Publishable Stuff, is another great resource for learning about Bayesian approaches to problem-solving.</span></p>
<p></p>
<p><span><i>To read the whole article, with the link for each resource,</i></span> <i>click <a href="https://medium.com/@dmca/the-mba-data-science-toolkit-8-resources-to-go-from-the-spreadsheet-to-the-command-line-cbb59ea82144" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>The most underutilized function in SQLtag:www.datasciencecentral.com,2020-04-25:6448529:BlogPost:9477022020-04-25T09:30:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://blog.getdbt.com/author/tristan/" target="_blank" rel="noopener">Tristan Handy</a>.</i></p>
<p></p>
<p><span>Over the past nine months I’ve worked with over a dozen venture-funded startups to build out their internal analytics. In doing so, there’s a single SQL function that I have come to use surprisingly often. At first it wasn’t at all clear to me why I would want to use this function, but as time goes on I have found ever more uses for it.</span></p>
<p><span>What is it? <strong>md5()</strong>.</span></p>
<p><span>Give <strong>md5()</strong> a varchar and it returns its <a href="https://en.wikipedia.org/wiki/MD5" target="_blank" rel="noopener">MD5 hash</a>. Simple…but seemingly pointless. Why exactly would I want to use that?!</span></p>
<p><span>Great question. In this post I’m going to show you two uses for md5() that make it one of the most powerful tools in my SQL kit.</span></p>
<p></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/4510843426?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/4510843426?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>#1: Building Yourself a Unique ID</b></span></p>
<p><span>I’m going to make a really strong statement here, but it’s one that I really believe in: every single data model in your warehouse should have a rock solid unique ID.</span></p>
<p><span>It’s extremely common for this not to be the case. One reason is that your source data doesn’t have a unique key—if you’re syncing advertising performance data from Facebook Ads via Stitch or Fivetran, the source data in your ad_insights table doesn’t have a unique key you can rely on. Instead, you have a combination of fields that is reliably unique (in this case date and ad_id). Using that knowledge, you can build yourself a unique id using md5().</span></p>
<p><span>The resulting hash is a meaningless string of alphanumeric text that functions as a unique identifier for your record. Of course, you could just as easily just create a single concatenated varchar field that performed the same function, but it’s actually important to obfuscate the underlying logic behind the hash: you will innately treat the field differently if it looks like an id versus if it looks like a jumble of human-readable text.</span></p>
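<p><span>The SQL itself is behind the link at the end, but the construction is just md5 over the concatenated fields. The same idea, sketched in Python so the behavior is easy to check (hashlib stands in for the database's md5(); the column values are invented for illustration):</span></p>

```python
import hashlib

def surrogate_key(*fields):
    """Build an opaque unique id, like SQL md5(field1 || '|' || field2 ...)."""
    raw = "|".join(str(f) for f in fields)
    return hashlib.md5(raw.encode()).hexdigest()

# One ad-performance record, keyed by the reliably unique (date, ad_id) pair.
key = surrogate_key("2020-04-25", 81247)
print(len(key))  # 32: a fixed-width hex string with no readable structure
```

<p><span>The delimiter matters: without it, ("ab", "c") and ("a", "bc") would concatenate to the same string and collide.</span></p>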
<p><span>There are a couple of reasons why creating a unique id is an important practice:</span></p>
<ul>
<li><span>One of the most common causes of error is duplicate values in a key that an analyst was expecting to be unique. Joins on that field will “fan out” a result set in unexpected ways and can cause significant error that is difficult to troubleshoot. To avoid this, only join on fields where you’ve validated the cardinality and constructed a unique key where necessary.</span></li>
<li><span>Some BI tools require you to have a unique key in order to provide certain functionality. For instance, Looker symmetric aggregates require a unique key in order to function.</span></li>
</ul>
<p><span>We create unique keys for every table and then test uniqueness on this key using dbt schema tests. We run these tests multiple times per day on Sinter and get notifications for any failures. This allows us to be completely confident of the analytics we implement on top of these data models.</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>#2: Simplifying Complex Joins</b></span></p>
<p><span>This case is similar to #1 in its execution but it solves a very different puzzle. Imagine the following case. You have the same Facebook Ads dataset as referenced earlier but this time you have a new challenge: join that data to data in your web analytics sessions table so that you can calculate Facebook ROAS.</span></p>
<p><span>In this case, your available join keys are the date and your UTM parameters (utm_medium, source, campaign, etc). Seems easy, right? Just do a join on all 6 fields and call it a day.</span></p>
<p><span>Unfortunately that doesn’t work, for a really simple reason: it’s extremely common for some subset of those fields to be null, and a null doesn’t join to another null. So, that 6-field join is a dead end. You can hack together something incredibly complicated using a bunch of conditional logic, but that code is hideous and performs terribly (I’ve tried it).</span></p>
<p><span>Instead, use md5(). In both datasets, you can take the 6 fields we mentioned and concatenate them together into a single string, and then call md5() on the entire string.</span></p>
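<p><span>Sketched in Python (the field values are invented; replacing None with "" stands in for SQL's coalesce(col, '') applied before concatenation), the null-safe join key looks like this:</span></p>

```python
import hashlib

def join_key(*fields):
    """Null-safe join key: coalesce missing values to "" before hashing,
    so two rows with NULLs in the same positions still produce equal keys."""
    raw = "|".join("" if f is None else str(f) for f in fields)
    return hashlib.md5(raw.encode()).hexdigest()

ads    = ("2020-04-25", "facebook", "cpc", "spring_sale", None, None)
visits = ("2020-04-25", "facebook", "cpc", "spring_sale", None, None)
print(join_key(*ads) == join_key(*visits))  # True: the rows now join
```

<p><span>One consequence of coalescing: a NULL and an empty string hash identically. That is usually exactly what you want for this join, but it is worth knowing.</span></p>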
<p></p>
<p><span><i>To read the whole article, with illustrations and examples,</i></span> <i>click <a href="https://blog.getdbt.com/the-most-underutilized-function-in-sql/" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>Comparing Regression Lines with Hypothesis Teststag:www.datasciencecentral.com,2020-04-18:6448529:BlogPost:9460102020-04-18T16:30:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://statisticsbyjim.com/author/statis11_wp/" target="_blank" rel="noopener">Jim Frost</a></i><span><i>.</i></span></p>
<p></p>
<p><span>How do you compare regression lines statistically? Imagine you are studying the relationship between height and weight and want to determine whether this relationship differs between basketball players and non-basketball players. You can graph the two regression lines to see if they look different. However, you should perform hypothesis tests to determine whether the visible differences are statistically significant. In this blog post, I show you how to determine whether the differences between coefficients and constants in different regression models are statistically significant.</span></p>
<p><span>Suppose we estimate the relationship between X and Y under two different conditions, processes, contexts, or some other qualitative change. We want to determine whether the difference affects the relationship between X and Y. Fortunately, these statistical tests are easy to perform.</span></p>
<p><span>For the regression examples in this post, I use an input variable and an output variable for a fictional process. Our goal is to determine whether the relationship between these two variables changes between two conditions. First, I’ll show you how to determine whether the constants are different. Then, we’ll assess whether the coefficients are different.</span></p>
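<p><span>The mechanics behind both tests can be summarized in one pooled model: add a condition indicator and an indicator × input interaction term, then examine those two terms (the indicator compares the constants, the interaction compares the coefficients). Below is a minimal pure-Python sketch on noise-free invented data, so ordinary least squares recovers the differences exactly; the significance testing itself, which also needs standard errors, is left to statistical software as in the article.</span></p>

```python
def ols(X, y):
    """Least squares via the normal equations X'X b = X'y (Gaussian elimination)."""
    n, p = len(X), len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)] for i in range(p)]
    v = [sum(X[r][i] * y[r] for r in range(n)) for i in range(p)]
    for col in range(p):  # forward elimination with partial pivoting
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        v[col], v[piv] = v[piv], v[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            v[r] -= f * v[col]
    beta = [0.0] * p
    for r in range(p - 1, -1, -1):  # back substitution
        beta[r] = (v[r] - sum(A[r][c] * beta[c] for c in range(r + 1, p))) / A[r][r]
    return beta

# Condition 0: output = 10 + 2*input.  Condition 1: output = 14 + 5*input.
X, y = [], []
for g in (0, 1):  # g is the condition indicator
    for x in range(1, 8):
        X.append([1.0, float(x), float(g), float(g * x)])  # [const, x, g, g*x]
        y.append(10 + 2 * x if g == 0 else 14 + 5 * x)
b0, b1, b2, b3 = ols(X, y)
print(round(b2, 6), round(b3, 6))  # 4.0 3.0: constants differ by 4, slopes by 3
```

<p><span>A near-zero estimate (and a non-significant test) on the indicator would say the constants match; the same on the interaction would say the slopes match.</span></p>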
<p></p>
<p><span><a href="https://i2.wp.com/statisticsbyjim.com/wp-content/uploads/2017/07/scatter_constant_dift.png?w=576&ssl=1" target="_blank" rel="noopener"><img src="https://i2.wp.com/statisticsbyjim.com/wp-content/uploads/2017/07/scatter_constant_dift.png?w=576&ssl=1&profile=RESIZE_710x" class="align-full"/></a></span></p>
<p><strong>Content of this article:</strong></p>
<ul>
<li><span style="font-size: 12pt;">Hypothesis Tests for Comparing Regression Constants</span></li>
<li><span style="font-size: 12pt;">Interpreting the results</span></li>
<li><span style="font-size: 12pt;">Hypothesis Tests for Comparing Regression Coefficients</span></li>
<li><span style="font-size: 12pt;">Interpreting the results</span></li>
</ul>
<p><i>To read the whole article, with illustrations and equations, click <a href="https://statisticsbyjim.com/regression/comparing-regression-lines/" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>Multicollinearity in Regression Analysis: Problems, Detection, and Solutionstag:www.datasciencecentral.com,2020-04-12:6448529:BlogPost:9444932020-04-12T11:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://statisticsbyjim.com/author/statis11_wp/" target="_blank" rel="noopener">Jim Frost</a></i><span><i>.</i></span></p>
<p></p>
<p><span>Multicollinearity occurs when independent variables in a regression model are correlated. This correlation is a problem because independent variables should be independent. If the degree of correlation between variables is high enough, it can cause problems when you fit the model and interpret the results.</span></p>
<p><span>In this blog post, I’ll highlight the problems that multicollinearity can cause, show you how to test your model for it, and highlight some ways to resolve it. In some cases, multicollinearity isn’t necessarily a problem, and I’ll show you how to make this determination. I’ll work through an example dataset which contains multicollinearity to bring it all to life!</span></p>
<p></p>
<p><a href="https://i2.wp.com/statisticsbyjim.com/wp-content/uploads/2017/04/femoral_neck.png?resize=300%2C226&ssl=1" target="_blank" rel="noopener"><img src="https://i2.wp.com/statisticsbyjim.com/wp-content/uploads/2017/04/femoral_neck.png?resize=300%2C226&ssl=1&profile=RESIZE_710x" class="align-full"/></a></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Why is Multicollinearity a Potential Problem?</b></span></p>
<p><span>A key goal of regression analysis is to isolate the relationship between each independent variable and the dependent variable. The interpretation of a regression coefficient is that it represents the mean change in the dependent variable for each 1 unit change in an independent variable when you hold all of the other independent variables constant. That last portion is crucial for our discussion about multicollinearity.</span></p>
<p><span>The idea is that you can change the value of one independent variable and not the others. However, when independent variables are correlated, it indicates that changes in one variable are associated with shifts in another variable. The stronger the correlation, the more difficult it is to change one variable without changing another. It becomes difficult for the model to estimate the relationship between each independent variable and the dependent variable independently because the independent variables tend to change in unison.</span></p>
<p><span>There are two basic kinds of multicollinearity:</span></p>
<ul>
<li><span>Structural multicollinearity: This type occurs when we create a model term using other terms. In other words, it’s a byproduct of the model that we specify rather than being present in the data itself. For example, if you square term X to model curvature, clearly there is a correlation between X and X².</span></li>
<li><span>Data multicollinearity: This type of multicollinearity is present in the data itself rather than being an artifact of our model. Observational experiments are more likely to exhibit this kind of multicollinearity.</span></li>
</ul>
<p></p>
<p><span style="font-size: 14pt;"><b>What Problems Do Multicollinearity Cause?</b></span></p>
<p><span>Multicollinearity causes the following two basic types of problems:</span></p>
<ul>
<li><span>The coefficient estimates can swing wildly based on which other independent variables are in the model. The coefficients become very sensitive to small changes in the model.</span></li>
<li><span>Multicollinearity reduces the precision of the estimated coefficients, which weakens the statistical power of your regression model. You might not be able to trust the p-values to identify independent variables that are statistically significant.</span></li>
</ul>
<p><span>Imagine you fit a regression model and the coefficient values, and even the signs, change dramatically depending on the specific variables that you include in the model. It’s a disconcerting feeling when slightly different models lead to very different conclusions. You don’t feel like you know the actual effect of each variable!</span></p>
<p><span>Now, throw in the fact that you can’t necessarily trust the p-values to select the independent variables to include in the model. This problem makes it difficult both to specify the correct model and to justify the model if many of your p-values are not statistically significant.</span></p>
<p><span>As the severity of the multicollinearity increases, so do these problematic effects. However, these issues affect only those independent variables that are correlated. You can have a model with severe multicollinearity and yet some variables in the model can be completely unaffected.</span></p>
<p><span>The regression example with multicollinearity that I work through later on illustrates these problems in action.</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Do I Have to Fix Multicollinearity?</b></span></p>
<p><span>Multicollinearity makes it hard to interpret your coefficients, and it reduces the power of your model to identify independent variables that are statistically significant. These are definitely serious problems. However, the good news is that you don’t always have to find a way to fix multicollinearity.</span></p>
<p><span>The need to reduce multicollinearity depends on its severity and your primary goal for your regression model. Keep the following three points in mind:</span></p>
<ol>
<li><span>The severity of the problems increases with the degree of the multicollinearity. Therefore, if you have only moderate multicollinearity, you may not need to resolve it.</span></li>
<li><span>Multicollinearity affects only the specific independent variables that are correlated. Therefore, if multicollinearity is not present for the independent variables that you are particularly interested in, you may not need to resolve it. Suppose your model contains the experimental variables of interest and some control variables. If high multicollinearity exists for the control variables but not the experimental variables, then you can interpret the experimental variables without problems.</span></li>
<li><span>Multicollinearity affects the coefficients and p-values, but it does not influence the predictions, precision of the predictions, and the goodness-of-fit statistics. If your primary goal is to make predictions, and you don’t need to understand the role of each independent variable, you don’t need to reduce severe multicollinearity.</span></li>
</ol>
<p><span>Over the years, I’ve found that many people are incredulous over the third point, so here’s a reference!</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Testing for Multicollinearity with Variance Inflation Factors (VIF)</b></span></p>
<p><span>If you can identify which variables are affected by multicollinearity and the strength of the correlation, you’re well on your way to determining whether you need to fix it. Fortunately, there is a very simple test to assess multicollinearity in your regression model. The variance inflation factor (VIF) identifies correlation between independent variables and the strength of that correlation.</span></p>
<p><span>Statistical software calculates a VIF for each independent variable. VIFs start at 1 and have no upper limit. A value of 1 indicates that there is no correlation between this independent variable and any others. VIFs between 1 and 5 suggest that there is a moderate correlation, but it is not severe enough to warrant corrective measures. VIFs greater than 5 represent critical levels of multicollinearity where the coefficients are poorly estimated, and the p-values are questionable.</span></p>
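<p><span>For intuition about those thresholds: the formula is VIF = 1 / (1 − R²), where R² comes from regressing one predictor on the others. With a single other predictor, that R² is just the squared pairwise correlation, so the cutoffs are easy to see. A small illustration with invented correlation values:</span></p>

```python
def vif(r_squared):
    """Variance inflation factor: VIF_j = 1 / (1 - R²_j), where R²_j comes
    from regressing predictor j on all the other predictors."""
    return 1.0 / (1.0 - r_squared)

for corr in (0.0, 0.5, 0.8, 0.9, 0.95):
    # With one other predictor, R² is simply the correlation squared.
    print(f"corr={corr:.2f}  VIF={vif(corr * corr):.2f}")
# A pairwise correlation of 0.90 already pushes the VIF past the 5 threshold.
```
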
<p><span>Use VIFs to identify correlations between variables and determine the strength of the relationships. Most statistical software can display VIFs for you. Assessing VIFs is particularly important for observational studies because these studies are more prone to having multicollinearity.</span></p>
<p></p>
<p><i>To read the whole article, with links, illustrations and developed example, click <a href="https://statisticsbyjim.com/regression/multicollinearity-in-regression-analysis/" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>The 17 equations that changed the course of historytag:www.datasciencecentral.com,2020-04-03:6448529:BlogPost:9428612020-04-03T16:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://www.businessinsider.com/author/andy-kiersz?IR=T" target="_blank" rel="noopener">Andy Kiersz</a></i><span><i>.</i></span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>From Ian Stewart's book, these 17 math equations changed the course of human history</b></span></p>
<ul>
<li><span>A 2013 book by mathematician and science author Ian Stewart looked at 17 mathematical equations that shaped our understanding of the world.</span></li>
<li><span>From basic geometry to our understanding of how the physical world works to the theories underlying the internet and our financial systems, these equations have changed human history.</span></li>
</ul>
<p><span>Mathematics is all around us, and it has shaped our understanding of the world in countless ways.</span></p>
<p><span>In 2013, mathematician and science author Ian Stewart published a book on "17 Equations That Changed The World."<i><span class="Apple-converted-space"> </span></i></span></p>
<p></p>
<p><span><i><span class="Apple-converted-space"><a href="https://tra.img.pmdstatic.net/fit/https.3A.2F.2Fi.2Einsider.2Ecom.2F53207c206bb3f7933591a886/812x609/background-color/ffffff/quality/70/17-equations-that-changed-the-world-2014-3.jpg" target="_blank" rel="noopener"><img src="https://tra.img.pmdstatic.net/fit/https.3A.2F.2Fi.2Einsider.2Ecom.2F53207c206bb3f7933591a886/812x609/background-color/ffffff/quality/70/17-equations-that-changed-the-world-2014-3.jpg?profile=RESIZE_710x" class="align-full"/></a></span></i></span></p>
<p></p>
<p></p>
<p><span><i>To read the full article, with illustrations for each equation,</i></span> <i>click <a href="https://www.businessinsider.fr/uk/17-equations-that-changed-the-world-2014-3" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>#TextAnalytics concepts can be used to deal with credibility issues in the main stream mediatag:www.datasciencecentral.com,2020-03-29:6448529:BlogPost:9413682020-03-29T16:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://www.linkedin.com/in/ramsundarlakshminarayanan/?lipi=urn%3Ali%3Apage%3Ad_flagship3_pulse_read%3BYX00NRPiTSeC3mWOS9jAPw%3D%3D&licu=urn%3Ali%3Acontrol%3Ad_flagship3_pulse_read-read_profile" target="_blank" rel="noopener">Ramsundar Lakshminarayanan</a></i><span><i>.</i></span></p>
<p></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Mainstream media's credibility at an all-time low</b></span></p>
<p><span>Credibility of the media has taken a beating in recent years. Elections of #Modi, #Brexit and #Trump are widely believed to be a testament to that. These events have helped expose the misinformation and propaganda that occupy a pivotal space in mainstream media literature. It is a discourse culture that is devoid of facts and pregnant with bias.</span></p>
<p><span>It is an unfortunate reality.</span></p>
<p><span>It is in this context that I leveraged existing concepts in #TextAnalytics such as #TopicModel and #SentimentAnalysis to objectively assess media content & provide pertinent information to potential readers in an easy-to-understand visual manner.</span></p>
<p><span>Imagine if you had advance information that describes the content well - about a talk show or an article, that will help you decide if you want to sit through the talk show or read the entire article. I explored ways to glean that "relevant information" in an unbiased manner.</span></p>
<p><span>With nascent knowledge in #TextAnalytics, I hunted for a suitable hypothesis to work on.</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Hypothesis</b></span></p>
<p><span>Mani Shankar Aiyar is a former Indian diplomat and a politician, widely perceived to be the one who catapulted Indian Prime Minister Mr Narendra Modi's 2014 election campaign with disparaging remarks about Modi's social status as a child tea vendor. Since Modi's election in 2014, I found his opinion pieces on ndtv.com to be extremely critical of Modi, negative in tone, condescending in tenor, and, notwithstanding his party's drubbing in the 2014 elections, rich in temerity. I lost interest in his pieces and stopped reading them in the fall of 2014.</span></p>
<p><span>His articles gave me the perfect hypothesis to start with and Trump's victory gave me the perfect trigger to test my hypothesis.<span class="Apple-converted-space"> </span></span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Topic Modeling and Sentiment Analysis</b></span></p>
<p><span>A total of 155 articles, published by the author on ndtv.com between Jan '14 and Mar '17, were used for analysis purposes. While not very large, the corpus still offered a decent sample size to test my hypothesis. Here are my findings.</span></p>
<p></p>
<p><span><b>- Findings 1</b></span><span>:</span> <span><b>What topics does he usually write about?</b></span></p>
<p><span>The following key topics were detected in an automated manner using a popular topic modeling technique: five topics (Modi, Gandhi, India, Pakistan & Time) that aligned well with my personal knowledge of the author's interests.</span></p>
<p></p>
<p><span><b>- Findings 2</b></span><span>:</span> <span><b>Top Subjects he focused on</b></span></p>
<p><span>The top 10 subjects he wrote about were Modi, Pakistan, India, Gandhi, Govern (government/governance), PM, Will (tendency to assert & question), BJP, Jaitley & China. He wrote about Modi in 74 of the 155 articles between January 2014 and March 2017. The list of subjects matches my personal knowledge of the author's interests.</span></p>
<p></p>
<p><span><b>- Findings 3</b></span><span><b>:</b></span> <span><b>What sentiments did he exhibit?</b></span></p>
<p><span>More than 50% of the articles exhibited a negative tone. This was discovered using open source lexicons that classify words into positive, negative or neutral sentiment. Barely 25% of his articles exhibited positive sentiment, and a small fraction were neutral. Here again, the findings align well with my earlier experience reading his articles.</span></p>
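A minimal, toy version of this lexicon-based approach might look as follows; the word lists here are invented for illustration and are not the open source lexicons the author actually used:

```python
# Toy sentiment lexicons -- illustrative only, not real open source lexicons.
POSITIVE = {"growth", "progress", "success", "hope", "strong"}
NEGATIVE = {"failure", "crisis", "corrupt", "weak", "propaganda"}

def classify(text):
    """Label a text positive/negative/neutral by counting lexicon hits."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

print(classify("a corrupt and weak government in crisis"))   # negative
print(classify("strong growth and real progress"))           # positive
```

Real lexicons (and the emotion lexicons used in Findings 4) work the same way at heart: count matches per category and aggregate per article.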
<p></p>
<p><span><b>- Findings 4</b></span><span>:</span> <span><b>What Emotion did he express?</b></span></p>
<p><span>Articles exhibited all of the following emotions: anger, anticipation, disgust, fear, joy, sadness, surprise & trust, as shown in the visual below. This was discovered using open source lexicons that classify words into emotions. Relating these emotions to sentiments, they aggregate predominantly into a negative tenor. This again aligns well with my personal experience reading the author's articles.</span></p>
<p></p>
<p><span><b>- Findings 5: What more can be described about his articles?</b></span></p>
<p><span>Top keywords were derived for each article using a popular information retrieval technique (#TF-IDF) to describe the content better. This helps build further understanding of the content without having to read the entire article. See below for an example.</span></p>
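TF-IDF keyword extraction of the kind described can be sketched in a few lines of standard-library Python; the toy corpus below is invented for illustration and is not the author's 155 articles:

```python
import math
from collections import Counter

# Toy corpus -- illustrative only.
docs = [
    "modi pakistan policy modi speech",
    "gandhi family congress gandhi",
    "pakistan border policy talks",
]

def top_keywords(docs, k=2):
    """Rank each document's words by TF-IDF (raw term frequency * idf)."""
    n = len(docs)
    tokenized = [d.split() for d in docs]
    # Document frequency: in how many documents does each word appear?
    df = Counter(w for toks in tokenized for w in set(toks))
    idf = {w: math.log(n / df[w]) for w in df}
    result = []
    for toks in tokenized:
        tf = Counter(toks)
        scores = {w: tf[w] * idf[w] for w in tf}
        result.append(sorted(scores, key=scores.get, reverse=True)[:k])
    return result

print(top_keywords(docs))
```

Words that are frequent in one document but rare across the corpus (like "modi" in the first toy document) score highest, which is exactly what makes them useful content descriptors.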
<p></p>
<p><span style="font-size: 14pt;"><b>Summary of the Analysis</b></span></p>
<p><span>With these findings, it was possible for me to gain a much better understanding of his articles fairly quickly.</span></p>
<p><span>Automatic topic detection and open source lexicons provided neutrality and transparency to the analysis, while a popular information retrieval technique provided legitimacy to it.</span></p>
<p><span><i><span class="Apple-converted-space"> </span></i></span></p>
<p><i>To read the whole article, with illustrations, click <a href="https://www.linkedin.com/pulse/textanalytics-concepts-can-used-deal-credibility-main-ramsundar/" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>Limits of linear models for forecastingtag:www.datasciencecentral.com,2020-03-29:6448529:BlogPost:9412982020-03-29T16:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://www.linkedin.com/in/blainebateman/?lipi=urn%3Ali%3Apage%3Ad_flagship3_pulse_read%3B9XJlwZoESa6EMiKFMTuIlw%3D%3D&licu=urn%3Ali%3Acontrol%3Ad_flagship3_pulse_read-read_profile" target="_blank" rel="noopener">Blaine Bateman</a></i><span><i>.</i></span></p>
<p></p>
<p><span>In this post, I will demonstrate the use of nonlinear models for time series analysis, and contrast to linear models. I will use a (simulated) noisy and nonlinear time series of sales data, use multiple linear regression and a small neural network to fit training data, then predict 90 days forward. I implemented all of this in R, although it could be done in a number of coding environments. (Specifically, I used R 3.4.2 in RStudio 1.1.183 in Windows 10).</span></p>
<p><span>It is worth noting that much of what is presented in the literature and trade media regarding neural networks concerns classification problems. Classification means there are a finite number of correct answers given a set of inputs. In image recognition, an application well served by neural networks, classification would include dog/not dog. This is a simplistic example and such methods can predict a very large number of classes, such as reading addresses on mail with machine vision and automatically sorting for delivery. In this post, I am exploring models that produce continuous outputs instead of a finite number of discrete outputs. Neural networks and other methods are very applicable to continuous prediction.</span></p>
<p></p>
<p><span><a href="https://media-exp1.licdn.com/dms/image/C5612AQHguTuLxuj2Bg/article-inline_image-shrink_1000_1488/0?e=1591228800&v=beta&t=Br-frzrvnEtevHJfgvw0hQN7xqsJDMTTodKLGuFc2m4" target="_blank" rel="noopener"><img src="https://media-exp1.licdn.com/dms/image/C5612AQHguTuLxuj2Bg/article-inline_image-shrink_1000_1488/0?e=1591228800&v=beta&t=Br-frzrvnEtevHJfgvw0hQN7xqsJDMTTodKLGuFc2m4&profile=RESIZE_710x" class="align-full"/></a></span></p>
<p></p>
<p><span>Another point to note is that there are many empirical methods available for time series analysis. For example, ARIMA (autoregressive integrated moving average) and related methods use a combination of time lagged data to predict the future. Often these approaches are used for relatively short-term prediction. In this post, I want to use business knowledge and data in a model to predict future sales. My view is that such models are more likely to behave well over time, and can be adapted for business changes by adding or removing factors deemed newly important or found unimportant.</span></p>
<p><span>The linear regression method is available in base R using the lm() function. For the neural network, I used the RPROP Algorithm, published by Martin Riedmiller and Heinrich Braun of the University of Karlsruhe. RPROP is a useful variation of neural network modeling that, in some forms, automatically finds the appropriate learning rate. </span></p>
<p><span>For my purposes, I mainly use the rprop+ version of the algorithm, very nicely implemented by Stefan Fritsch & Frauke Guenther with contributors Marc Suling & Sebastian M. Mueller. The implementation is available as a library as well as source code. rprop+ appears to be quite resilient in that it easily converges without a lot of hyperparameter tuning. This is important to my point here, which is that implementing nonlinear models isn’t necessarily more difficult than implementing linear ones.</span></p>
<p><span>The data are a simulated time series of sales data, which has spikes at quarterly and smaller periods, as well as longer term variations. There is about 3 ¼ years of data at daily granularity, and I want to test the potential to use the first 3 years as training data, then predict another 90 days in the future. The business case is that it is believed there are various factors that influence sales, some internal to our business and some external. We have a set of 8 factors, one of which is past sales, the remaining being market factors (such as GDP, economic activity, etc.) and internal data (such as sales pipeline, sales incentive programs, new product introductions (NPI), etc.). The past sales are used with a phasing of one year, arrived at by noting that there are annual business cycles. (Note: there are many more rigorous ways to determine phasing; I’ll address that in another post.) These factors are labeled a, c, f, g, h, i, j, and k in what follows. The sales values are labeled as Y. For each model, then, the 1210 daily values of the 8 factors are provided, plus the actual sales results, and the task is to build a model that fits the historical data as well as possible.</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Linear regression</b></span></p>
<p><span>Using the lm() function in R, I fit a linear model to the data. Linear means that each factor is multiplied by a coefficient (determined by the fit process) and these are simply added together to estimate the resulting sales. The equation looks as follows:</span></p>
<p><span>Y = C1*a + C2*c + C3*f + C4*g + C5*h + C6*i + C7*j + C8*k + C0</span></p>
<p><span>where, as noted, Y is the sales. Note that C0 is a constant value that is also determined by the regression modeling. Once I have such a model, sales are predicted by simply multiplying an instance of the factors by the coefficients and summing to get a prediction for sales. To predict future sales, values of the factors are needed. If the factors are not time lagged from the sales, then, for example, a forecast for GDP or the future NPI plan would be needed. Depending on the specific case, all the factors might be time lagged values and future sales can be predicted from known data. In some cases, a forecast is needed for some of the factors. These details are not important for the evaluation here.</span></p>
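The article fits this model with R's lm(), but the same least-squares fit can be sketched in NumPy; the factor values and true coefficients below are simulated stand-ins, since the article's data aren't available:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1210                               # daily observations, as in the article
factors = rng.normal(size=(n, 8))      # stand-ins for factors a,c,f,g,h,i,j,k
true_coef = np.array([3.0, -1.0, 2.0, 0.5, 1.0, -2.0, 4.0, 0.1])
y = factors @ true_coef + 7.0 + 0.1 * rng.normal(size=n)   # C0 = 7, plus noise

# Append a column of ones so the intercept C0 is estimated too.
A = np.column_stack([factors, np.ones(n)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(coef, 2))   # C1..C8 followed by C0

# Predicting is just multiply-and-sum, exactly as described in the text:
y_hat = A @ coef
```

With 1210 observations and modest noise, the recovered coefficients land very close to the true ones, which mirrors why a linear fit is so easy to interpret.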
<p></p>
<p><span style="font-size: 14pt;"><b>Neural network tests</b></span></p>
<p><span>As a first step, I will use a simple neural network that has 8 input nodes (one for each factor) plus the “bias” node (The bias node is motivated by understanding the behavior of a single unit, also known as a perceptron. Including a bias allows a single perceptron to mimic basic logical operators (like AND, OR, and NOT; XOR, famously, requires a hidden layer) and thus a bias is usually included in the network architecture). These 9 nodes are fed into a single hidden layer of 3 nodes, which, along with a bias node, are fed into the output node.</span></p>
<p><span>There are (at least) two ways that a neural network representation can model nonlinear behavior. First, every node from a given layer is connected to every node of the next layer. These connections are multiplied by weights before feeding into the next node, where the weights are determined in the modeling process. These cross connections can model interactions between factors that the linear model does not. In addition, a typical neural network node uses a nonlinear function to determine the node output from the node inputs. These functions are often called activation functions, another reference to organic neuron behavior. A common activation function is the sigmoid or logistic function. </span></p>
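A forward pass through the small architecture just described (8 inputs plus bias, 3 hidden nodes plus bias, 1 output, sigmoid activations) can be sketched in a few lines; the weights below are random placeholders where a training algorithm such as rprop+ would learn real values:

```python
import numpy as np

def sigmoid(z):
    """Logistic activation function, as mentioned in the text."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 9))   # hidden layer: 3 nodes x (8 inputs + 1 bias)
W2 = rng.normal(size=(1, 4))   # output node: (3 hidden + 1 bias)

def forward(x):
    """x: vector of the 8 factor values for one day."""
    h = sigmoid(W1 @ np.append(x, 1.0))          # append the bias input
    return float(sigmoid(W2 @ np.append(h, 1.0)))

x = rng.normal(size=8)
print(forward(x))   # a value in (0, 1)
```

Note the sigmoid on the output squashes predictions into (0, 1); for a continuous target like sales, the output is typically rescaled (or the output activation made linear), which is a modeling choice the fitting process must account for.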
<p></p>
<p><span><i>To read the whole article, with illustrations and their explanations,</i></span> <span><i>click <a href="https://www.linkedin.com/pulse/limits-linear-models-forecasting-blaine-bateman-eaf-llc/" target="_blank" rel="noopener">here</a>.</i></span></p>
<p></p>Finding organic clusters in complex data-networkstag:www.datasciencecentral.com,2020-03-09:6448529:BlogPost:9368032020-03-09T12:30:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://medium.com/@graphcommons?source=post_page-----5c27e1d4645d----------------------" target="_blank" rel="noopener">Graph Commons</a></i><span><i>.</i></span></p>
<p></p>
<p><span>A common task for a data scientist is to identify clusters in a given data set. The idea is to simply find groups of objects that have more connections or similarities to one another than they do to outsiders. In the study of networks, we use clustering to recognize communities within large groups of connections.</span></p>
<p><span>Typically, a force-directed layout algorithm organizes a network map and makes patterns visually comprehensible, but it cannot identify and mark the clusters. Furthermore, in large network maps, the high level of detail overwhelms our senses. To be able to precisely examine its patterns, we need quantitative views of the data contained in the network. While there are a variety of data clustering methods in machine learning, the Louvain Modularity algorithm works particularly well for large data-networks. It detects tightly knit groups characterized by a relatively high density of ties. Beyond the visual realm, you can use a Louvain clustering algorithm to partition a many-million-node online social network onto different machines.</span></p>
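If your network is already in Python, recent versions of NetworkX ship a Louvain implementation (louvain_communities); a minimal sketch on a toy graph of two dense cliques joined by one edge, which has an obvious two-community structure:

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Two 5-cliques joined by a single edge -- an obvious 2-community graph.
G = nx.barbell_graph(5, 0)
communities = louvain_communities(G, seed=1)
print(sorted(sorted(c) for c in communities))
```

On this toy graph the algorithm recovers the two cliques as the two communities; on real data-networks the resolution parameter controls how granular the detected groups are.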
<p></p>
<p><span><a href="https://miro.medium.com/max/1014/1*huqDS6D5nbpNl6RiBItYnQ.png" target="_blank" rel="noopener"><img src="https://miro.medium.com/max/1014/1*huqDS6D5nbpNl6RiBItYnQ.png?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p></p>
<p><span>Once the network clusters are detected, the identified groups of nodes can be given distinct color and names, so they are clearly differentiated and together provide a summary of the larger network. We can label a cluster based on the commonalities of its nodes or the most central nodes found in the grouping.</span></p>
<p><span>In Graph Commons, you can use clustering on your data-networks using the Analysis bar. You first click on the “Run Clustering” button, then set the resolution, which controls how granular the clusters the algorithm identifies should be. Once the clusters are found, they are automatically labelled based on the most connected node in each cluster. However, we strongly recommend that you rename these communities yourself to highlight what they signify in your context. Finally, you can view the list of all the nodes that belong to a certain cluster and download it as a CSV file.</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Cluster labels on the network map</b></span></p>
<p><span>In Graph Commons, you’ll notice the cluster labels are also placed on the map visually. You can move them around and change their size in order to make the network more readable.</span></p>
<p><span>When you mouse over a cluster label, it is highlighted; this way you can clearly see its boundaries and where it is located in the larger picture. Cluster labels on the map provide an overview for a complex network that is otherwise hard to grasp visually.</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Bridges between clusters</b></span></p>
<p><span>Within the clusters of a complex network, we often see a few nodes making connections to other clusters, while their neighbouring nodes' connections are only local, within their immediate cluster. Those nodes that bridge connections among multiple clusters have high betweenness centrality. Such bridging nodes between two or more clusters become distinctly visible with the help of network layout algorithms.</span></p>
<p><span>If we are analyzing a social network, these bridging people are well-positioned to be information brokers, since they have access to information flowing in other clusters. They are the ones who carry the gossip from one group of people to another. They are in a position to combine the variety of knowledge and ideas found in multiple groups. On the other hand, bridging nodes are more likely to be a single point of failure. If a bridge person disappears, those formerly connected communities would disconnect.</span></p>
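Betweenness centrality, which flags these bridging nodes, is a one-liner in NetworkX; on a toy graph of two cliques connected by a single edge, the two endpoints of that bridge score highest:

```python
import networkx as nx

# Two 5-cliques connected by a single edge between nodes 4 and 5.
G = nx.barbell_graph(5, 0)
bc = nx.betweenness_centrality(G)
bridge_nodes = sorted(bc, key=bc.get, reverse=True)[:2]
print(bridge_nodes)   # the two bridge endpoints
```

Every shortest path between the two cliques must pass through these two nodes, which is exactly why removing either one disconnects the communities.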
<p></p>
<p><span><i>To read the whole article, with illustrations,</i></span> <span><i>click <a href="https://medium.com/graph-commons/finding-organic-clusters-in-your-complex-data-networks-5c27e1d4645d" target="_blank" rel="noopener">here</a>.</i></span></p>
<p></p>
<p></p>
<p></p>Five Regression Analysis Tips to Avoid Common Problemstag:www.datasciencecentral.com,2020-03-09:6448529:BlogPost:9368022020-03-09T12:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://statisticsbyjim.com/author/statis11_wp/" target="_blank" rel="noopener">Jim Frost</a></i><span><i>.</i></span></p>
<p></p>
<p><span>Regression is a very powerful statistical analysis. It allows you to isolate and understand the effects of individual variables, model curvature and interactions, and make predictions. Regression analysis offers high flexibility but presents a variety of potential pitfalls. Great power requires great responsibility!</span></p>
<p><span>In this post, I offer five tips that will not only help you avoid common problems but also make the modeling process easier. I’ll close by showing you the difference between the modeling process that a top analyst uses versus the procedure of a less rigorous analyst.</span></p>
<p></p>
<p><span><a href="https://i2.wp.com/statisticsbyjim.com/wp-content/uploads/2017/08/FLP_sample.png?resize=300%2C200&ssl=1" target="_blank" rel="noopener"><img src="https://i2.wp.com/statisticsbyjim.com/wp-content/uploads/2017/08/FLP_sample.png?resize=300%2C200&ssl=1&profile=RESIZE_710x" class="align-full"/></a></span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Tip 1: Conduct A Lot of Research Before Starting</b></span></p>
<p><span>Before you begin the regression analysis, you should review the literature to develop an understanding of the relevant variables, their relationships, and the expected coefficient signs and effect magnitudes. Developing your knowledge base helps you gather the correct data in the first place, and it allows you to specify the best regression equation without resorting to data mining.</span></p>
<p><span>Regrettably, large databases stuffed with handy data, combined with automated model-building procedures, have pushed analysts away from this knowledge-based approach. Data mining procedures can build a misleading model that has significant variables and a good R-squared using randomly generated data!</span></p>
<p><span>In my blog post, Using Data Mining to Select Regression Model Can Create Serious Problems, I show this in action. The output below is a model that stepwise regression built from entirely random data. In the final step, the R-squared is decently high, and all of the variables have very low p-values!<span class="Apple-converted-space"> </span></span></p>
<p><span>Automated model building procedures can have a place in the exploratory phase. However, you can’t expect them to produce the correct model precisely. For more information, read my Guide to Stepwise Regression and Best Subsets Regression.</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Tip 2: Use a Simple Model When Possible</b></span></p>
<p><span>It seems that complex problems should require complicated regression equations. However, studies show that simplification usually produces more precise models.* How simple should the models be? In many cases, three independent variables are sufficient for complex problems.</span></p>
<p><span>The tip is to start with a simple model and then make it more complicated only when it is truly needed. If you make a model more complex, confirm that the prediction intervals are more precise (narrower). When you have several models with comparable predictive abilities, choose the simplest because it is likely to be the best model. Another benefit is that simpler models are easier to understand and explain to others!</span></p>
<p><span>As you make a model more elaborate, the R-squared increases, but it becomes more likely that you are customizing it to fit the vagaries of your specific dataset rather than actual relationships in the population. This overfitting reduces generalizability and produces results that you can’t trust.</span></p>
<p><span>Learn how both adjusted R-squared and predicted R-squared can help you include the correct number of variables and avoid overfitting.</span></p>
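Adjusted R-squared, mentioned above, is a one-line formula: R²_adj = 1 − (1 − R²)(n − 1)/(n − p − 1) for n observations and p predictors. A quick sketch (the numbers here are illustrative, not from the article):

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Adding predictors always raises plain R-squared, but if the gain is tiny,
# adjusted R-squared goes DOWN -- a signal the extra terms may be overfitting.
print(round(adjusted_r2(0.80, n=50, p=3), 4))
print(round(adjusted_r2(0.81, n=50, p=10), 4))
```

Here the second model has a higher raw R-squared but a lower adjusted R-squared, illustrating the penalty for complexity that Tip 2 relies on.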
<p></p>
<p><span style="font-size: 14pt;"><b>Tip 3: Correlation Does Not Imply Causation . . . Even in Regression</b></span></p>
<p><span>Correlation does not imply causation. Statistics classes have burned this familiar mantra into the brains of all statistics students! It seems simple enough. However, analysts can forget this important rule while performing regression analysis. As you build a model that has significant variables and a high R-squared, it’s easy to forget that you might only be revealing correlation. Causation is an entirely different matter. Typically, to establish causation, you need to perform a designed experiment with randomization. If you’re using regression to analyze data that weren’t collected in such an experiment, you can’t be certain about causation.</span></p>
<p><span>Fortunately, correlation can be just fine in some cases. For instance, if you want to predict the outcome, you don’t always need variables that have causal relationships with the dependent variable. If you measure a variable that is related to changes in the outcome but doesn’t influence the outcome, you can still obtain good predictions. Sometimes it is easier to measure these proxy variables. However, if your goal is to affect the outcome by setting the values of the input variables, you must identify variables with truly causal relationships.</span></p>
<p><span>For example, if vitamin consumption is only correlated with improved health but does not cause good health, then altering vitamin use won’t improve your health. There must be a causal relationship between two variables for changes in one to cause changes in the other.</span></p>
<p></p>
<p><i>To read the rest of the article, click <a href="https://statisticsbyjim.com/regression/regression-analysis-tips/" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>
<p></p>Compressing information through the information bottleneck during deep learningtag:www.datasciencecentral.com,2020-03-06:6448529:BlogPost:9359802020-03-06T12:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://silvertonconsulting.com/blog/author/administrator/" target="_blank" rel="noopener">Ray</a>.</i></p>
<p><span>Read an article in Quanta Magazine (New theory cracks open the black box of deep learning) about a talk (see 18: Information Theory of Deep Learning, YouTube video) given a month or so ago by Professor Naftali (Tali) Tishby on his theory that all deep learning convolutional neural networks (CNN) exhibit an “information bottleneck” during deep learning. This information bottleneck results in compressing the information present in, for example, an image and working with only the relevant information.</span></p>
<p><span>The Professor and his researchers used a simple AI problem (like recognizing a dog) and trained a deep learning CNN to perform this task. At the start of the training process the CNN nodes at the top were all connected to the next layer, and those were all connected to the next layer and so on until you got to the output layer.</span></p>
<p><span>Essentially, the researchers found that during the deep learning process, the CNN went from recognizing all features of an image to over time just recognizing (processing?) only the relevant features of an image when successfully trained.</span></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/4035928116?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/4035928116?profile=RESIZE_710x" class="align-center"/></a></span></p>
<p><span style="font-size: 12pt;">In this article, the following topics are discussed:</span></p>
<ul>
<li>Limits of deep learning CNNs</li>
<li>What happens during deep learning</li>
<li>Statistics of deep learning process</li>
<li>Do layer counts and sample size matter?</li>
</ul>
<p><i>To read the whole article, with illustrations, click <a href="http://silvertonconsulting.com/blog/2017/09/23/compressing-information-through-the-information-bottleneck-during-deep-learning/" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>Introduction to Numpy - A Math Library for Pythontag:www.datasciencecentral.com,2020-03-01:6448529:BlogPost:9353242020-03-01T21:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://hackernoon.com/@rakshithvasudev" target="_blank" rel="noopener">Vasudev</a></i><span><i>.</i></span></p>
<p><span>Let's get started quickly. Numpy is a math library for Python. It enables us to do computation efficiently and effectively. It is better than regular Python because of its amazing capabilities.</span></p>
<p><span>In this article I’m just going to introduce you to the basics of what is mostly required for machine learning and data science. I’m not going to cover everything that’s possible with the numpy library. This is part one of the numpy tutorial series.</span></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/4035960368?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/4035960368?profile=RESIZE_710x" class="align-center"/></a></span></p>
<p style="text-align: center;"><em>A small illustrated summary of NumPy indexing and slicing (see <a href="https://scipy-lectures.org/intro/numpy/array_object.html" target="_blank" rel="noopener">here</a> for details)</em></p>
<p><span>The first thing I want to introduce you to is the way you import it.</span></p>
<p><a href="https://storage.ning.com/topology/rest/1.0/file/get/3978030418?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3978030418?profile=RESIZE_710x" class="align-full"/></a></p>
<p><span>Okay, now we’re telling Python that “np” is our reference to NumPy from here on.</span></p>
<p><span>Let’s create a regular Python list and an np array.</span></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/3978033631?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3978033631?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p><span>If I were to print them, I wouldn’t see much difference.</span></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/3978036966?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3978036966?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p><span>Okay, but why do I have to use an np array instead of a regular array?</span></p>
<p><span>The answer is that np arrays are better in terms of faster computation and ease of manipulation.</span></p>
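<p><span>The import-and-compare steps just described can be sketched as follows (the variable names are illustrative, not from the original notebook):</span></p>

```python
import numpy as np  # "np" is the conventional alias used from here on

# A regular Python list and its NumPy counterpart
py_list = [1, 2, 3, 4, 5]
np_array = np.array([1, 2, 3, 4, 5])

# Printed, they look almost identical...
print(py_list)   # [1, 2, 3, 4, 5]
print(np_array)  # [1 2 3 4 5]

# ...but the np array supports fast, element-wise vectorized math
print(np_array * 2)  # [ 2  4  6  8 10]
```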
<p></p>
<p><span>Let’s proceed further with more cool stuff. Wait, there was nothing cool we saw yet! Okay, here’s something:</span></p>
<p><span>np.arange()</span></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/3978046526?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3978046526?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p><span>What arange([start], stop, [step]) does is arrange numbers from start to stop, in steps of step. Here is what it means for np.arange(0,10,2):</span></p>
<p><span>return an np array starting from 0 all the way up to 10, but don’t include 10, and increment the numbers by 2 each time.</span></p>
<p><span>So, that’s how we get :</span></p>
<p><a href="https://storage.ning.com/topology/rest/1.0/file/get/3978049559?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3978049559?profile=RESIZE_710x" class="align-full"/></a></p>
<p><span>The important thing to remember here is that the stopping number is not going to be included in the array.</span></p>
<p><span>Another example:</span></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/3978053703?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3978053703?profile=RESIZE_710x" class="align-full"/></a></span></p>
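<p><span>The arange behavior just described can be sketched like this (the argument values are illustrative):</span></p>

```python
import numpy as np

# arange(start, stop, step): the stop value is never included
print(np.arange(0, 10, 2))  # [0 2 4 6 8]

# With a single argument, start defaults to 0 and step to 1
print(np.arange(5))         # [0 1 2 3 4]

# Float steps work too: 0.0, 0.25, 0.5, 0.75
print(np.arange(0.0, 1.0, 0.25))
```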
<p><span>Before I proceed further, I’ll have to warn you that this “array” is interchangeably called a “matrix” or also a “vector”. So don’t panic when I say, for example, “Matrix shape is 2 X 3”. All it means is that the array looks something like this:</span></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/3978055981?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3978055981?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p><span>Now, Let’s talk about the shape of a default np array.</span></p>
<p><span>Shape is an attribute of an np array. When we call shape on a default array, say A, here is what it looks like.</span></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/3978058980?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3978058980?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p><span>This is a rank 1 matrix (array), which just has 9 elements in a row. </span></p>
<p><span>Ideally it should be a 1 X 9 matrix, right?</span></p>
<p><span>I agree with you, and that’s where reshape() comes into play. It is a method that changes the dimensions of your original matrix into your desired dimensions.</span></p>
<p><span>Let’s look at reshape in action. You can pass a tuple of whatever dimension you want as long as the reshaped matrix and original matrix have the same number of elements.</span></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/3978062012?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3978062012?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p><span>Notice that reshape returns a multi-dim matrix. Two square brackets in the beginning indicate that. [[1, 2, 3, 4, 5, 6, 7, 8, 9]] is a potentially multi-dim matrix as opposed to [1, 2, 3, 4, 5, 6, 7, 8, 9].</span></p>
<p><span>Another example:</span></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/3978065829?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3978065829?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p><span>If I look at B’s shape, it’s going to be (3,3):</span></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/3978070146?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3978070146?profile=RESIZE_710x" class="align-full"/></a></span></p>
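<p><span>The shape and reshape behavior above can be sketched as follows (using a 9-element array to match the example):</span></p>

```python
import numpy as np

A = np.arange(1, 10)  # [1 2 3 4 5 6 7 8 9]
print(A.shape)        # (9,)  -- a rank-1 array, not 1 x 9

# reshape accepts any dimensions with the same total element count
row = A.reshape((1, 9))
print(row)            # [[1 2 3 4 5 6 7 8 9]]  -- note the double brackets
print(row.shape)      # (1, 9)

B = A.reshape((3, 3))
print(B.shape)        # (3, 3)

# A mismatched element count raises an error, e.g.:
# A.reshape((2, 5))  -> ValueError (9 elements cannot fill a 2 x 5 matrix)
```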
<p></p>
<p><i>To read the rest of the article, click <a href="https://hackernoon.com/introduction-to-numpy-1-an-absolute-beginners-guide-to-machine-learning-and-data-science-5d87f13f0d51" target="_blank" rel="noopener">here</a>. More about Numpy can be found <a href="https://www.datasciencecentral.com/page/search?q=numpy" target="_blank" rel="noopener">here</a>. </i></p>
<p></p>
<p></p>Regularization in Machine Learningtag:www.datasciencecentral.com,2020-02-20:6448529:BlogPost:9328812020-02-20T15:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://towardsdatascience.com/@prashantgupta17?source=post_page-----76441ddcf99a----------------------" target="_blank" rel="noopener">Prashant Gupta</a></i><span><i>.</i></span></p>
<p></p>
<p><span>One of the major aspects of training your machine learning model is avoiding overfitting. The model will have a low accuracy if it is overfitting. This happens because your model is trying too hard to capture the noise in your training dataset. By noise we mean the data points that don’t really represent the true properties of your data, but rather random chance. Learning such data points makes your model more flexible, at the risk of overfitting.</span></p>
<p><span>The concept of balancing bias and variance is helpful in understanding the phenomenon of overfitting.</span></p>
<p><span>One of the ways of avoiding overfitting is using cross-validation, which helps in estimating the error over the test set and in deciding what parameters work best for your model.</span></p>
<p><span>This article will focus on a technique that helps in avoiding overfitting and also increasing model interpretability.</span></p>
<p></p>
<p><span><a href="https://miro.medium.com/max/883/1*XC-8tHoMxrO3ogHKylRfRA.png" target="_blank" rel="noopener"><img src="https://miro.medium.com/max/883/1*XC-8tHoMxrO3ogHKylRfRA.png?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Regularization</b></span></p>
<p><span>This is a form of regression, that constrains/ regularizes or shrinks the coefficient estimates towards zero. In other words, this technique discourages learning a more complex or flexible model, so as to avoid the risk of overfitting.</span></p>
<p><span>A simple relation for linear regression looks like this. Here Y represents the learned relation and β represents the coefficient estimates for different variables or predictors(X).</span></p>
<p><span>Y ≈ β0 + β1X1 + β2X2 + …+ βpXp</span></p>
<p><span>The fitting procedure involves a loss function, known as residual sum of squares or RSS. The coefficients are chosen, such that they minimize this loss function.</span></p>
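<p><span>As a quick numeric sketch (with made-up numbers), RSS is just the sum of squared differences between observed and fitted values:</span></p>

```python
import numpy as np

# Toy data: observed responses and a model's fitted values (illustrative)
y     = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.5, 6.5, 9.5])

rss = np.sum((y - y_hat) ** 2)
print(rss)  # 1.0  (four residuals of +/-0.5, each squared to 0.25)
```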
<p><span>Now, this will adjust the coefficients based on your training data. If there is noise in the training data, then the estimated coefficients won’t generalize well to the future data. This is where regularization comes in and shrinks or regularizes these learned estimates towards zero.</span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Ridge Regression</b></span></p>
<p><span>Now, the coefficients are estimated by minimizing this function, RSS + λ Σ βj². Here, λ is the tuning parameter that decides how much we want to penalize the flexibility of our model. The increase in flexibility of a model is represented by an increase in its coefficients, and if we want to minimize the above function, then these coefficients need to be small. This is how the Ridge regression technique prevents coefficients from rising too high. Also, notice that we shrink the estimated association of each variable with the response, except the intercept β0. This intercept is a measure of the mean value of the response when xi1 = xi2 = …= xip = 0.</span></p>
<p>When λ = 0, the penalty term has no eﬀect, and the estimates produced by ridge regression will be equal to least squares. However, as λ→∞, the impact of the shrinkage penalty grows, and the ridge regression coeﬃcient estimates will approach zero. As can be seen, selecting a good value of λ is critical. Cross validation comes in handy for this purpose. The coefficient estimates produced by this method are also known as the L2 norm.</p>
<p><span>The coefficients that are produced by the standard least squares method are scale equivariant, i.e. if we multiply each input by c then the corresponding coefficients are scaled by a factor of 1/c. Therefore, regardless of how the predictor is scaled, the multiplication of predictor and coefficient(Xjβj) remains the same. However, this is not the case with ridge regression, and therefore, we need to standardize the predictors or bring the predictors to the same scale before performing ridge regression.</span></p>
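<p><span>A minimal NumPy sketch of ridge regression’s closed form, β̂ = (XᵀX + λI)⁻¹Xᵀy, on predictors standardized as described above (the data and λ values are made up; centering y absorbs the unpenalized intercept, and a real analysis would typically use scikit-learn’s Ridge):</span></p>

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize predictors first
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)
y = y - y.mean()                          # centering absorbs the intercept

def ridge(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam*I) beta = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# lam = 0 recovers least squares; larger lam shrinks coefficients toward zero
for lam in [0.0, 10.0, 1000.0]:
    print(lam, ridge(X, y, lam))
```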
<p></p>
<p><span style="font-size: 14pt;"><b>Lasso</b></span></p>
<p><span>Lasso is another variation, in which the above function is minimized. It’s clear that this variation differs from ridge regression only in how it penalizes the high coefficients: it uses |βj| (the absolute value) instead of the square of βj as its penalty. In statistics, this is known as the L1 norm.</span></p>
<p><span>Let’s take a look at the above methods from a different perspective. Ridge regression can be thought of as solving an equation where the summation of squares of the coefficients is less than or equal to s. And the Lasso can be thought of as an equation where the summation of the moduli of the coefficients is less than or equal to s. Here, s is a constant that exists for each value of the shrinkage factor λ. These equations are also referred to as constraint functions.</span></p>
<p><span>Consider there are 2 parameters in a given problem. Then, according to the above formulation, ridge regression is expressed by β1² + β2² ≤ s. This implies that ridge regression coefficients have the smallest RSS (loss function) for all points that lie within the circle given by β1² + β2² ≤ s.</span></p>
<p><span>Similarly, for lasso, the equation becomes,|β1|+|β2|≤ s. This implies that lasso coefficients have the smallest RSS(loss function) for all points that lie within the diamond given by |β1|+|β2|≤ s.</span></p>
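<p><span>That diamond-shaped constraint is what lets the lasso set coefficients exactly to zero: its corners sit on the axes. A one-line sketch of the soft-thresholding operator behind this behavior (the lasso’s proximal step; the β values and λ are illustrative):</span></p>

```python
import numpy as np

def soft_threshold(z, lam):
    """Lasso's proximal operator: shrink by lam, clipping small values to exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

beta = np.array([2.0, -0.3, 0.05, -1.5])
# Result: 1.5, 0.0, 0.0, -1.0 -- the two small coefficients are zeroed out,
# which is why lasso performs variable selection while ridge only shrinks.
print(soft_threshold(beta, 0.5))
```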
<p></p>
<p><i>To read the whole article, with illustrations, click <a href="https://towardsdatascience.com/regularization-in-machine-learning-76441ddcf99a" target="_blank" rel="noopener">here</a>. For another article on the same topic, follow <a href="https://www.datasciencecentral.com/profiles/blogs/regularization-in-machine-learning" target="_blank" rel="noopener">this link</a>. </i></p>
<p></p>Text Classification using Neural Networkstag:www.datasciencecentral.com,2020-02-09:6448529:BlogPost:9298582020-02-09T21:30:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://machinelearnings.co/@gk_?source=post_page-----f5cd7b8765c6----------------------" target="_blank" rel="noopener">gk_</a></i><span><i>.</i></span></p>
<p></p>
<p><span>Understanding how chatbots work is important. A fundamental piece of machinery inside a chat-bot is the text classifier. Let’s look at the inner workings of an artificial neural network (ANN) for text classification.</span></p>
<p><span>We’ll use 2 layers of neurons (1 hidden layer) and a “bag of words” approach to organizing our training data. Text classification comes in 3 flavors: pattern matching, algorithms, neural nets. While the algorithmic approach using Multinomial Naive Bayes is surprisingly effective, it suffers from 3 fundamental flaws:</span></p>
<ul>
<li><span>the algorithm produces a score rather than a probability.</span> <span>We want a probability to ignore predictions below some threshold. This is akin to a ‘squelch’ dial on a VHF radio.</span></li>
<li><span>the algorithm ‘learns’ from examples of what is in a class, but not what isn’t. This learning of patterns of what does <i>not</i> belong to a class is often very important.</span></li>
<li><span>classes with disproportionately large training sets can create distorted classification scores, forcing the algorithm to adjust scores relative to class size. This is not ideal.</span></li>
</ul>
<p></p>
<p><span><a href="https://miro.medium.com/max/537/1*DpMaU1p85ZSgamwYDkzL-A.png" target="_blank" rel="noopener"><img src="https://miro.medium.com/max/537/1*DpMaU1p85ZSgamwYDkzL-A.png?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p></p>
<p><span>As with its ‘Naive’ counterpart, this classifier isn’t attempting to understand the meaning of a sentence; it’s trying to classify it. In fact, so-called “AI chat-bots” do not understand language, but that’s another story.</span></p>
<p><span>If you are new to artificial neural networks, here is how they work.</span></p>
<p><span>To understand an algorithm approach to classification, see here.</span></p>
<p><span>Let’s examine our text classifier one section at a time. We will take the following steps:</span></p>
<ul>
<li><span>refer to libraries we need</span></li>
<li><span>provide training data</span></li>
<li><span>organize our data</span></li>
<li><span>iterate: code + test the results + tune the model</span></li>
<li><span>abstract</span></li>
</ul>
<p><span>The code is here; we’re using an iPython notebook, which is a super productive way of working on data science projects. The code syntax is Python.</span></p>
<p><span>We begin by importing our natural language toolkit. We need a way to reliably tokenize sentences into words and a way to stem words.</span><span><i><span class="Apple-converted-space"> </span></i></span></p>
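<p><span>The original notebook uses NLTK for tokenizing and stemming; here is a dependency-free sketch of the bag-of-words representation those steps feed into (the training sentences and the crude suffix-stripping “stemmer” are illustrative stand-ins):</span></p>

```python
# Toy training sentences with intent classes (illustrative)
training = [
    ("how are you today", "greeting"),
    ("good morning to you", "greeting"),
    ("make me a sandwich", "sandwich"),
]

def stem(word):
    # Crude stand-in for a real stemmer (the article uses NLTK's)
    return word[:-1] if word.endswith("s") else word

def tokenize(sentence):
    return [stem(w) for w in sentence.lower().split()]

# Build the vocabulary from all training sentences
vocab = sorted({w for sentence, _ in training for w in tokenize(sentence)})

def bag_of_words(sentence):
    # A 0/1 vector marking which vocabulary words appear in the sentence
    words = tokenize(sentence)
    return [1 if w in words else 0 for w in vocab]

print(vocab)
print(bag_of_words("good morning"))
```

These fixed-length 0/1 vectors are what the two-layer network actually trains on.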
<p></p>
<p><i>To read the whole article, with demonstration, click <a href="https://machinelearnings.co/text-classification-using-neural-networks-f5cd7b8765c6" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>The 10 Deep Learning Methods AI Practitioners Need to Applytag:www.datasciencecentral.com,2020-02-09:6448529:BlogPost:9300062020-02-09T21:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://medium.com/@james_aka_yale?source=post_page-----885259f402c1----------------------" target="_blank" rel="noopener">James Le</a></i><span><i>.</i></span></p>
<p><span>Neural networks are one type of model for machine learning; they have been around for at least 50 years. The fundamental unit of a neural network is a node, which is loosely based on the biological neuron in the mammalian brain. The connections between neurons are also modeled on biological brains, as is the way these connections develop over time (with “training”).</span></p>
<p><span>In the mid-1980s and early 1990s, many important architectural advancements were made in neural networks. However, the amount of time and data needed to get good results slowed adoption, and thus interest cooled. In the early 2000s, computational power expanded exponentially and the industry saw a “Cambrian explosion” of computational techniques that were not possible prior to this. Deep learning emerged from that decade’s explosive computational growth as a serious contender in the field, winning many important machine learning competitions. The interest has not cooled as of 2017; today, we see deep learning mentioned in every corner of machine learning.</span></p>
<p></p>
<p><span><a href="https://miro.medium.com/max/729/1*gRAmdOHLaf-AfgyBfsa0JQ.png" target="_blank" rel="noopener"><img src="https://miro.medium.com/max/729/1*gRAmdOHLaf-AfgyBfsa0JQ.png?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p><span>Most recently, I have started reading academic papers on the subject. From my research, here are several publications that have been hugely influential to the development of the field:</span></p>
<ul class="">
<li id="454d" class="hm hn es at ho b hp hq hr hs ht hu hv hw hx hy hz ip iq ir">NYU’s <a href="http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf" class="cz gz ib ic id ie" target="_blank" rel="noopener nofollow">Gradient-Based Learning Applied to Document Recognition</a> (1998), which introduces Convolutional Neural Network to the Machine Learning world.</li>
<li id="337f" class="hm hn es at ho b hp is hr it ht iu hv iv hx iw hz ip iq ir">Toronto’s <a href="http://proceedings.mlr.press/v5/salakhutdinov09a/salakhutdinov09a.pdf" class="cz gz ib ic id ie" target="_blank" rel="noopener nofollow">Deep Boltzmann Machines</a> (2009), which presents a new learning algorithm for Boltzmann machines that contain many layers of hidden variables.</li>
<li id="d55f" class="hm hn es at ho b hp is hr it ht iu hv iv hx iw hz ip iq ir">Stanford & Google’s <a href="http://icml.cc/2012/papers/73.pdf" class="cz gz ib ic id ie" target="_blank" rel="noopener nofollow">Building High-Level Features Using Large-Scale Unsupervised Learning</a> (2012), which addresses the problem of building high-level, class-specific feature detectors from only unlabeled data.</li>
<li id="5578" class="hm hn es at ho b hp is hr it ht iu hv iv hx iw hz ip iq ir">Berkeley’s <a href="http://proceedings.mlr.press/v32/donahue14.pdf" class="cz gz ib ic id ie" target="_blank" rel="noopener nofollow">DeCAF — A Deep Convolutional Activation Feature for Generic Visual Recognition</a> (2013), which releases DeCAF, an open-source implementation of the deep convolutional activation features, along with all associated network parameters to enable vision researchers to be able to conduct experimentation with deep representations across a range of visual concept learning paradigms.</li>
<li id="fe37" class="hm hn es at ho b hp is hr it ht iu hv iv hx iw hz ip iq ir">DeepMind’s <a href="https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf" class="cz gz ib ic id ie" target="_blank" rel="noopener nofollow">Playing Atari with Deep Reinforcement Learning</a> (2013), which presents the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning.</li>
</ul>
<p><span>The field of AI is broad and has been around for a long time. Deep learning is a subset of the field of machine learning, which is a subfield of AI. The facets that differentiate deep learning networks in general from “canonical” feed-forward multilayer networks are as follows:</span></p>
<ul>
<li><span>More neurons than previous networks</span></li>
<li><span>More complex ways of connecting layers</span></li>
<li><span>“Cambrian explosion” of computing power to train</span></li>
<li><span>Automatic feature extraction</span></li>
</ul>
<p><span>When I say “more neurons”, I mean that the neuron count has risen over the years to express more complex models. Layers also have evolved from each layer being fully connected in multilayer networks to locally connected patches of neurons between layers in Convolutional Neural Networks and recurrent connections to the same neuron in Recurrent Neural Networks (in addition to the connections from the previous layer).</span></p>
<p><span>Deep learning then can be defined as neural networks with a large number of parameters and layers in one of four fundamental network architectures:</span></p>
<ul>
<li><span>Unsupervised Pre-trained Networks</span></li>
<li><span>Convolutional Neural Networks</span></li>
<li><span>Recurrent Neural Networks</span></li>
<li><span>Recursive Neural Networks</span></li>
</ul>
<p><span>In this post, I am mainly interested in the latter 3 architectures. A Convolutional Neural Network is basically a standard neural network that has been extended across space using shared weights. A CNN is designed to recognize images: the convolutions inside it detect the edges of objects in the image. A Recurrent Neural Network is basically a standard neural network that has been extended across time by having edges which feed into the next time step instead of into the next layer in the same time step. An RNN is designed to recognize sequences, for example a speech signal or a text; the cycles inside it imply the presence of short-term memory in the net. A Recursive Neural Network is more like a hierarchical network, where there is really no time aspect to the input sequence but the input has to be processed hierarchically, in a tree fashion. The 10 methods below can be applied to all of these architectures.</span></p>
<p><span style="font-size: 12pt;"><strong>10 Deep Learning Methods</strong></span></p>
<ul>
<li>Back-Propagation</li>
<li>Stochastic Gradient Descent</li>
<li>Learning Rate Decay</li>
<li>Dropout</li>
<li>Max Pooling</li>
<li>Batch Normalization</li>
<li>Long Short-Term Memory</li>
<li>Skip-gram</li>
<li>Continuous Bag Of Words</li>
<li>Transfer Learning</li>
</ul>
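<p><span>As a tiny illustration of the first three methods on the list working together, here is a sketch of stochastic gradient descent with learning rate decay fitting a one-parameter linear model (the data and decay schedule are made up for the example):</span></p>

```python
import random

# Toy data on the line y = 3x, so SGD should recover w close to 3
data = [(x, 3.0 * x) for x in range(1, 6)]

w, lr = 0.0, 0.01
random.seed(0)
for epoch in range(100):
    random.shuffle(data)            # "stochastic": visit samples in random order
    for x, y in data:
        grad = 2 * (w * x - y) * x  # gradient of the squared error for one sample
        w -= lr * grad              # gradient descent step
    lr *= 0.95                      # learning rate decay after each epoch

print(round(w, 3))  # 3.0
```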
<p><i>To read the full article, with illustrations, click <a href="https://medium.com/cracking-the-data-science-interview/the-10-deep-learning-methods-ai-practitioners-need-to-apply-885259f402c1" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>Scikit-learn Classification Algorithmstag:www.datasciencecentral.com,2020-02-01:6448529:BlogPost:9275232020-02-01T18:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by Matthew Mayo</i><span><i>.</i></span></p>
<p></p>
<p><span>Scikit-learn is the de facto standard machine learning library of the Python ecosystem. As described on its official website, Scikit-learn is:</span></p>
<ul>
<li><span>Simple and efficient tools for data mining and data analysis</span></li>
<li><span>Accessible to everybody, and reusable in various contexts</span></li>
<li><span>Built on NumPy, SciPy, and matplotlib</span></li>
<li><span>Open source, commercially usable - BSD license</span><span><span class="Apple-converted-space"> </span></span></li>
</ul>
<p><span><span class="Apple-converted-space"><a href="https://github.com/mmmayo13/scikit-learn-classifiers/raw/e5a18300ce41fadde562cf061316eb131b8e5147/img/plot_classifier_comparison.jpg" target="_blank" rel="noopener"><img src="https://github.com/mmmayo13/scikit-learn-classifiers/raw/e5a18300ce41fadde562cf061316eb131b8e5147/img/plot_classifier_comparison.jpg?profile=RESIZE_710x" class="align-full"/></a></span></span><span>This tutorial is meant to serve as a demonstration of several machine learning classifiers, and is inspired by the following excellent works:</span></p>
<ul>
<li><span>Randal Olson's An Example Machine Learning Notebook</span></li>
<li><span>Analytics Vidhya's Common Machine Learning Algorithms Cheat Sheet</span></li>
<li><span>Scikit-learn's official Cross-validation Documentation</span></li>
<li><span>Scikit-learn's official Iris Dataset Documentation</span></li>
<li><span>Likely includes influence of the various referenced tutorials included in this KDnuggets Python Machine Learning article I recently wrote</span></li>
</ul>
<p><span>We will use the well-known Iris and Digits datasets to build models with the following machine learning classification algorithms:</span></p>
<ul>
<li><span>Logistic Regression</span></li>
<li><span>Decision Tree</span></li>
<li><span>Support Vector Machine</span></li>
<li><span>Naive Bayes</span></li>
<li><span>k-nearest Neighbors</span></li>
<li><span>Random Forests</span></li>
</ul>
<p><span>We also use different strategies for evaluating models:</span></p>
<ul>
<li><span>Separate testing and training datasets</span></li>
<li><span>k-fold Cross-validation</span></li>
</ul>
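<p><span>A minimal sketch of the workflow just described, using two of the listed classifiers and both evaluation strategies (the split ratio, random seeds, and parameter choices are illustrative):</span></p>

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Strategy 1: separate training and testing datasets
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
logreg = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("hold-out accuracy:", logreg.score(X_te, y_te))

# Strategy 2: k-fold cross-validation (here k=5)
tree = DecisionTreeClassifier(random_state=0)
scores = cross_val_score(tree, X, y, cv=5)
print("5-fold CV accuracy:", scores.mean())
```

Any of the other listed algorithms (SVM, Naive Bayes, k-NN, Random Forests) slots into the same two evaluation patterns.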
<p><span>Some simple data investigation methods and tools will be undertaken as well, including:</span></p>
<ul>
<li><span>Plotting data with Matplotlib</span></li>
<li><span>Building and manipulating data via Pandas dataframes</span></li>
<li><span>Constructing and operating on multi-dimensional arrays and matrices with Numpy</span></li>
</ul>
<p><span>This tutorial is brief, non-verbose, and to the point. Please alert me if you find inaccuracies. Also, if you find it at all useful, and believe it to be worth doing so, please feel free to share it far and wide.</span></p>
<p><i>To read the tutorial, with the demonstration, click <a href="https://github.com/mmmayo13/scikit-learn-classifiers/blob/master/sklearn-classifiers-tutorial.ipynb" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>
<p></p>Where have you seen Machine Learning in your everyday life?tag:www.datasciencecentral.com,2020-01-03:6448529:BlogPost:9198452020-01-03T16:30:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article is on the blog</i> <span><i>artificialintelligenceml.</i></span></p>
<p><span><i><a href="https://storage.ning.com/topology/rest/1.0/file/get/3977666819?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3977666819?profile=RESIZE_710x" class="align-center"/></a></i></span></p>
<p>This article features the following applications, one of them is pictured above (recommendation engine).</p>
<ul>
<li>Google’s AI-Powered Predictions</li>
<li>Ridesharing Apps Like Uber and Lyft</li>
<li>Commercial Flights Use an AI Autopilot</li>
<li>Spam Filters</li>
<li>Smart Email Categorization</li>
<li>Plagiarism Checkers</li>
<li>Robo-readers to Grade Essays</li>
<li>Mobile Check Deposits</li>
<li>Fraud Prevention</li>
<li>Credit Decisions</li>
<li>Image Recognition on Social Networks</li>
<li>Search Optimization for Online Catalogs</li>
<li>Recommendations Engines (Yelp, Amazon)</li>
<li>Voice to Text Translation on Mobile Phones</li>
</ul>
<p><em>Read the full article with illustrations for each application, <a href="https://artificialintelligenceml.blogspot.com/2017/11/where-have-you-seen-machine-learning-in.html" target="_blank" rel="noopener">here</a>. </em></p>
<p></p>
<p></p>Everything You Need to Know About Google Brain’s TensorFlowtag:www.datasciencecentral.com,2020-01-03:6448529:BlogPost:9197462020-01-03T16:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://beebom.com/author/thurana/" target="_blank" rel="noopener">Jeffry Thurana</a></i><span><i>.</i></span></p>
<p><span>Anybody who has tried Google Photos would agree that this free photo storage and management service from Google is smart. It packs in various smart features like advanced search, the ability to categorize your pictures by location and date, automatic creation of albums and videos based on similarities, and a walk down memory lane showing you photos taken on the same day several years ago. There are many things Google Photos can do that just a few years ago would have been impossible for a machine. Google Photos is one of the many “smart” services from Google that use a machine learning technology called TensorFlow. The word learning indicates that the technology will get smarter over time, to a point our current knowledge cannot imagine. But what is TensorFlow? How can a machine learn? What can you do with it? Let’s find out.</span><span><a href="https://beebom.com/wp-content/uploads/2016/07/Tensor-bb-tensor-flowing.jpg" target="_blank" rel="noopener"><img src="https://beebom.com/wp-content/uploads/2016/07/Tensor-bb-tensor-flowing.jpg?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p><strong>Content of the article</strong></p>
<p>1. What is TensorFlow?</p>
<p>2. How Does TensorFlow Work?</p>
<p>3. Tensor Processing Unit (TPU) ASIC chip</p>
<p>4. The Future of TensorFlow</p>
<p>5. Applications of TensorFlow</p>
<ul>
<li>More on Image Analysis</li>
<li>Speech Recognition</li>
<li>Dynamic Translation</li>
<li>AlphaGo</li>
<li>Magenta Project</li>
</ul>
<p><i>To read the whole article, click <a href="https://beebom.com/google-brains-tensorflow/" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>
<p></p>Essentials of Deep Learning : Introduction to Long Short Term Memorytag:www.datasciencecentral.com,2020-01-01:6448529:BlogPost:9191612020-01-01T13:30:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="https://www.analyticsvidhya.com/blog/author/pranj52/" target="_blank" rel="noopener">Pranjal Srivastava</a></i><span><i>.</i></span></p>
<p><span>Sequence prediction problems have been around for a long time. They are considered as one of the hardest problems to solve in the data science industry. These include a wide range of problems; from predicting sales to finding patterns in stock markets’ data, from understanding movie plots to recognizing your way of speech, from language translations to predicting your next word on your iPhone’s keyboard.</span></p>
<p><span>With the recent breakthroughs in data science, it has been found that for almost all of these sequence prediction problems, Long Short Term Memory networks, a.k.a. LSTMs, are the most effective solution.</span></p>
<p><span>LSTMs have an edge over conventional feed-forward neural networks and RNNs in many ways because of their ability to selectively remember patterns for long durations of time. The purpose of this article is to explain LSTM and enable you to use it in real-life problems. Let’s have a look!</span></p>
<p><span>Note: To go through the article, you must have basic knowledge of neural networks and how Keras (a deep learning library) works.</span></p>
<p></p>
<p><span><a href="https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2017/12/10131302/13-768x295.png" target="_blank" rel="noopener"><img src="https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2017/12/10131302/13-768x295.png?profile=RESIZE_710x" class="align-full"/></a></span></p>
<p></p>
<p><strong>Table of Contents</strong></p>
<ol>
<li><span>Flashback: A look into Recurrent Neural Networks (RNN)</span></li>
<li><span>Limitations of RNNs</span></li>
<li><span>Improvement over RNN: Long Short Term Memory (LSTM)</span></li>
<li><span>Architecture of LSTM</span><ol>
<li><span>Forget Gate</span></li>
<li><span>Input Gate</span></li>
<li><span>Output Gate</span></li>
</ol>
</li>
<li><span>Text generation using LSTMs.</span></li>
</ol>
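<p><i>To make the gate structure listed above concrete, here is an illustrative single LSTM cell step in NumPy. This is a sketch, not code from the article: the weights are random placeholders, and a real model (e.g. in Keras) would learn them from data.</i></p>

```python
import numpy as np

# One LSTM cell step: the forget, input, and output gates decide what to
# erase from, write to, and expose from the cell state. Weights are random
# placeholders (hypothetical), not trained parameters.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = {g: rng.normal(scale=0.1, size=(n_hid, n_in + n_hid)) for g in "fiog"}
b = {g: np.zeros(n_hid) for g in "fiog"}

def lstm_step(x, h, c):
    z = np.concatenate([x, h])          # input and previous hidden state
    f = sigmoid(W["f"] @ z + b["f"])    # forget gate: what to erase
    i = sigmoid(W["i"] @ z + b["i"])    # input gate: what to write
    o = sigmoid(W["o"] @ z + b["o"])    # output gate: what to expose
    g = np.tanh(W["g"] @ z + b["g"])    # candidate cell update
    c_new = f * c + i * g               # selectively keep old + add new
    h_new = o * np.tanh(c_new)          # filtered view of the cell state
    return h_new, c_new

h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.normal(size=n_in), h, c)
```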
<p><i>To read the rest of the article, with illustrations, click <a href="https://www.analyticsvidhya.com/blog/2017/12/fundamentals-of-deep-learning-introduction-to-lstm/" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>
<p></p>A Majority of Data Scientists Lack Competency in Advanced Machine Learning Areas and Techniquestag:www.datasciencecentral.com,2019-12-23:6448529:BlogPost:9172352019-12-23T22:00:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><i>This article was written by <a href="http://businessoverbroadway.com/author/bobhayes/" target="_blank" rel="noopener">Bob Hayes</a></i><span><i>.</i></span></p>
<p><span>Data science requires the effective application of skills in a variety of machine learning areas and techniques. A recent survey by Kaggle, however, revealed that a limited number of data professionals possess competency in advanced machine learning skills. About half of data professionals said they were competent in supervised machine learning (49%) and logistic regression (53%). Deep learning techniques were among the ML skills with the lowest competency rates: Neural Networks – GAN (7%); NN – RNNs (15%) and NN – CNNs (26%).</span></p>
<p><span>A majority of enterprises (80%) have some form of artificial intelligence (machine learning, deep learning) in production today. Additionally, about a third of enterprises are planning on expanding their AI efforts over the next 36 months. But who will lead these data science projects? Who will do the work? Some researchers suggest there is a lack of AI talent needed to fill those roles. Tencent estimates there are only 300,000 AI researchers and practitioners worldwide. ElementAI estimates there are 22,000 PhD-level researchers working in AI.</span></p>
<p><span>Kaggle conducted a survey in August 2017 of over 16,000 data professionals (2017 State of Data Science and Machine Learning). The survey asked respondents about their competence across a variety of AI-related approaches and techniques. Looking at different AI skills will give us a more detailed look into the specific AI skills that are driving this talent gap.</span></p>
<p><span><a href="https://storage.ning.com/topology/rest/1.0/file/get/3793138354?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3793138354?profile=RESIZE_710x" class="align-center"/></a></span></p>
<p></p>
<p><span style="font-size: 14pt;"><b>Competency in Machine Learning Areas</b></span></p>
<p><span>All respondents (employed or not) were given a list of 13 machine learning areas and asked to indicate the areas in which they consider themselves competent. The top 10 machine learning areas in which data professionals are competent were:</span></p>
<ul>
<li> <span>Supervised Machine Learning (49%)</span></li>
<li> <span>Unsupervised Learning (26%)</span></li>
<li> <span>Time Series (25%)</span></li>
<li> <span>Natural Language Processing (19%)</span></li>
<li> <span>Outlier detection (16%)</span></li>
<li> <span>Computer Vision (15%)</span></li>
<li> <span>Recommendation Engines (14%)</span></li>
<li> <span>Survival Analysis (8%)</span></li>
<li> <span>Reinforcement Learning (6%)</span></li>
<li> <span>Adversarial Learning (4%)</span></li>
</ul>
<p><span style="font-size: 14pt;"><b>Competency in Machine Learning Techniques</b></span></p>
<p><span>The survey also asked all data professionals, employed or not, about their competency in 13 machine learning techniques ("In which areas of machine learning do you consider yourself competent? (Select all that apply)"). The top 10 machine learning techniques in which data pros reported competence were:</span></p>
<ul>
<li> <span>Logistic Regression (54%)</span></li>
<li> <span>Decision Trees – Random Forests (43%)</span></li>
<li> <span>Support Vector Machines (32%)</span></li>
<li> <span>Decision Trees – Gradient Boosted Machines (31%)</span></li>
<li> <span>Bayesian Techniques (27%)</span></li>
<li> <span>Neural Networks – CNNs (26%)</span></li>
<li> <span>Ensemble Methods (22%)</span></li>
<li> <span>Gradient Boosting (17%)</span></li>
<li> <span>Neural Networks – RNNs (15%)</span></li>
<li> <span>Hidden Markov Models HMMs (9%)</span></li>
</ul>
<p><i>To read the whole article, with illustrations, click <a href="http://businessoverbroadway.com/2018/02/18/a-majority-of-data-scientists-lack-competency-in-advanced-machine-learning-areas-and-techniques/" target="_blank" rel="noopener">here</a>.</i></p>
<p></p>
<p></p>Regression analysis using Pythontag:www.datasciencecentral.com,2019-12-08:6448529:BlogPost:9133752019-12-08T15:30:00.000ZAndrea Manero-Bastinhttps://www.datasciencecentral.com/profile/AndreaManeroBastin
<p><em><span>This article was written by Stuart Reid.</span></em> </p>
<p><em> </em></p>
<p>This tutorial covers regression analysis using the Python StatsModels package with Quandl integration. For motivational purposes, here is what we are working towards: a regression analysis program which receives multiple data-set names from Quandl.com, automatically downloads the data, analyses it, and plots the results in a new window.</p>
<p></p>
<p><a href="http://www.turingfinance.com/wp-content/uploads/2014/07/Generlized-Linear-Mixed-Model.png" target="_blank" rel="noopener"><img src="http://www.turingfinance.com/wp-content/uploads/2014/07/Generlized-Linear-Mixed-Model.png?profile=RESIZE_710x" class="align-full"/></a></p>
<p><strong> </strong></p>
<p><span style="font-size: 14pt;"><strong>TYPES OF REGRESSION ANALYSIS</strong></span></p>
<p><strong>Linear regression</strong> analysis fits a straight line to some data in order to capture the linear relationship between that data. The regression line is constructed by optimizing the parameters of the straight line function such that the line best fits a sample of (x, y) observations where y is a variable dependent on the value of x. Regression analysis is used extensively in economics, risk management, and trading. One cool application of regression analysis is in calibrating certain stochastic process models such as the Ornstein Uhlenbeck stochastic process. </p>
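<p><em>The parameter optimization described above can be sketched with ordinary least squares. This is an illustrative example on made-up data using NumPy rather than the StatsModels package the tutorial itself uses:</em></p>

```python
import numpy as np

# Fit y = m*x + c by ordinary least squares, i.e. choose (m, c) to
# minimize the sum of squared residuals over the (x, y) observations.
def fit_line(x, y):
    A = np.column_stack([x, np.ones_like(x)])  # design matrix [x | 1]
    (m, c), *_ = np.linalg.lstsq(A, y, rcond=None)
    return m, c

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0          # noiseless line, so the fit recovers m=2, c=1
m, c = fit_line(x, y)
```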
<p><strong>Non-linear regression</strong> analysis uses a curved function, usually a polynomial, to capture the non-linear relationship between the two variables. The regression is often constructed by optimizing the parameters of a higher-order polynomial such that the line best fits a sample of (x, y) observations. In the article, Ten Misconceptions about Neural Networks in Finance and Trading, it is shown that a neural network is essentially approximating a multiple non-linear regression function between the inputs into the neural network and the outputs.</p>
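<p><em>A minimal sketch of the polynomial case, again on made-up data with NumPy rather than StatsModels: a degree-2 polynomial captures a curved relationship that a straight line would under-fit.</em></p>

```python
import numpy as np

# Fit a quadratic a*x^2 + b*x + c by least squares via np.polyfit.
x = np.linspace(-2, 2, 9)
y = 1.5 * x**2 - 0.5 * x + 2.0     # noiseless quadratic relationship
a, b, c = np.polyfit(x, y, deg=2)  # coefficients, highest degree first
```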
<p>The case for linear vs. non-linear regression analysis in finance remains open. The issue with linear models is that they often under-fit and may impose assumptions on the variables; the main issue with non-linear models is that they often over-fit. Training and data-preparation techniques can be used to minimize over-fitting.</p>
<p>A multiple linear regression analysis is used for predicting the values of a set of dependent variables, Y, using two or more sets of independent variables, e.g. X1, X2, ..., Xn. For example, you could try to forecast share prices using one fundamental indicator like the PE ratio, or you could use multiple indicators together, like the PE, DY, and DE ratios and the share's EPS. Interestingly, there is almost no difference between a multiple linear regression and a perceptron (also known as an artificial neuron, the building block of neural networks). Both are calculated as the weighted sum of the input vector plus some constant or bias which is used to shift the function. The only difference is that the input signal into the perceptron is fed into an activation function, which is often non-linear.</p>
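<p><em>The comparison above can be shown in a few lines. This is a sketch with hypothetical weights and inputs, not fitted values: both models compute a weighted sum plus a bias, and the perceptron merely passes that sum through an activation.</em></p>

```python
import numpy as np

w = np.array([0.5, -1.0, 2.0])   # one weight per independent variable
b = 0.1                          # constant / bias term
x = np.array([1.0, 2.0, 3.0])    # one observation of X1, X2, X3

regression_output = w @ x + b            # multiple linear regression
perceptron_output = np.tanh(w @ x + b)   # same sum, non-linear activation
```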
<p>If the objective of the multiple linear regression is to classify patterns between different classes rather than regress a quantity, another approach is to make use of clustering algorithms. Clustering is particularly useful when the data contains multiple classes and more than one linear relationship. Once the data set has been partitioned, further regression analysis can be performed on each class. Some useful clustering algorithms are the K-Means clustering algorithm and one of my favourite computational intelligence algorithms, Ant Colony Optimization.</p>
<p>The image below shows how the K-Means clustering algorithm can be used to partition data into clusters (classes). Regression can then be performed on each class individually.</p>
<p><a href="https://storage.ning.com/topology/rest/1.0/file/get/3761148739?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/3761148739?profile=RESIZE_710x" class="align-center"/></a></p>
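<p><em>The cluster-then-regress idea can be sketched as follows. This is an illustrative NumPy example on synthetic data (not from the tutorial): two groups of points follow different linear relationships, a few K-Means iterations separate them, and a line is then fitted to each cluster individually.</em></p>

```python
import numpy as np

# Synthetic data: one group follows y = 2x, the other y = -x + 5.
x1 = np.linspace(0, 1, 20)
x2 = np.linspace(0, 1, 20)
X = np.column_stack([
    np.concatenate([x1, x2]),
    np.concatenate([2 * x1, -1 * x2 + 5]),
])

centroids = X[[0, 20]].copy()   # deterministic init: one seed per group
for _ in range(10):
    # Assign step: each point goes to its nearest centroid.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Update step: move each centroid to the mean of its cluster.
    centroids = np.array([X[labels == k].mean(axis=0) for k in range(2)])

# Fit a separate regression line to each cluster.
slopes = [np.polyfit(X[labels == k, 0], X[labels == k, 1], 1)[0]
          for k in range(2)]
```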
<p><strong>Logistic Regression Analysis</strong> - linear regressions deal with continuous-valued series, whereas a logistic regression deals with categorical (discrete) values. Discrete values are difficult to work with because they are non-differentiable, so gradient-based optimization techniques don't apply directly.</p>
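<p><em>A minimal sketch on toy data, using NumPy rather than StatsModels: the sigmoid turns the linear predictor into a probability, which is smooth and differentiable even though the labels themselves are discrete, so gradient descent can be applied to the log-loss.</em></p>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0])    # binary class labels

w, b = 0.0, 0.0
for _ in range(2000):                 # gradient descent on the log-loss
    p = sigmoid(w * x + b)            # predicted probabilities
    w -= 0.5 * np.mean((p - y) * x)   # gradient w.r.t. the weight
    b -= 0.5 * np.mean(p - y)         # gradient w.r.t. the bias

preds = (sigmoid(w * x + b) > 0.5).astype(int)
```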
<p><strong>Stepwise Regression Analysis</strong> - this is the name given to the iterative construction of a multiple regression model. It works by automatically selecting statistically significant independent variables to include in the regression analysis. This is achieved by either growing or pruning the set of variables included in the regression analysis.</p>
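<p><em>Here is a simplified sketch of the "growing" (forward selection) variant on synthetic data: at each step, greedily add the predictor that most reduces the residual sum of squares. A real implementation would use a significance test such as an F-test rather than raw RSS.</em></p>

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))                 # four candidate predictors
y = 3.0 * X[:, 2] + 0.1 * rng.normal(size=50)  # only column 2 drives y

def rss(cols):
    # Residual sum of squares of an OLS fit on the chosen columns.
    A = np.column_stack([X[:, cols], np.ones(len(y))])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(((y - A @ beta) ** 2).sum())

selected, remaining = [], [0, 1, 2, 3]
for _ in range(2):  # grow the model by two variables
    best = min(remaining, key=lambda j: rss(selected + [j]))
    selected.append(best)
    remaining.remove(best)
```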
<p>Many other regression analyses exist, and mixed models in particular are worth mentioning here. A mixed model is an extension of the generalized linear model in which the linear predictor contains random effects in addition to the usual fixed effects. A decision tree can be used to help determine the right components for such a model.</p>
<p><em> </em></p>
<p><em>To read the whole article, with illustrations, click <a href="http://www.turingfinance.com/regression-analysis-using-python-statsmodels-and-quandl/" target="_blank" rel="noopener">here</a>.</em></p>
<p> </p>