<p class="post-meta">New Algorithm For Density Estimation and Noise Reduction by Rohan Kotwani (Data Science Central, 2020-04-09)</p>
<p><span style="font-size: 2em;">KernelML - Hierarchical Density Factorization</span></p>
<br/>
<p class="post-meta">The purpose, problem statement, and potential applications came from <a href="https://www.datasciencecentral.com/profiles/blogs/decomposition-of-statistical-distributions-using-mixture-models-a">this</a> post on datasciencecentral.com. The goal is to approximate any multi-variate distribution using a weighted sum of kernels. Here, a kernel refers to a parameterized distribution. This method of using a decaying weighted sum of kernels to approximate a distribution is similar to a <a href="https://en.wikipedia.org/wiki/Taylor_series">Taylor series</a>, where a function can be approximated, around a point, using the function’s derivatives. KernelML is a particle optimizer that uses parameter constraints and sampling methods to minimize a customizable loss function. The package uses a Cythonized backend and parallelizes operations across multiple cores with Numba. KernelML is now available on the Anaconda cloud and PyPI (pip). Please see the KernelML <a href="https://github.com/Freedomtowin/kernelml/">extension</a> on the documentation page.</p>
<br/>
<br/>
<h1>Goals</h1>
<ul>
<li>Approximate any empirical distribution</li>
<li>Build a parameterized density estimator</li>
<li>Outlier detection and dataset noise reduction</li>
</ul>
<h1>My Approach</h1>
<p>The solution I came up with was incorporated into a Python package, KernelML. The example code can be found <a href="https://github.com/freedomtowin/kernelml/blob/master/kernelml-hierarchical-density-factorization.ipynb">here</a>.</p>
<p>My solution uses the following:</p>
<ol>
<li>Particle Swarm/Genetic Optimizer</li>
<li>Multi-Agent Approximation using IID Kernels</li>
<li>Reinforcement Learning</li>
</ol>
<h1>Particle Swarm/Genetic Optimizer</h1>
<p>Most kernels have hyper-parameters that control the mean and variation of the distribution. While these parameters are potentially differentiable, I decided against using a gradient-based method. The gradients for the variance parameters can potentially vanish, and constraining the variance makes the parameters non-differentiable. It makes sense to use a mixed integer or particle swarm strategy to optimize the kernels’ hyper-parameters. I decided to use a uniform distribution kernel because of its robustness to outliers in higher dimensions.</p>
<p>Over the past year, I’ve independently developed an optimization algorithm to solve non-linear, constrained optimization problems. It is by no means perfect, but building it from scratch allowed me to 1) make modifications based on the task and 2) better understand the problem I was trying to solve.</p>
<h1>Multi-Agent Approximation using IID Kernels</h1>
<p>My initial approach used a multi-agent strategy to simultaneously fit any multi-variate distribution. The agents, in this case, the kernels, were independent and identically distributed (IID). I made an algorithm, called density factorization, to fit an arbitrary number of agents to a distribution. The optimization approach and details can be found<span> </span><a href="https://freedomtowin.github.io/2019/02/19/KernelML-Density-Factorization.html">here</a>. The video below shows a frame-by-frame example for how the solution might look over the optimization procedure.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/W6Ey0i7ayBc?wmode=opaque" frameborder="0" allowfullscreen=""></iframe>
<p>This algorithm seemed to perform well on non-sparse, continuous distributions. One problem was that the algorithm used IID kernels, which is an issue when modeling skewed data: every kernel has the same 1/K weight, where K is the number of kernels. In theory, hundreds of kernels could be optimized at once, but this solution lacked efficiency and granularity.</p>
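<p>As a minimal illustration of the IID-kernel idea above (the box-shaped kernels and function names here are illustrative stand-ins, not KernelML's internals), a mixture of K uniform kernels with fixed 1/K weights might be evaluated like this:</p>

```python
import numpy as np

def iid_uniform_mixture_density(x, centers, widths):
    """Density estimate from K IID uniform (box) kernels, each weighted 1/K.

    x: (N, D) query points; centers: (K, D) kernel centers;
    widths: (K, D) half-widths of each box along each dimension.
    """
    K = centers.shape[0]
    density = np.zeros(x.shape[0])
    for c, w in zip(centers, widths):
        inside = np.all(np.abs(x - c) <= w, axis=1)  # is each point inside the box?
        volume = np.prod(2.0 * w)                    # box volume = product of side lengths
        density += inside / (K * volume)             # every kernel gets the same 1/K weight
    return density
```

<p>Because every kernel carries the same 1/K weight, a skewed region can only be represented by stacking many kernels on top of it, which is exactly the granularity problem described above.</p>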
<h1>Reinforcement Learning</h1>
<p>I decided to use a hierarchical, reinforcement style approach to fitting the empirical multi-variate distribution. The initial reward, R_0, was the empirical distribution, and the discounted reward, R_1, represented the data points not captured by the initial multi-agent algorithm at R_0. Equation (1) shows the update process for the reward.</p>
<img src="https://freedomtowin.github.io/hdf_images/output_0.png"/><br/>
<br/>
<br/>
<p>Here, p is the percentage of unassigned data points, R is the empirical distribution at step t, U is the empirical distribution of the unassigned data points, and lambda is the discount factor. The reward update is the product of p and lambda.</p>
<p>This works because, by definition, the space between data points increases as the density decreases. As data points in under-populated regions are clustered, the cluster sizes will increase to capture new data points. The reward update is less than the percentage of unassigned data points, which allows the denser regions to be represented multiple times before moving to the less dense regions.</p>
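<p>A minimal sketch of one reward update, assuming the form R_{t+1} = lambda &middot; p &middot; U_t implied by the description above (the exact form is given in equation (1); the function and argument names are hypothetical):</p>

```python
import numpy as np

def update_reward(assigned_mask, unassigned_density, discount=0.9):
    """One reward update, assuming the form R_{t+1} = lambda * p * U_t.

    assigned_mask: boolean array, True where a data point has been captured.
    unassigned_density: empirical distribution U of the unassigned points.
    discount: the lambda discount factor.
    """
    p = 1.0 - assigned_mask.mean()  # percentage of unassigned data points
    return discount * p * unassigned_density
```

<p>Since the update is scaled by both p and lambda, it shrinks faster than the unassigned fraction alone, which is what lets dense regions be revisited before sparse ones.</p>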
<p>The algorithm uses pre-computed rasters to approximate the empirical density which means that the computational complexity depends on the number of dimensions, not the data points. The example below shows how the estimated and empirical distribution might look for the 2-D case.</p>
<img src="https://freedomtowin.github.io/hdf_images/output_1.png"/><br/>
<br/>
<br/>
<p>After fitting the initial density factorization algorithm, the reward is updated by some discount factor to improve the reward for the data points that have not been captured. The plot below shows how the empirical distribution might look after a few updates.</p>
<img src="https://freedomtowin.github.io/hdf_images/output_2.png"/><br/>
<br/>
<br/>
<p>The number of samples in each cluster must be greater than the minimum-leaf-sample parameter. This parameter prevents clusters from accidentally modeling outliers by chance, which is mostly an issue in higher-dimensional space due to the curse of dimensionality. If a cluster does not meet this constraint, it is pruned from the cluster solution. This process continues until 1) a new cluster solution does not capture new data points or 2) more than 99% of the data points have been captured (this threshold is also adjustable).</p>
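<p>A hedged sketch of the pruning step and the two stopping conditions described above (the function and parameter names are illustrative, not KernelML's API):</p>

```python
import numpy as np

def prune_and_check(cluster_sizes, captured_frac, prev_captured_frac,
                    min_leaf_samples=10, stop_threshold=0.99):
    """Prune undersized clusters and evaluate the two stopping conditions.

    Returns (keep_mask, should_stop).  The names are illustrative stand-ins
    for the minimum-leaf-sample constraint described above.
    """
    sizes = np.asarray(cluster_sizes)
    keep = sizes >= min_leaf_samples                     # drop clusters that only model outliers
    no_new_points = captured_frac <= prev_captured_frac  # 1) no new data points captured
    enough_captured = captured_frac > stop_threshold     # 2) >99% of points captured
    return keep, bool(no_new_points or enough_captured)
```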
<h1>Example</h1>
<p>As the input space increases in dimensionality, the Euclidean space between data points increases. For example, for an input space that contains uniform random variables, the space between data points increases by a factor of sqrt(D), where D is the number of dimensions.</p>
<p>To create a presentable example, the curse of dimensionality will be simulated in 2-D. This can be achieved by creating an under-sampled (sparse) training dataset and an over-sampled validation dataset. Two of the clusters were moved closer together to make cluster separation more difficult.</p>
<p>The density can be estimated by counting the number of clusters assigned to a data point. The solution is parameterized so it can be applied to the validation dataset after training. The plot below shows the histogram of the density estimate after running the model on the training dataset.</p>
<img src="https://freedomtowin.github.io/hdf_images/output_3.png"/><br/>
<br/>
<br/>
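<p>The density-by-counting idea above can be sketched as follows; because the clusters are parameterized (centers and sizes), the same estimator can be applied to the validation set after training. The box-shaped clusters and names here are illustrative assumptions:</p>

```python
import numpy as np

def density_estimate(x, centers, widths):
    """Density score per point = number of box-shaped clusters it falls inside."""
    counts = np.zeros(x.shape[0], dtype=int)
    for c, w in zip(centers, widths):
        # add 1 for every cluster box that contains the point
        counts += np.all(np.abs(x - c) <= w, axis=1)
    return counts
```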
<p>The density can be used to visualize the denser areas of the data. The green rings show the true distributions’ two standard deviation threshold. The plot below visualizes the density for the training dataset.</p>
<img src="https://freedomtowin.github.io/hdf_images/output_4.png"/><br/>
<br/>
<br/>
<p>Outliers can be defined by a percentile, i.e., 5th, 10th, etc., of the density estimate. The plot below shows the outliers defined by the 10th percentile. The green rings show the true distributions’ two standard deviation threshold.</p>
<img src="https://freedomtowin.github.io/hdf_images/output_5.png"/><br/>
<br/>
<br/>
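<p>Flagging outliers by a percentile of the density estimate, as described above, is a one-liner with NumPy (the function name is illustrative):</p>

```python
import numpy as np

def flag_outliers(density, percentile=10):
    """Flag points whose density estimate falls at or below the given percentile."""
    threshold = np.percentile(density, percentile)
    return density <= threshold
```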
<p>The plot below shows the histogram of the density estimate for the validation dataset.</p>
<img src="https://freedomtowin.github.io/hdf_images/output_6.png"/><br/>
<br/>
<br/>
<p>The plot below visualizes the density for the validation dataset. The green rings show the true distributions’ two standard deviation threshold.</p>
<img src="https://freedomtowin.github.io/hdf_images/output_7.png"/><br/>
<br/>
<br/>
<p>The outliers, defined by the 10th percentile, are visualized below. The green rings show the true distributions’ two standard deviation threshold.</p>
<img src="https://freedomtowin.github.io/hdf_images/output_8.png"/><br/>
<br/>
<br/>
<h1>Conclusion</h1>
<p>This particular use case focused on outlier detection. However, the algorithm also provides cluster assignments and density estimates for each data point. Other outlier detection methods, i.e., local outlier factor (LOF), can produce similar results in terms of outlier detection. Local outlier factor depends on the number of nearest neighbors and the contamination parameter. While it is easy to tune LOF’s parameters in 2-D, it is not so easy in multiple dimensions. Hierarchical density factorization provides a robust method to fit multi-variate distributions without the need for extensive hyper-parameter tuning. While the algorithm does not depend on the number of data points, it is still relatively slow. Many improvements could be made to its efficiency and speed. The example notebook includes a comparison to LOF and a multivariate example using the Pokemon dataset.</p>
<p></p>
<p><em>Originally posted <a href="https://freedomtowin.github.io/2020/04/04/KernelML-Hierarchical-Density-Factorization.html" target="_blank" rel="noopener">here</a>.</em></p>
<br/>
<p></p>
<p></p>
<p></p>
<p class="post-meta">High Density Region Estimation with KernelML by Rohan Kotwani (Data Science Central, 2019-01-04)</p>
<p>Data scientists and predictive modelers often use 1-D and 2-D aggregate statistics for exploratory analysis, data cleaning, and feature creation. Higher dimensional aggregations, i.e., 3 dimensional and above, are more difficult to visualize and understand. High density regions are one example of these N-dimensional statistics. High density regions can be useful for summarizing common characteristics across multiple variables. Another use case is to validate a forecast prediction’s plausibility by exploring the densities associated with the forecast. Other machine learning approaches such as clustering and kernel density estimation are similar to finding high density regions, but these methods are different in a few important ways. It is worth noting why these methods, while useful, are not exactly designed for the purpose of finding these regions. The goal is to use KernelML to efficiently find the regions of highest density for an N-dimensional dataset.</p>
<p><a href="https://storage.ning.com/topology/rest/1.0/file/get/677411908?profile=original" target="_blank" rel="noopener"><img src="https://storage.ning.com/topology/rest/1.0/file/get/677411908?profile=original" class="align-center"/></a></p>
<p><em>My approach to developing this algorithm was to find a set of, common sense, constraints to construct the loss metric.</em></p>
<p>The high density region estimator, HDRE, algorithm uses N multivariate uniform distributions to cluster the data. Uniform distributions are less sensitive to outliers than normal distributions, and these distributions truncate low correlation across the vertical and horizontal axes while keeping high correlations along the diagonal axes. The clusters are constrained to a shared variance across all clusters and equal variance across all dimensions. The data should be normalized to allow the clusters to scale properly across each dimension.</p>
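<p>The shared-variance constraint above can be sketched as a membership test with a single scalar box size for every cluster and every dimension (a minimal illustration; the names are assumptions, not HDRE's implementation):</p>

```python
import numpy as np

def hdre_memberships(x, centers, half_width):
    """Membership matrix for uniform box clusters with one shared size.

    A single scalar half_width enforces shared variance across clusters and
    equal variance across dimensions; x is assumed to be normalized.
    membership[i, k] is True when point i lies inside box k.
    """
    diffs = np.abs(x[:, None, :] - centers[None, :, :])  # (N, K, D) axis-wise distances
    return np.all(diffs <= half_width, axis=2)
```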
<p>See more <a href="https://towardsdatascience.com/high-density-region-estimation-with-kernelml-2cd453192e9b" target="_blank" rel="noopener">here</a>.</p>
<p> </p>
<p class="post-meta">Generalized Machine Learning - KernelML - Simple ML to train Complex ML by Rohan Kotwani (Data Science Central, 2018-05-04)</p>
<p class="graf graf--p">I recently created a ‘particle optimizer’ and published a pip python package called kernelml. The motivation for making this algorithm was to give analysts and data scientists a generalized machine learning algorithm for complex loss functions and non-linear coefficients. The optimizer uses a combination of simple machine learning and probabilistic simulations to search for optimal parameters using a loss function, input and output matrices, and (optionally) a random sampler. I am currently working on more features and hope to eventually make the project open source.</p>
<p><span style="font-size: 12pt;"><strong>Example use case:<a href="http://storage.ning.com/topology/rest/1.0/file/get/2808358479?profile=original" target="_self"><br/></a></strong></span></p>
<p>Let's take the problem of clustering longitude and latitude coordinates. Clustering methods such as K-means use Euclidean distances to compare observations. However, the Euclidean distances between longitude and latitude data points do not map directly to Haversine distance. That means if you normalize the coordinates between 0 and 1, the distances won't be accurately represented in the clustering model. A possible solution is to find <span>a projection for latitude and longitude so that the Haversine distance to the centroid of the data points is equal to that of the projected latitude and longitude in Euclidean space.</span></p>
<p><a href="http://storage.ning.com/topology/rest/1.0/file/get/2808359498?profile=original" target="_self"><img width="700" src="http://storage.ning.com/topology/rest/1.0/file/get/2808359498?profile=RESIZE_1024x1024" class="align-full"/></a></p>
<p> </p>
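<p>One way to phrase the projection objective described above as a loss that a particle optimizer could minimize (the Haversine formula is standard; the per-axis scaling and function names are my illustrative assumptions, not kernelml's API):</p>

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between points given in degrees."""
    lon1, lat1, lon2, lat2 = map(np.radians, (lon1, lat1, lon2, lat2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def projection_loss(scale, lon, lat, center):
    """Mean squared gap between each point's Haversine distance to the
    centroid and the Euclidean distance of its scaled coordinates."""
    hav = haversine(lon, lat, center[0], center[1])
    euc = np.sqrt((scale[0] * (lon - center[0])) ** 2
                  + (scale[1] * (lat - center[1])) ** 2)
    return np.mean((hav - euc) ** 2)
```

<p>Minimizing this loss over the scale parameters yields projected coordinates whose Euclidean distances to the center approximate the Haversine distances, which can then be normalized for clustering.</p>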
<p>This coordinate transformation lets you represent the Haversine distance, relative to the center, as a Euclidean distance, which can be scaled and used in a cluster solution.</p>
<p>Another, simpler problem is to find the optimal values of non-linear coefficients, i.e, power transformations in a least squares linear model. The reason for doing this is simple: integer power transformations rarely capture the best fitting transformation. By allowing the power transformation to be any real number, the accuracy will improve and the model will generalize to the validation data better. </p>
<p><span><strong><a href="http://storage.ning.com/topology/rest/1.0/file/get/2808358479?profile=original" target="_self"><img width="330" src="http://storage.ning.com/topology/rest/1.0/file/get/2808358479?profile=RESIZE_480x480" width="330" class="align-center"/></a></strong></span></p>
<p> To clarify what is meant by a power transformation, the formula for the model is provided above.</p>
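<p>A minimal sketch of fitting a real-valued exponent in the model above. Here the non-linear coefficient is found by a simple grid search rather than kernelml's particle optimizer, since for any fixed exponent the remaining fit is ordinary least squares (the function name and grid are assumptions):</p>

```python
import numpy as np

def fit_power_exponent(x, y, p_grid=None):
    """Grid search for the best real-valued exponent p in y = b0 + b1 * x**p.

    For any fixed p the model is linear, so each candidate is an ordinary
    least-squares fit; the outer search handles the non-linear coefficient.
    """
    if p_grid is None:
        p_grid = np.linspace(0.1, 4.0, 391)  # step of 0.01
    best_p, best_sse = None, np.inf
    for p in p_grid:
        X = np.column_stack([np.ones_like(x), x ** p])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sse = resid @ resid
        if sse < best_sse:
            best_p, best_sse = p, sse
    return best_p
```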
<p><span style="font-size: 12pt;"><strong>The algorithm:</strong></span></p>
<p>The idea behind kernelml is simple. Use the parameter update history in a machine learning model to decide how to update the next parameter set. Using a machine learning model as the backend causes a bias-variance problem: the parameter updates become more biased with each iteration. The problem can be solved by including a Monte Carlo simulation around the best recorded parameter set after each iteration.</p>
<p><span style="font-size: 12pt;"><strong>The issue of convergence:</strong></span></p>
<p>The model saves the best parameter and user-defined loss after each iteration. The model also records a history of all parameter updates. The question is how to use this data to define convergence. One possible solution is:</p>
<pre>convergence = (best_parameter - np.mean(param_by_iter[-10:,:], axis=0)) / np.std(param_by_iter[-10:,:], axis=0)

if np.all(np.abs(convergence) &lt; 1):
    print('converged')
    break</pre>
<p>The formula creates a Z-score using the last 10 parameter sets and the best parameter set. If the Z-scores for all the parameters are less than 1, then the algorithm can be said to have converged. This convergence solution works well when there is a theoretical best parameter set. This is a problem when using the algorithm for clustering. See the example below.</p>
<p style="text-align: center;"><a href="http://storage.ning.com/topology/rest/1.0/file/get/2808359519?profile=original" target="_self"><img src="http://storage.ning.com/topology/rest/1.0/file/get/2808359519?profile=original" width="607" class="align-full"/></a><span style="font-size: 8pt;"><strong> Figure 1: Clustering with kernelml, 2-D multivariate normal distribution (blue), cluster solution (other colors)</strong></span></p>
<p>We won't get into the quality of the cluster solution because it is clearly not representative of the data. The cluster solution minimized the difference between a multidimensional histogram and the average probability of 6 normal distributions, 3 for each axis. Here, the distributions can 'trade' data points pretty easily, which could increase convergence time. Why not just fit 3 multivariate normal distributions? There is a problem with simulating the distribution parameters because some parameters have constraints. The covariance matrix needs to be positive semi-definite, and its inverse needs to exist. The standard deviation in a normal distribution must be &gt;0. The solution used in this model incorporates the parameter constraints by making a custom simulation for each individual parameter. I'm still looking for a good formulation for how to efficiently simulate the covariance matrix of a multivariate normal distribution.</p>
<p><span style="font-size: 12pt;"><strong>Why use kernelml instead of Expectation Maximization?</strong></span></p>
<p>A non-normal distribution, such as Poisson, might not fit well with other dimensions in a multivariate normal cluster solution. In addition, as the number of dimensions increases, the probability that one cluster has a feature that only has non-zero values also increases. This poses a problem for the EM algorithm as it tries to update the covariance matrix. The covariance between the unique feature and other dimensions will be zero, or the probability that another cluster will accept an observation with this non-zero value is zero.</p>
<p><font size="3"><b>Probabilistic optimizer benefits:</b></font></p>
<p>The probabilistic simulation of parameters has some great benefits over fully parameterized models. First, regularization is included in the prior random simulation. For example, if the prior random simulation is between -1 and 1, it can be reasoned that the parameters will update with equal importance. In addition, while the algorithm is converging, each iteration produces a set of parameters that samples around the global, or local, minimum loss. There are two main benefits to this: 1) a confidence interval can be established for each parameter, and 2) the predicted output from each parameter set can be a useful feature in a unifying model.</p>
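<p>A sketch of the first benefit: an empirical confidence interval per parameter, computed from the recorded update history by taking percentiles over the final iterations (the function name and window size are illustrative assumptions):</p>

```python
import numpy as np

def parameter_intervals(param_by_iter, alpha=0.05, last_n=50):
    """Empirical confidence interval per parameter from the update history.

    param_by_iter: (iterations, n_params) array of recorded parameter sets;
    the last iterations sample around the (local or global) minimum loss.
    """
    tail = param_by_iter[-last_n:, :]
    lower = np.percentile(tail, 100 * alpha / 2, axis=0)
    upper = np.percentile(tail, 100 * (1 - alpha / 2), axis=0)
    return lower, upper
```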
<p>The code for the clustering example, other use cases, and documentation (still in progress) can be found <a href="https://github.com/Freedomtowin/kernelml" target="_blank" rel="noopener">on GitHub</a>.</p>
<p></p>
<p class="post-meta">How signal processing can be used to identify patterns in complex time series by Rohan Kotwani (Data Science Central, 2017-01-19)</p>
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell"><div class="text_cell_render border-box-sizing rendered_html" style="text-align: left;"><p>The trend and seasonality can be accounted for in a linear model by including sinusoidal components with a given frequency. However, finding the appropriate frequency for each sinusoidal component requires a little more digging. This post shows how to use fast Fourier transforms to find these frequencies.</p>
<p></p>
<p><strong>Defining the model:</strong></p>
<p><strong>y = P(t) + S(t) + T(t) + R(t)</strong></p>
<ul>
<li>P(t)~Polynomial component</li>
<li>S(t)~Seasonal component</li>
<li>T(t)~Trend component</li>
<li>R(t)~Residual error</li>
</ul>
<p></p>
<p>For the purposes of this post, we will only focus on the T(t) and S(t) components. The actual model fitting will be done in a separate post.</p>
<p>600 observations were used in the training set. The result was tested on the full dataset with 731 observations.</p>
<p></p>
<p><a href="http://storage.ning.com/topology/rest/1.0/file/get/2808326955?profile=original" target="_self"><img src="http://storage.ning.com/topology/rest/1.0/file/get/2808326955?profile=original" width="569" class="align-full"/></a></p>
<p></p>
<h2 id="Find-the-overall-trend:">Find the overall trend:</h2>
<p>I used an FFT transformation to visualize the magnitude of the frequency components in the time series. To be specific, the absolute magnitude is plotted.</p>
<p><a href="http://storage.ning.com/topology/rest/1.0/file/get/2808327054?profile=original" target="_self"><img src="http://storage.ning.com/topology/rest/1.0/file/get/2808327054?profile=original" width="537" class="align-full"/></a></p>
<p style="text-align: left;"><strong>Frequency Component, Magnitude</strong></p>
<pre>[ 1.41666667e-01   1.82239797e+05]
[ 1.43333333e-01   5.67160341e+05]
[ 2.83333333e-01   1.66899918e+05]
[ 2.85000000e-01   4.59942544e+05]
[ 2.86666667e-01   3.95441559e+05]
[ 4.28333333e-01   2.03492985e+05]
</pre>
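<p>Components like those listed above can be extracted with NumPy's real FFT; a minimal sketch, with frequencies in cycles per sample (the function name is my own):</p>

```python
import numpy as np

def dominant_frequencies(y, top_k=3):
    """(frequency, magnitude) pairs for the strongest FFT components.

    Frequencies are in cycles per sample; the zero-frequency (mean)
    component is dropped, as in the magnitude plot above.
    """
    mags = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), d=1.0)
    mags[0] = 0.0                         # ignore the DC / mean term
    idx = np.argsort(mags)[::-1][:top_k]  # strongest components first
    return sorted(zip(freqs[idx], mags[idx]))
```

<p>For a 600-sample series with a weekly cycle, the strongest component lands in the FFT bin nearest 1/7 &asymp; 0.143 cycles per sample, matching the table above.</p>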
<h2 id="Does-it-make-sense-to-reuse-frequencies-for-the-trend-and-seasonal-components?">Does it make sense to reuse frequencies for the trend and seasonal components?<a class="anchor-link"></a></h2>
<ul>
<li>On one hand, it might be better not to miss anything. On the other, I doubt there will be a prominent trend for a given weekday every 28 weeks.</li>
<li>For the trend component, it would make sense to use the lowest frequencies with the highest magnitudes.</li>
<li>For the seasonal component, there are "interesting" frequencies around .143, .285, and .428.</li>
</ul>
<p></p>
<p><a href="http://storage.ning.com/topology/rest/1.0/file/get/2808327075?profile=original" target="_self"><img src="http://storage.ning.com/topology/rest/1.0/file/get/2808327075?profile=original" width="563" class="align-full"/></a></p>
<p></p>
<p><a href="http://storage.ning.com/topology/rest/1.0/file/get/2808327339?profile=original" target="_self"><img src="http://storage.ning.com/topology/rest/1.0/file/get/2808327339?profile=original" width="550" class="align-full"/></a></p>
<h2 id="Finding-seasonal-patterns:">Finding seasonal patterns in the target variable:</h2>
<h3 id="The-overall-trend-could-be-removed-by-creating-a-differenced-variable-for-Pageviews-The-differenced-variable-allows-for-seasonal-components-to-be-identified-more-clearly.">The overall trend can be removed by creating a differenced variable for Pageviews. The differenced variable allows the seasonal components to be identified more clearly.</h3>
<p></p>
<p><a href="http://storage.ning.com/topology/rest/1.0/file/get/2808327592?profile=original" target="_self"><img src="http://storage.ning.com/topology/rest/1.0/file/get/2808327592?profile=original" width="580"/></a></p>
<p><a href="http://storage.ning.com/topology/rest/1.0/file/get/2808327729?profile=original" target="_self"><img src="http://storage.ning.com/topology/rest/1.0/file/get/2808327729?profile=original" width="537"/></a></p>
<p></p>
<div class="cell border-box-sizing code_cell rendered"><div class="output_wrapper"><div class="output"><div class="output_area"><div class="output_subarea output_stream output_stdout output_text"><pre><strong>Frequency Component, Magnitude</strong>
[ 1.43333333e-01   5.00831933e+05]
[ 2.83333333e-01   2.65832489e+05]
[ 2.85000000e-01   7.24904464e+05]
[ 2.86666667e-01   6.13035227e+05]
[ 2.88333333e-01   1.92922452e+05]
[ 4.28333333e-01   4.04206565e+05]</pre>
</div>
</div>
</div>
</div>
</div>
<div class="cell border-box-sizing text_cell rendered"><div class="prompt input_prompt"></div>
<div class="inner_cell"><div class="text_cell_render border-box-sizing rendered_html"><p><span>The lower frequency components were removed and the other, distinct frequencies were amplified. This makes the frequencies easier to filter! It also makes them easier to compare to possible seasonal variables.</span></p>
<p><span> </span></p>
</div>
</div>
</div>
<h2 id="Finding-seasonal-patterns:">Finding the seasonal predictor variable:</h2>
<p><a href="http://storage.ning.com/topology/rest/1.0/file/get/2808329263?profile=original" target="_self"><img src="http://storage.ning.com/topology/rest/1.0/file/get/2808329263?profile=original" width="523" class="align-full"/></a></p>
<p></p>
<pre><strong>Frequency Component, Magnitude</strong>
[ 1.41666667e-01   2.42782136e+02]
[ 1.43333333e-01   6.00386477e+02]
[ 1.45000000e-01   1.31981640e+02]
[ 2.85000000e-01   2.78344410e+02]
[ 2.86666667e-01   2.07887576e+02]
[ 4.28333333e-01   2.97539156e+02]</pre>
<p></p>
<h2 id="Eureka!-Weekday-shares-the-same-frequency-components-as-Pageviews!"><strong>Eureka! Weekday shares the same frequency components as Pageviews!</strong><a class="anchor-link"></a></h2>
<p><span style="font-size: 1.17em;"><span>I found dominant frequencies at .143, .285, and .428. These correspond to periods of roughly T = 7, 3.5, and 2.33 samples, i.e., a weekly pattern and its harmonics. There were also some frequencies around the e-3 order of magnitude, at .00166, .00333, and .005, with periods upwards of 200.</span></span></p>
<p></p>
<p><strong><span style="font-size: 1.17em;">If you want to see how I included these frequency components in a regression model, please see my GitHub. The results are compared to straight-up dummy coding (the results are the same).</span></strong></p>
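<p>A minimal sketch of how frequency components like these can enter a linear regression: one sin/cos pair per frequency, which is equivalent to fitting an amplitude and a phase for each component (the function name is my own, not from the notebook):</p>

```python
import numpy as np

def sinusoidal_features(t, freqs):
    """Design matrix with an intercept and sin/cos pairs per frequency.

    freqs are in cycles per sample, e.g. 1/7 for a weekly pattern in
    daily data.  Fitting both phases linearly keeps the seasonal terms
    inside an ordinary least-squares model.
    """
    cols = [np.ones_like(t, dtype=float)]
    for f in freqs:
        cols.append(np.sin(2 * np.pi * f * t))
        cols.append(np.cos(2 * np.pi * f * t))
    return np.column_stack(cols)
```

<p>Any sinusoid of a listed frequency, whatever its phase, lies in the span of its sin/cos pair, which is why this parameterization can match dummy coding for a pure weekly cycle.</p>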
<p></p>
<h3><span style="font-size: 1.17em;" class="font-size-3"><a href="https://github.com/Freedomtowin/DSC-Complex-Time-Series-Challenge/blob/master/DSC_Challenge_Complex_Time_Series.ipynb">https://github.com/Freedomtowin/DSC-Complex-Time-Series-Challenge/blob/master/DSC_Challenge_Complex_Time_Series.ipynb</a></span></h3>
</div>
</div>
</div>