Data has become a cargo cult. Collect more data, calculate more metrics, hire more analysts, let them figure out what this is all for – and you’re considered data-driven. I've had it up to here while consulting startups over the past three years and helping them to define business…
Added by Luba Belokon on June 8, 2018 at 1:00am —
A single query optimization tip can boost your database performance by 100x. At one point, we advised one of our customers that had a 10TB database to use a date-based multi-column index. As a result, their date range query sped up…
Added by Luba Belokon on May 17, 2018 at 2:30am —
After reviewing 8 great ETL tools for fast-growing startups, we got a request to tell you more about open source solutions. There are many open source ETL…
Added by Luba Belokon on April 26, 2018 at 2:30am —
When our customers ask us what the best data warehouse is for their growing company, we consider the answer based on their specific needs. Usually, they need nearly real-time data for a low price without the need to maintain data warehouse…
Added by Luba Belokon on April 19, 2018 at 5:30am —
ETL stands for Extract, Transform, Load. It has been the traditional way to manage analytics pipelines for decades. With the advent of modern cloud-based data warehouses, such as BigQuery or Redshift, the traditional concept of ETL is shifting toward ELT, where transformations run right in the data warehouse. Let’s see why it’s happening, what ETL vs ELT means, and what we can expect in the future.
ETL is hard and outdated
ETL arose to solve a problem of…
Added by Luba Belokon on April 6, 2018 at 7:30am —
One of the first things we do after launching a website nowadays is connect to Google Analytics. A little further down the road, we connect more “out-of-the-box” analytics tools to calculate funnels, retention, A/B tests, and more.
These tools are great and work fine until a company gets bigger and its analytics requirements get more sophisticated. Then it’s time to set up a data infrastructure, which means selecting a data collection tool, ETL tool, data warehouse, and BI tool on top of…
Added by Luba Belokon on March 30, 2018 at 3:30am —
Many “out-of-the-box” analytics solutions come with automatically defined user sessions. That’s a good starting point, but as your company grows, you’ll want to have your own session definitions based on your event data. Analyzing user sessions with SQL gives you flexibility and full control over how metrics are defined for your unique business.
What is a session and why should I care?
The session is usually defined as a group…
Added by Luba Belokon on March 27, 2018 at 1:00am —
Quite recently we built event analytics for our team, and we’d like to share the experience with you in this post.
Many “out-of-the-box” analytics solutions come with automatically defined user sessions. That’s a good starting point, but as your company grows, you’ll want to have your own session definitions based on your event data. Analyzing user…
Added by Luba Belokon on February 8, 2018 at 7:30am —
The Statsbot team estimated LTV 592 times for different clients and business models.
Customer lifetime value, or LTV, is the amount of money that a customer will spend with your business in their “lifetime,” or at least, in the portion of it that they spend in a relationship with you. It’s an important indicator of how much you can spend on acquiring new customers. For example, your customer acquisition cost (CAC) is $150, and LTV is…
Added by Luba Belokon on February 1, 2018 at 8:30am —
When I was starting out in data science, I often faced the problem of choosing the most appropriate algorithm for my specific problem. If you’re like me, when you open an article about machine learning algorithms, you see dozens of detailed descriptions. The paradox is that they don’t make the choice any easier.
In this article, I will try to explain basic concepts and give some intuition for using different…
Added by Luba Belokon on October 26, 2017 at 6:00am —
In recent years, the field of object detection has seen tremendous progress, aided by the advent of deep learning. Object detection is the task of identifying objects in an image and drawing bounding boxes around them, i.e. localizing them. It’s a very important problem in computer vision due to its numerous applications, from self-driving cars to security and tracking.
Prior approaches to object detection…
Added by Luba Belokon on October 19, 2017 at 8:30am —
Bayesian Nonparametrics is a class of models with a potentially infinite number of parameters. The high flexibility and expressive power of this approach enable better data modelling compared to parametric methods.
Bayesian Nonparametrics is used in problems where a dimension of interest grows with data, for example, in problems where the number of features is not fixed but allowed to vary as we observe more…
Added by Luba Belokon on October 12, 2017 at 3:00pm —
Machine learning is getting more and more popular in applications and software products, from accounting to hot dog recognition apps. When you add machine learning techniques to exciting projects, you need to be ready for a number of difficulties. The Statsbot team asked Boris Tvaroska to tell us how to prepare a DevOps pipeline for an ML…
Added by Luba Belokon on October 4, 2017 at 7:30am —
Generative adversarial networks (GANs) are a class of neural networks that are used in unsupervised machine learning. They help to solve such tasks as image generation from descriptions, getting high resolution images from low resolution ones, predicting which drug…
Added by Luba Belokon on August 17, 2017 at 6:30am —
Years ago, it was very time-consuming to translate text from an unknown language. Using simple vocabularies with word-for-word translation was hard for two reasons: 1) the reader had to know the grammar rules and 2) the reader needed to keep in mind all language versions while translating the whole sentence.
Added by Luba Belokon on August 1, 2017 at 5:00am —
Today, many companies use big data to make super relevant recommendations and grow revenue. Among a variety of recommendation algorithms, data scientists need to choose the best one according to a business’s limitations and requirements.
To simplify this task, my team has prepared an overview of the main existing recommendation system…
Added by Luba Belokon on July 28, 2017 at 4:00am —