All Blog Posts Tagged 'ETL' (17)

Streamlining Predictive Modeling Workflow with Sagemaker and Essentia

Ecommerce sites generate tons of web server log data, which can provide valuable insights through analysis. For example, if we know which users are more likely to buy a product, we can perform targeted marketing, improve relevant product placement on our site and lift conversion rates. However, raw web logs are often enormous and messy, so preparing the data to train a predictive model is time-consuming for data scientists.…

Added by Ayumi Owada on July 18, 2019 at 2:00pm — No Comments

Big Data Architecture in Data Processing and Data Access

I started my career as an Oracle database developer and administrator back in 1998. Over the past 20+ years, it has been amazing to see how IT has evolved to handle the ever-growing amount of data, via technologies including relational OLTP (Online Transaction Processing) databases, data warehouses, ETL (Extraction, Transformation and Loading) and OLAP (Online Analytical Processing) reporting, big data and now AI, Cloud and IoT. All these technologies were enabled by the rapid…

Added by Stephanie Shen on June 23, 2019 at 7:30am — No Comments

Maximizing Sales with Market Basket Analysis

Sales data analyses can provide a wealth of insights for any business but rarely is it made available to the public. In 2018, however, a retail chain provided Black Friday sales data on Kaggle as part of a Kaggle competition. Although the store and product lines are…

Added by Ayumi Owada on April 17, 2019 at 6:30am — No Comments

Analyzing CDC Health Survey

Added by Benjamin Waxer on March 4, 2019 at 12:00am — No Comments

Open Source ETL: Apache NiFi vs Streamsets

After reviewing 8 great ETL tools for fast-growing startups, we got a request to tell you more about open source solutions. There are many open source ETL tools and frameworks, but most of them require writing code.…

Added by Luba Belokon on April 26, 2018 at 2:30am — No Comments

5 Keys to Real-Time Analytics

The value analytics brings to a business is inversely related to the time it takes to create said analysis. In a traditional world of quarterly lookbacks, an analyst’s output may be interesting, but its ability to drive real relevant change is hindered by time and effort. The fundamentals that were once present may have all changed.

This is why real-time analytics are a breakthrough for a business. If you can take…

Added by Taylor Barstow on April 23, 2018 at 3:00am — 1 Comment

Closed Computational System Leads to Bloated Databases

Quite a few big organizations find their databases (or data warehouses) crammed with a huge number of old data tables, sometimes tens of thousands of them, after many years of operation. Nobody remembers why these tables were created, and they have long since become useless. Yet all of them are kept for fear of deleting something by mistake, which creates a heavy operation and maintenance workload. Moreover, a large number of stored procedures continuously feed data to these tables, seriously consuming the…

Added by JIANG Buxing on November 15, 2017 at 1:00am — No Comments

If Data is as Valuable as Gold, It’s Time to Polish Your Data Architecture

It speaks volumes about the world we live in today when headlines such as “The world’s most valuable resource is no longer oil, but data” and “Why Data May Be More Valuable Than Dollars” are commonplace. With the explosion of IoT, and with it 2.5 quintillion bytes of data being created per day, the underlying power of this data comes as no surprise.

Unlike gold, however, data is ubiquitous and is being created at an exponential rate. So where’s the value in something that is everywhere?…

Added by Amy Flippant on June 5, 2017 at 12:30am — No Comments

Data Virtualization: A Supermarket for Data

What is data virtualization? Here’s an analogy using a concept that we can all relate to: a supermarket.

Picture the scene: Shopping list in one hand, shopping basket in the other, you’re ready to tackle your weekly shopping in your local supermarket. Your items range from fruit and vegetables to washing detergent, perhaps with some free-range eggs thrown in for good measure. Quite the eclectic mix, but you know that you’ll be able to find all you need under one…

Added by Amy Flippant on March 9, 2017 at 12:30am — 1 Comment

Data Integration Tools – Market Study

This post is a brief review of the leading data integration tools on the market. It draws heavily on the 2016 Gartner report and on peer reviews from my circle.

The Market

The data integration tool market was worth approximately $2.8 billion at the end of 2015, an increase of 10.5% from the end of 2014 [2016 Gartner…

Added by Kashif Saiyed on October 21, 2016 at 7:30pm — No Comments

Apache Beam - Create Data Processing Pipelines

At the Data Science Association our members often complain about the major data engineering problem of finding the right tools and programming models to build both robust data processing pipelines and efficient ETL processes for data transformation and integration.…

Added by Michael Walker on May 19, 2016 at 10:00pm — No Comments

ETL as A New Data Fusion Paradigm

Finding insight within one data stream is a challenge. Finding insight from multiple streams can be significantly more so.  The simple example? Two different databases created independently of each other that claim to capture the same kind of data.  The larger the dataset, the more challenges we face aligning columns, de-duping content, making sure we don’t overwrite newer data with old data, and otherwise cleaning and preparing data for analysis. Ask anyone who has worked trying to align…

Added by Anne Russell on March 30, 2015 at 4:00pm — No Comments
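
The alignment problems named in the teaser above (mismatched columns, duplicate records, stale data overwriting fresh data) can be illustrated with a small Python sketch. This is not the method from the post itself. The source names, column aliases and sample records are all hypothetical, and the rule of keeping whichever row has the later date is an assumption made for illustration.

```python
from datetime import date

# Hypothetical column aliases: each source names the same fields differently.
COLUMN_ALIASES = {
    "cust_id": "customer_id",
    "id": "customer_id",
    "mail": "email",
    "last_modified": "updated",
}


def align_columns(record):
    """Rename source-specific column names to one shared schema."""
    return {COLUMN_ALIASES.get(key, key): value for key, value in record.items()}


def fuse(*sources):
    """Merge records from several sources, keeping the newest row per customer."""
    merged = {}
    for source in sources:
        for raw in source:
            record = align_columns(raw)
            key = record["customer_id"]
            current = merged.get(key)
            # Replace an existing row only when the incoming one is strictly newer,
            # so older data never overwrites newer data.
            if current is None or record["updated"] > current["updated"]:
                merged[key] = record
    return sorted(merged.values(), key=lambda r: r["customer_id"])


# Made-up sample data from two independently built databases.
crm = [
    {"cust_id": 1, "email": "a@example.com", "updated": date(2015, 3, 1)},
    {"cust_id": 2, "email": "b@example.com", "updated": date(2015, 1, 15)},
]
billing = [
    {"id": 1, "mail": "a.old@example.com", "last_modified": date(2014, 11, 2)},
    {"id": 2, "mail": "b.new@example.com", "last_modified": date(2015, 2, 20)},
    {"id": 3, "mail": "c@example.com", "last_modified": date(2015, 2, 1)},
]

fused = fuse(crm, billing)
for row in fused:
    print(row["customer_id"], row["email"], row["updated"])
```

Keying the merge on a single shared identifier keeps de-duplication and the newest-wins rule in one place. Real datasets often lack such a clean key, which is where most of the effort goes.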

Data Warehouse Architecture

According to Weisensee et al., data warehouse architecture is built on the following principles:

  • Data Sources
  • Data Warehouses
  • Data Marts
  • Publication Services

Extraction, Transformation and Loading (ETL):

The ETL process is the foundation of BI. The success or failure of BI projects depends on it. It plays a vital role in integrating data and enhancing its worth. After the extraction, cleansing and arrangement…

Added by Avesh Dhakal on May 20, 2014 at 12:30am — No Comments

Big Data and 2014

We are witnessing a paradigm shift in the data environment. In recent years, Big Data has risen on the technology horizon, with a focus on the efficient and cost-effective management and analysis of vast amounts of data for both public and private organizations. Several organizations are trying to harness this continuing data stream, and in 2014 several of them will go about making this data available in real time.

Any organization that wants to…

Added by Atif Farid Mohammad on December 8, 2013 at 10:05am — No Comments

Big Data: An Understanding

We understand things in terms of data, or, better said, in terms of Big Data: the things, matters, issues, inventions, surroundings, maps and much more that we use throughout our everyday life cycle all have a certain data type that takes input, processes it and produces output for us. Sometimes, as humans, we understand these in almost no time: where the data originates, what we are targeting and more. At other times, something might take longer…

Added by Atif Farid Mohammad on November 29, 2013 at 12:50am — No Comments

Tableau and Data wrangler have a child

Hi - we'd love to get your feedback on a new product, oinoi, that we're building.

We do analytics work for mobile carriers in Africa. Our work consists of building advanced dashboards. We do it with Tableau and we love the tool. However, building nice visualizations requires the long and tedious work of getting the data into shape (merge data sources, clean, aggregate, clean, format, etc.). We haven't yet found a tool to make…

Added by Antoine Bruyns on September 5, 2013 at 9:34am — 3 Comments

© 2019 Data Science Central ®