Ecommerce sites generate large volumes of web server log data, and analyzing it can yield valuable insights. For example, if we know which users are more likely to buy a product, we can target our marketing, improve product placement on our site, and lift conversion rates. However, raw web logs are often enormous and messy, so preparing the data to train a predictive model is time-consuming for data scientists.…
Added by Ayumi Owada on July 18, 2019 at 2:00pm — No Comments
I built a scenario for a hybrid machine learning infrastructure that uses Apache Kafka as a scalable central nervous system. The public cloud is used for training analytic models at extreme scale (e.g. using TensorFlow and TPUs on Google Cloud Platform (GCP) via Google ML Engine). The predictions (i.e.…
I presented a new talk at "Codemotion Amsterdam 2018" this week. I discussed the relationship between Apache Kafka and Machine Learning, and how to build a Machine Learning infrastructure for extreme scale.
Long version of the title:
"Deep Learning at Extreme Scale (in the Cloud) with the Apache Kafka Open Source Ecosystem - How to Build a Machine Learning Infrastructure with Kafka, Connect, Streams, KSQL, etc."
As always, I want to share the slide deck. The talk was…
Added by Kai Waehner on May 8, 2018 at 9:30pm — No Comments
Previously, we discussed the role of Amazon Redshift's sort keys and compared how compound and interleaved keys work in theory. Throughout that post we used some dummy data and a set of Postgres queries to explore the…
Added by sasha blumenfeld on August 28, 2015 at 7:20am — No Comments
Added by Phil Simon on February 21, 2013 at 2:30pm — No Comments
My publisher (John Wiley & Sons) is allowing me to make the 28-page Introduction of Too Big to Ignore available for free download. The Introduction provides the structure for the rest of the book and details a few interesting ways in which organizations are utilizing Big Data.…
Added by Phil Simon on January 2, 2013 at 2:30pm — No Comments