Antonio Cachuan's Blog (3)

Starting to develop in PySpark with Jupyter installed in a Big Data Cluster

It's not a secret that data science tools like Jupyter, Apache Zeppelin or the more recently launched …

Added by Antonio Cachuan on November 20, 2018 at 8:01pm — No Comments

Big Data as a Service, get easily running a Cloudera Quickstart Image with Dockers in GCP

It’s not a secret that container technology (popularly known as Docker) is becoming one of the top choices in software projects [1], but what about data projects and clusters? Many companies and projects intend to take advantage of it. Some examples are Cloudera [2] and the apache-spark-on-k8s project [3]. Personally, if you want more information about what exactly is called “Big Data as a Service”, I suggest checking the last Strata Data…

Added by Antonio Cachuan on October 28, 2018 at 4:59pm — No Comments

Setting up your first Kafka development environment in Google Cloud in 15 minutes

Here I am, writing my first post after postponing it for a long time… In this article I would like to share my experience installing and testing basic Apache Kafka features. If you are new to the Big Data ecosystem, let me give you some short concepts.

Kafka is a distributed streaming platform, which means it is intended for publishing and subscribing to streams of records, similar to a message queue or…

Added by Antonio Cachuan on September 28, 2018 at 8:15pm — No Comments
