Jonathan Symonds
  • Male
  • Menlo Park, CA
  • United States

Jonathan Symonds's Page

Latest Activity

Jonathan Symonds posted a blog post

Breaking the HDFS Performance Barrier; An Object Storage First

By Siddartha Mani. Few would argue with the statement that Hadoop HDFS is in decline. In fact, the HDFS part of the Hadoop ecosystem is in more than just decline - it is in freefall. At the time of its inception, it had a meaningful role to play as a high-throughput, fault-tolerant distributed file system. The secret sauce was data locality. By co-locating compute and data on the same nodes, HDFS overcame the limitations of slow network access to data. The implications, however, are well known at…
Aug 11
Jonathan Symonds's blog post was featured

Breaking the HDFS Performance Barrier; An Object Storage First

Aug 11
Jonathan Symonds posted a blog post

Running Peta-Scale Spark Jobs on Object Storage Using S3 Select

When you look at the roster of talks at most data science conferences, what you don’t see is much discussion of how to leverage object storage. On some level you would expect to: ultimately, if you want to run your Spark or Presto job on peta-scale data sets and have it be available to your applications in the public or private cloud, this would be the logical storage architecture. While logical, there has been a catch, at least historically, and that is object storage wasn’t…
Jun 25
Jonathan Symonds posted a blog post

Running Peta-Scale Spark Jobs on Object Storage Using S3 Select

Jun 23
Jonathan Symonds's blog post was featured

Running Peta-Scale Spark Jobs on Object Storage Using S3 Select

Jun 23

Profile Information

Short Bio
Corporate marketing at Ayasdi
My Web Site Or LinkedIn Profile
http://ayasdi.com
Professional Status
Manager
Interests:
Other

Jonathan Symonds's Blog

Breaking the HDFS Performance Barrier; An Object Storage First

Posted on August 6, 2019 at 1:00pm 0 Comments

By Siddartha Mani

Few would argue with the statement that Hadoop HDFS is in decline. In fact, the HDFS part of the Hadoop ecosystem is in more than just decline - it is in freefall. At the time of its inception, it had a meaningful role to play as a high-throughput, fault-tolerant distributed file system. The secret sauce was data locality. 

By co-locating compute and data on the same nodes, HDFS overcame the limitations of slow network access to data. The…


Running Peta-Scale Spark Jobs on Object Storage Using S3 Select

Posted on June 25, 2019 at 9:00am 0 Comments

When you look at the roster of talks at most data science conferences, what you don’t see is much discussion of how to leverage object storage. On some level you would expect to: ultimately, if you want to run your Spark or Presto job on peta-scale data sets and have it be available to your applications in the public or private cloud, this would be the logical storage architecture.

While logical, there has been a catch, at least historically, and that is object storage…


Relationships, Geometry, and Artificial Intelligence

Posted on December 4, 2018 at 3:00pm 0 Comments

By Gunnar Carlsson

December 3, 2018

In their very provocative paper, Peter Battaglia and his colleagues posit that in order for artificial intelligence (AI) to achieve the capabilities of human intelligence, it must be…


Using unsupervised learning to improve prediction performance

Posted on November 20, 2018 at 1:00pm 0 Comments

By Gunnar Carlsson

The appeal of forecasting the future is very easy to understand, even though it is not realizable. That has not stopped an entire generation of analytics companies from selling such a promise. It also explains the myriad methods that attempt to give partial, inexact, and probabilistic information about the future.

Even if they could deliver on a…



© 2019 Data Science Central ®