Subscribe to DSC Newsletter

BigQuery Big Data Visualization With D3.js

How to handle large dataset with D3.js?

It’s a frequently asked question. You can read several discussions on the topic here,here, and here. So far, the best solution is to process data to a smaller dataset. Then use D3.js to visualize.

With carefully crafted data processing, we can get decent story from data. But this solution doesn’t provide a lot of flexibility to experiment with data on the fly. We need a more streamlined workflow. Less friction can spark interesting data innovation.

Google BigQuery a great tool to handle big dataset. It’s definitely going to help us handle big dataset for D3.js.

I will use New York Taxi dataset hosted on Google BigQuery. It is 4+ GB and has more than 350 million rows in 2 tables. In this article, I want to show you how to query it on the fly. Then use D3.js to create a line chart of total trip amount over time. You can explore the dataset here:

(You’ll need to setup BigQuery account with one project to see public table)

BigQuery has full SQL support. So we can run aggregate query directly on dataset. We’ll group by month/year and sum total_amount column. It takes less than 5 seconds.

SELECT CONCAT(CONCAT(STRING(MONTH(TIMESTAMP(pickup_datetime))), "/"), STRING(YEAR(TIMESTAMP(pickup_datetime)))) AS time, SUM(INTEGER(total_amount)) AS total_amount FROM [833682135931:nyctaxi.trip_fare] GROUP BY time;

Then you can use Javascript client library to query for your dataset on the fly. See full article and code here:

Finally, you'll have a visualization that gets data directly from Google BigQuery.

Phuoc Do

Views: 4744

Tags: analytics, big, bigquery, data, javascript, visualization


You need to be a member of Data Science Central to add comments!

Join Data Science Central

© 2021   TechTarget, Inc.   Powered by

Badges  |  Report an Issue  |  Privacy Policy  |  Terms of Service