PostgreSQL is a widely used relational database system for storing and managing large amounts of data effectively. Here, you will see how to: 1) create a PostgreSQL database us...
Let’s face it – cleaning data is a waste of time. If only the data had been collected and entered carefully in the first place, you wouldn’t be faced with days of d...
The digital universe is expanding. Not just the data collected, but also the devices that generate that data. It is estimated there will be over 20x connected devices per...
A few months ago, I wrote an article about the influencers in big data. The article resonated with many and almost all appreciated it. But that’s not the point. Soon afte...
Businesses across the globe are bearing the brunt of a huge data influx, increasing data complexity and, of course, market volatility. To address these c...
One of the best ways to learn about any topic is to start with very fundamental questions like What, Why, etc. The good old Socratic method. In this series of articles on data mi...
Overview: Datameer, an end-to-end big data analytics platform, is built on Apache Hadoop to perform integration, analysis, and visualization of massive volumes of both str...
Graphs belong to graph theory, a field of mathematics. For data analysis that requires searches for particular patterns, graph-based data mining becomes an important tec...
My earlier article on ‘25 Big Data terms you must know to impress your date’ had a pretty decent response (at least by my standards) and there were requests to add m...
Overview: Dataiku Data Science Studio (DSS), a complete data science software platform, is used to explore, prototype, build, and deliver data products. It significantly r...