Summary: This blog details R data.table programming to handle multi-gigabyte data. It shows how the data can be efficiently loaded, “normalized”, and counted....
One of the main challenges in data science projects is managing stakeholder expectations. Often those in the business will have little idea of the complexity and timescal...
As you might realize by now, writing SQL queries is one of the essential skills any aspiring data analyst needs to master. After all, larger datasets are typically store...
The history of database management systems could be interpreted as a Darwinian evolutionary process. The dominance of relational databases gives way to the data warehouses o...
In a recent blog I stated that “Crossing the AI Chasm” is primarily an organizational and cultural challenge, not a technology challenge. That “Crossing the AI ...
This article was written by Bob Hayes. Data science requires the effective application of skills in a variety of machine learning areas and techniques. A recent survey by...
Ah, it’s that time of year when everyone is making predictions about next year and extrapolating from the previous year’s trends to create logical, pragmatic predicti...
AI Robotization with IRIS Data Platform. Author: Sergey Lukyanchikov. Fixing the terminology: A robot is not expected to be either huge or humanoid, or even material (in dis...
Both R and Python-Pandas are array-oriented platforms that support fast filtering through vectors of record IDs. In Python-Pandas, such vectors are implemented via...
This article is by Jorge Castañón, Ph.D., Senior Data Scientist at the IBM Machine Learning Hub. Data visualization plays two key roles: 1. Communicating results clea...