EMC Greenplum Modular Data Computing Appliance puts SQL and Hadoop in the same box, but is it a truly cohesive platform?
On the one hand, there's structured data that fits neatly into the columns and rows of relational databases. Relational databases have mastered that data, and even when it gets big (meaning north of about 10 terabytes), there are options such as massively parallel processing, as supported by products like EMC's Greenplum database.
On the other hand, there's the array of semi-structured, unstructured, and inconsistent data types such as server log files, sensor data, social-network comments, and other forms of text-centric information. For that world, the open-source Hadoop project has emerged as the leading platform for making such information computable. (Hadoop also handles highly structured data, but mostly as a high-capacity, low-cost data store.)