Cluster analysis is a common technique for grouping a set of objects so that objects in the same group share certain attributes. It is widely used in marketing and sales planning to define market segments.
Added by Yuanjen Chen on October 2, 2015 at 1:21pm — No Comments
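To make the grouping idea concrete, here is a minimal sketch of one common clustering method, k-means, in plain Python. The post does not say which algorithm it has in mind, so k-means is an assumption, and the customer data (visits per month, monthly spend) and the function name are made up for illustration.

```python
import random
from math import dist


def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters by repeatedly moving centroids to cluster means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    labels = [min(range(k), key=lambda i: dist(p, centroids[i])) for p in points]
    return centroids, labels


# Hypothetical customers: (visits per month, monthly spend)
customers = [
    (1.0, 20.0), (2.0, 25.0), (1.5, 22.0),
    (10.0, 200.0), (12.0, 220.0), (11.0, 210.0),
]
centroids, labels = kmeans(customers, k=2)
print(labels)
```

The two resulting groups could be read as an occasional-buyer segment and a frequent-buyer segment. In practice the attributes would usually be scaled first so that no single one dominates the distance.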
Ashley Madison, IRS, Target, Sony… What do they have in common? These are only a few of the most serious data breaches of recent years. Yes, it is happening, and it is happening everywhere. The cost of a data breach has reached a new high of $154 per record of stolen or leaked…Continue
The idea of "big data" could conjure up Turing-like images of massive cloud storage centers humming away in a desert. It's true that up to this point, computing power has been a barrier to analytics, research, and the exploitation of big data. It takes a lot of computing power to process increasingly…Continue
Added by Yuanjen Chen on July 17, 2015 at 11:30am — No Comments
The 3Vs model (Volume, Velocity, and Variety) is the foundation of big data. It is used to express the key features of big data problems, but in my view this is about to change. Big data is not just about size, speed, or formats; contextual enrichment is the most critical factor in how we extract the most value from data. How well you bring seemingly unrelated data together and identify the valuable connections determines how much power you unleash from your…Continue
Added by Yuanjen Chen on May 13, 2015 at 1:00pm — No Comments
At CES 2015, I was fascinated by all sorts of possible applications of IoT – socks with sensors, mattresses with sensors, smart watches, smart everything – it seems like a scene from a sci-fi movie has just come true. People are eager to learn more about what’s happening around them and now they…Continue
Added by Yuanjen Chen on February 4, 2015 at 2:00pm — No Comments
Everything is connected; through the cloud, all machine-generated data is collected and widely shared over the Internet. That’s how we imagine IoT – the Internet of Things.
Correction: That’s how THEY imagine IoT. What WE envision here is not just about the…Continue
In-memory database technology has become fashionable in recent years as the price of RAM has dropped substantially and gigabyte chips have become affordable. By taking advantage of the cost-performance value of RAM, leading-edge database developers are boosting the performance of next-generation databases with in-memory technology. However, many developers who intend to adopt in-memory technology think of speed only in terms of RAM, and do not exploit the true power of in-memory technology.
Data science might be one of the hottest buzzwords in 2013. But is it only a marketing gimmick? I don’t think so. In my opinion, data science can be the best protocol that reveals what’s happening every day in the real world.
The data science…Continue
Added by Yuanjen Chen on February 6, 2014 at 6:30pm — No Comments
Many more devices held, many more messages sent, much more data up in the air.
With the arrival of the IoT (Internet of Things) era, the number of connected devices is growing rapidly, and their signals pile up into mines of valuable, hidden insights.
We used to analyze log files for risk management, spotting anomalies and exceptions based on outlined records. With ever-richer metadata and context, we can now weave more of a story from those strings. By union or…Continue
Added by Yuanjen Chen on January 25, 2014 at 8:00am — No Comments
Having seen how in-place and in-memory computing work differently, today we are sharing more fundamentals of the in-place computing model. This model was designed to handle "Big and Complex Data," which is not just about size but more about complexity. We see many analytic cases today incorporate…Continue
The BigObject® - A Computing Engine Designed for Big Data
BigObject® presents an in-place* computing approach, designed to solve the complexity of big data and compute on a real-time basis. The mission of the BigObject® is to deliver affordable computing power, enabling enterprises of all scales to interpret big data. With the advances in what a commodity machine can perform, it…Continue
Added by Yuanjen Chen on November 20, 2013 at 5:29pm — No Comments
We have long used tables in relational databases, mostly for transactional purposes, and that has proven effective. Given the data size and the analytic purpose, however, the data structure might need to be redesigned for better efficiency.
To determine how to decompose the complexity of big data, we have observed the way organisms function. In the physical world, the universe is organized into a hierarchy of…Continue
Added by Yuanjen Chen on November 3, 2013 at 10:29pm — No Comments
In general, computer scientists treat code and data in two very different ways. Virtual memory was originally developed to run big programs (code) in small memory, while data are entities kept in external storage that must be retrieved into memory before computing. As a result, today’s application developers instinctively think in terms of a programming model based on storage and explicit data retrieval. This model, referred to as storage-based computing, plays an important role and has done a great job…Continue
Added by Yuanjen Chen on October 31, 2013 at 7:24pm — No Comments
In short, in-memory computing takes advantage of physical memory, which is expected to process data much faster than disk. In-place computing, on the other hand, fully utilizes the address space of the 64-bit architecture. Both are gifts of modern computer science, and both are essential to the BigObject.
In-place computing only became possible with the introduction of 64-bit architecture, whose address space is big enough to hold the entire data set in most of the cases we deal with today.…Continue
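As a rough illustration of working on data through the address space rather than through explicit reads, here is a small Python sketch using the standard mmap module. This is only an analogy under my own assumptions, not how the BigObject is implemented; the file name and the helper functions write_values and sum_in_place are hypothetical.

```python
import mmap
import os
import struct
import tempfile


def write_values(path, values):
    """Store integers as native 64-bit values in a binary file."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"{len(values)}q", *values))


def sum_in_place(path):
    """Sum the stored integers by addressing the mapped file directly.

    The file is mapped into the process address space and the operating
    system pages data in on demand; there are no explicit read() calls.
    Note: mapping an empty file raises ValueError.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # View the mapped bytes as 64-bit integers without copying them.
            with memoryview(mapped) as raw, raw.cast("q") as view:
                return sum(view)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "values.bin")  # hypothetical data file
    write_values(path, list(range(1, 1001)))
    total = sum_in_place(path)
    print(total)
```

On a 64-bit system the mapping can be far larger than physical memory, which is what makes addressing an entire data set this way plausible.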
This is my first post here. I'm glad to introduce this newly launched big data analytic engine, the BigObject. Over the past two years we have been working on an optimal approach to handling big data for analytic purposes and challenging the existing models, some assumptions of which are no longer valid. For example, as data size grows so rapidly, is it still practical to stick to relational models while neglecting the time spent on data retrieval? What impact did…Continue