Yuanjen Chen's Blog (15)

Who are alike? Use BigObject feature vector to find similarities

Cluster analysis is a common technique for grouping a set of objects so that the objects in the same group share certain attributes. It’s commonly used in marketing and sales planning to define market segments.

Here at…

Added by Yuanjen Chen on October 2, 2015 at 1:21pm — No Comments

The Dark Side of Big Data

Ashley Madison, IRS, Target, Sony… What do they have in common? These are only a few of the most serious data breaches of recent years. Yes, it is happening, and it is happening everywhere. The cost of a data breach has reached a new high of $154 per record of stolen or leaked…

Added by Yuanjen Chen on September 1, 2015 at 9:00pm — 1 Comment

Big Data Analytics in a Snap

The idea of "big data" could conjure up Turing-like images of massive cloud storage centers humming away in a desert. It's true that up to this point, computing power has been a barrier to analytics, research, and the exploitation of big data. It takes a lot of computing power to process increasingly…

Added by Yuanjen Chen on July 17, 2015 at 11:30am — No Comments

How to Cross Link Data and Why You Should Do So

The 3Vs model (Volume, Velocity, and Variety) is the foundation of big data. It is used to express the key features of big data problems, but for me, this is about to change. Big data is not just about size, speed, or formats; contextual enrichment is the most critical factor in how we extract the most value from data. How well you bring seemingly unrelated data together and identify the valuable connections determines how much power you unleash from your…

Added by Yuanjen Chen on May 13, 2015 at 1:00pm — No Comments

Big Data, IoT, Wearables: A Connected World with Intelligence

At CES 2015, I was fascinated by all sorts of possible applications of IoT – socks with sensors, mattresses with sensors, smart watches, smart everything – it seems like a scene from a sci-fi movie has just come true. People are eager to learn more about what’s happening around them and now they…

Added by Yuanjen Chen on February 4, 2015 at 2:00pm — No Comments

Internet of Things? Maybe. Maybe Not.

Everything is connected; through the cloud, all machine-generated data are collected and widely shared over the Internet. That’s how we imagine IoT – the Internet of Things.

Correction: That’s how THEY imagine IoT. What WE envision here is not just about the…

Added by Yuanjen Chen on June 17, 2014 at 8:30pm — 17 Comments

Three Myths About Today’s In-Memory Databases

In-memory database technology has become fashionable in recent years as the price of RAM has dropped substantially and gigabyte chips have become affordable. By taking advantage of the cost-performance value of RAM, leading-edge database developers are boosting the performance of next-generation databases with in-memory technology. However, many developers who intend to adopt in-memory technology think of speed only in terms of RAM, and do not exploit the true power of in-memory technology.

The…

Added by Yuanjen Chen on March 9, 2014 at 10:00pm — 3 Comments

Bring Data Science to Everyday Life

Data science might be one of the hottest buzzwords in 2013. But is it only a marketing gimmick? I don’t think so. In my opinion, data science can be the best protocol that reveals what’s happening every day in the real world.

The data science…

Added by Yuanjen Chen on February 6, 2014 at 6:30pm — No Comments

Not just analyze the log, mine it.

Many more devices held, many more messages sent, much more data up in the air.

With the arrival of the IoT (Internet of Things) era, the number of connected devices is growing rapidly, and their signals pile up into mines of valuable, hidden insights.

We used to analyze log files for risk management, spotting the anomalies and exceptions based on outlined records. With ever-richer metadata and context, we can now weave more of a story from these strings. By union or…

Added by Yuanjen Chen on January 25, 2014 at 8:00am — No Comments

In-place Computing Model: for Big and Complex Data

As we've seen how in-place and in-memory work differently, today we are sharing more fundamentals of the in-place computing model. This model was designed to solve "Big and Complex Data" problems, which are not just about size but more about complexity. We see many analytic cases today incorporate…

Added by Yuanjen Chen on January 13, 2014 at 12:30am — 2 Comments

Introduction to the BigObject® and In-place Computing Model

The BigObject® - A Computing Engine Designed for Big Data

BigObject® presents an in-place* computing approach, designed to solve the complexity of big data and compute on a real-time basis. The mission of the BigObject® is to deliver affordable computing power, enabling enterprises of all scales to interpret big data. With the advances in what a commodity machine can perform, it…

Added by Yuanjen Chen on November 20, 2013 at 5:29pm — No Comments

The assumptions on which the RDBMS is based have changed: the ideal data structure

We have been using tables in the relational database, mostly for transactional purposes, and that has proved effective. Considering the data size and analytic purpose, however, the data structure might need to be redesigned for better efficiency.

To determine how to decompose the complexity of big data, we have observed the way organisms function. In the physical world, the universe is organized into a hierarchy of…

Added by Yuanjen Chen on November 3, 2013 at 10:29pm — No Comments

The assumptions on which the RDBMS is based have changed: data and code

In general, computer scientists treat code and data in two very different ways. Virtual memory was originally developed to run big programs (code) in small memory, while data are entities kept in external storage and must be retrieved into memory before computing. As a result, today’s application developers instinctively think in terms of a programming model based on storage and explicit data retrieval. This model, referred to as storage-based computing, plays an important role and has done a great job…

Added by Yuanjen Chen on October 31, 2013 at 7:24pm — No Comments

What is the difference between the in-memory and in-place computing approaches?

In short, in-memory computing takes advantage of physical memory, which is expected to process data much faster than disk. In-place computing, on the other hand, fully utilizes the address space of the 64-bit architecture. Both are gifts of modern computer science; both are essential to the BigObject.
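To make the address-space idea concrete, here is a minimal sketch using Python's standard `mmap` module. It is only an illustration of addressing data where it lives instead of copying it through explicit reads; the excerpt does not show how the BigObject itself is implemented, and the record layout and the function names (`write_records`, `sum_in_place`, `demo`) are assumptions made up for this example.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout for this sketch: one little-endian 64-bit integer.
RECORD = struct.Struct("<q")


def write_records(path, values):
    """Persist the values as fixed-width binary records."""
    with open(path, "wb") as f:
        for v in values:
            f.write(RECORD.pack(v))


def sum_in_place(path):
    """Map the file into the address space and read records by offset,
    without issuing explicit read() calls for each record."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            count = len(view) // RECORD.size
            return sum(
                RECORD.unpack_from(view, i * RECORD.size)[0]
                for i in range(count)
            )


def demo():
    """Write 1000 records to a temporary file and sum them through the mapping."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_records(path, range(1000))
        return sum_in_place(path)
    finally:
        os.remove(path)
```

Calling `demo()` sums the integers 0 through 999 through the mapped view. The operating system pages the data in on demand, so the program addresses records by offset rather than managing retrieval itself.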

In-place computing only became possible with the introduction of the 64-bit architecture, whose address space is big enough to hold the entire data set for most of the cases we are dealing with today.…

Added by Yuanjen Chen on October 29, 2013 at 1:00am — 1 Comment

The BigObject - an Agile Analytic Engine for Big Data

Hi all,

This is my first post here. I'm glad to introduce this newly launched big data analytic engine, the BigObject. For the past 2 years we have been working on an optimal approach to handling big data for analytic purposes and challenging the existing models, some assumptions of which are no longer valid. For example, as the data size grows so rapidly, is it still practical to stick to the relational models while neglecting the time spent on data retrieval? What impact did…

Added by Yuanjen Chen on October 23, 2013 at 11:30pm — 2 Comments

© 2019 Data Science Central ®
