By JIANG Buxing, Data Scientist
Multidimensional analysis, commonly called OLAP, is an interactive data analysis process that performs operations such as rotation, slice and dice, and drill-down on a data cube. The structure of its back-end computation is simple, as shown by the following SQL:
SELECT D,..., SUM(M), ... FROM C WHERE D'=d' AND ... GROUP BY D,...
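As a rough illustration of that query shape, here is a minimal sketch using Python's built-in sqlite3 module. The table C, its columns (region, product, year, amount) and the sample rows are all made up for the example; only the SELECT ... SUM ... WHERE ... GROUP BY pattern comes from the SQL above.

```python
import sqlite3

# Hypothetical cube table C: three dimensions (region, product, year), one measure (amount).
ROWS = [
    ("East", "Tea", 2016, 10.0),
    ("East", "Coffee", 2016, 20.0),
    ("West", "Tea", 2016, 5.0),
    ("East", "Tea", 2017, 7.0),
    ("West", "Coffee", 2017, 8.0),
]
COLUMNS = {"region", "product", "year", "amount"}


def build_cube():
    """Create an in-memory table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE C (region TEXT, product TEXT, year INTEGER, amount REAL)")
    conn.executemany("INSERT INTO C VALUES (?, ?, ?, ?)", ROWS)
    return conn


def aggregate(conn, group_by, measure, filters):
    """Run SELECT D,..., SUM(M) FROM C WHERE D'=d' AND ... GROUP BY D,..."""
    # Column names cannot be bound as parameters, so check them against a whitelist.
    for name in list(group_by) + [measure] + list(filters):
        if name not in COLUMNS:
            raise ValueError(f"unknown column: {name}")
    dims = ", ".join(group_by)
    sql = f"SELECT {dims}, SUM({measure}) FROM C"
    if filters:
        sql += " WHERE " + " AND ".join(f"{d} = ?" for d in filters)
    sql += f" GROUP BY {dims} ORDER BY {dims}"
    return conn.execute(sql, list(filters.values())).fetchall()


conn = build_cube()
# Slice on year = 2016, grouped by region.
by_region = aggregate(conn, ["region"], "amount", {"year": 2016})
# Drill down: same slice, grouped by region and product.
by_region_product = aggregate(conn, ["region", "product"], "amount", {"year": 2016})
```

In this sketch, slicing only changes the WHERE clause and drilling down only changes the GROUP BY list, which is why the back-end computation stays structurally simple.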
Added by JIANG Buxing on July 20, 2017 at 5:30pm — No Comments
Many folks believe that Hadoop is the…Continue
When designing a model for a data warehouse we should follow a standard pattern, such as gathering requirements, building credentials and collecting a considerable quantity of information about the data or metadata. This helps to figure out the formation and scope of the data warehouse. This model of the data warehouse is known as the conceptual model. The general elements of the model are fact and dimension tables. These tables are related to each other, which will help to identify relationships…Continue
According to Weisensee et al., data warehouse architecture follows these principles:
The ETL process is the foundation of BI. The success or failure of BI projects depends on the ETL process. It plays a vital role in integrating data and enhancing its worth. After the extraction, cleansing and arrangement…Continue
Added by Avesh Dhakal on May 20, 2014 at 12:30am — No Comments
The BigObject® - A Computing Engine Designed for Big Data
BigObject® presents an in-place* computing approach, designed to solve the complexity of big data and compute on a real-time basis. The mission of the BigObject® is to deliver affordable computing power, enabling enterprises of all scales to interpret big data. With the advances in what a commodity machine can perform, it…Continue
Added by Yuanjen Chen on November 20, 2013 at 5:29pm — No Comments
We have been using tables in relational databases, mostly for transactional purposes, and that has proved effective. Considering the data size and analytic purpose, however, the data structure might need to be redesigned for better efficiency.
To determine how to decompose the complexity of big data, we have observed the way organisms function. In the physical world, the universe is organized into a hierarchy of…Continue
Added by Yuanjen Chen on November 3, 2013 at 10:29pm — No Comments
In general, computer scientists treat code and data in two very different ways. Virtual memory was originally developed to run big programs (code) in small memory, while data are entities kept in external storage that must be retrieved into memory before computing. As a result, today’s application developers instinctively think in terms of a programming model based on storage and explicit data retrieval. This model, referred to as storage-based computing, plays an important role and has done a great job…Continue
Added by Yuanjen Chen on October 31, 2013 at 7:24pm — No Comments
In short, in-memory computing takes advantage of physical memory, which is expected to process data much faster than disk. In-place computing, on the other hand, fully utilizes the address space of the 64-bit architecture. Both are gifts of modern computer science; both are essential to the BigObject.
In-place computing only became possible with the introduction of the 64-bit architecture, whose address space is big enough to hold the entire data set for most of the cases we are dealing with today.…Continue
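To make the address-space idea a little more concrete, here is a small sketch that maps a file into the process address space with Python's mmap module and computes over it without an explicit read into a separate buffer. This is only an analogy: the post does not say how the BigObject implements in-place computing, and the file layout and the helper names write_values and sum_in_place are assumptions made up for the example.

```python
import mmap
import os
import struct
import tempfile


def write_values(path, values):
    """Store the values as native 64-bit signed integers (hypothetical file layout)."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"{len(values)}q", *values))


def sum_in_place(path):
    """Map the file into the address space and sum its integers where they sit."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # View the mapped bytes as 64-bit integers without copying them.
            # Both views are released before the mapping is closed.
            with memoryview(mapped) as raw, raw.cast("q") as view:
                return sum(view)


fd, path = tempfile.mkstemp()
os.close(fd)
try:
    write_values(path, list(range(1, 101)))
    total = sum_in_place(path)
finally:
    os.remove(path)
```

The operating system pages the data in on demand, so the program addresses the data directly instead of issuing explicit retrieval calls. On a 64-bit system the same pattern can cover files far larger than physical memory.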
Added by Yuanjen Chen on October 29, 2013 at 1:00am — No Comments
This is my first post here. I'm glad to introduce this newly launched big data analytic engine, the BigObject. Over the past two years we have been working on an optimal approach to handling big data for analytic purposes and challenging the existing models, some assumptions of which are no longer valid. For example, as data size grows so rapidly, is it still practical to stick to relational models while neglecting the time spent on data retrieval? What impact did…Continue