11 Core Big Data Workload Design Patterns

As big data use cases proliferate in telecom, health care, government, Web 2.0, retail and other sectors, there is a need to create a library of big data workload patterns. We have created a set of big data workload design patterns to help map out common solution constructs. It showcases 11 distinct workloads whose patterns recur across many business use cases.

  1. Synchronous streaming real time event sense and respond workload
  2. Ingestion of high velocity events - insert only (no update) workload
  3. High node social graph traversing
  4. ‘Needle in a haystack’ workloads
  5. Multiple event stream mash up and cross referencing of events across both streams
  6. Text indexing workload on large volume semi-structured data
  7. Looking for absence of events in event streams in a moving time window
  8. High velocity, concurrent inserts and updates workload
  9. Semi-structured and unstructured data ingestion
  10. Sequence analysis workloads
  11. Chain of thought workloads for data forensic work

 

Data Workload-1: Synchronous streaming real time event sense and respond workload

This workload essentially consists of matching incoming event streams against predefined behavioural patterns and, as those signatures unfold in real time, responding to them instantly.

Let’s take an example. In a registered user digital analytics scenario, one examines the last 10 searches made by a registered digital consumer so as to serve a customized, highly personalized page consisting of the categories he/she has engaged with. Also, depending on whether the customer has done a price sensitive search or a value conscious search (which can be inferred by examining the search order parameter in the click stream), one can render budget items first or luxury items first.
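The personalization logic in this example can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the in-memory store, the function names and the sort order values `price_asc` and `price_desc` are all assumptions made for the sketch.

```python
from collections import Counter, defaultdict, deque

# Hypothetical in-memory store: the 10 most recent searches per registered user.
recent_searches = defaultdict(lambda: deque(maxlen=10))


def record_search(user_id, category, sort_order):
    """Store one click stream search event; the deque drops the oldest beyond 10."""
    recent_searches[user_id].append((category, sort_order))


def personalize(user_id):
    """Return (categories ranked by engagement, which items to render first)."""
    searches = recent_searches[user_id]
    category_counts = Counter(category for category, _ in searches)
    ranked = [category for category, _ in category_counts.most_common()]
    # Assumption: 'price_asc' marks a price sensitive search and 'price_desc'
    # a value conscious one. Both values are made up for this sketch.
    orders = Counter(sort_order for _, sort_order in searches)
    # Ties (including no sort information at all) default to budget items.
    first = "budget" if orders["price_asc"] >= orders["price_desc"] else "luxury"
    return ranked, first
```

A production system would typically keep this state in a low latency store rather than process memory, but the shape of the decision is the same: a bounded window of recent events per user, reduced to a rendering choice at request time.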

Similarly, let’s take another example: real time response to events in a health care situation. In hospitals, patients are tracked in real time across three event streams: respiration, heart rate and blood pressure. (An ECG is supposed to record about 1000 observations per second.) These event streams can be matched against patterns which indicate the beginnings of fatal infections, so that medical intervention can be put in place.
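As a rough sketch of how such a pattern match over three streams might look, the code below keeps a short moving window per stream and raises a match only when every reading in every window is out of range. The class name, the window size and the thresholds are hypothetical and purely illustrative; they are not clinical guidance and are not taken from the original article.

```python
from collections import deque

WINDOW = 5  # number of most recent readings considered per stream (illustrative)

# Illustrative out-of-range checks only; not clinical guidance.
LIMITS = {
    "respiration": lambda v: v > 22,     # breaths per minute
    "heart_rate": lambda v: v > 100,     # beats per minute
    "blood_pressure": lambda v: v < 90,  # systolic, mmHg
}


class VitalsMonitor:
    """Hypothetical matcher for one patient across three event streams."""

    def __init__(self):
        self.windows = {name: deque(maxlen=WINDOW) for name in LIMITS}

    def observe(self, stream, value):
        """Add one reading and return True if the alert pattern is now matched."""
        self.windows[stream].append(value)
        return self.pattern_matched()

    def pattern_matched(self):
        # Matched only when each stream has a full window and every reading
        # in that window is outside its limit.
        return all(
            len(window) == WINDOW and all(LIMITS[name](v) for v in window)
            for name, window in self.windows.items()
        )
```

Requiring all three streams to agree over a full window, rather than reacting to a single reading, is one simple way to express a "signature" instead of a point threshold. A real deployment would more likely express this as a rule in a complex event processing engine.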

 

The 10 additional patterns are showcased at

 http://blog.fluturasolutions.com/2012/08/11-core-big-data-workload-...

 

These big data design patterns are templates for identifying and solving commonly occurring big data workloads. The big data workloads stretching today’s storage and computing architecture could be human generated or machine generated. A given pattern may manifest itself in many domains, such as telecom and health care, and irrespective of the domain in which it appears the same solution construct can be used. Big data patterns also help prevent architectural drift. Once the set of big data workloads associated with a business use case is identified, it is easy to map the right architectural constructs required to service the workload: columnar, Hadoop, name value, graph databases, complex event processing (CEP) and machine learning processes.

 

It is our endeavour to make the list collectively exhaustive and mutually exclusive in subsequent iterations.

As Leonardo da Vinci said, “Simplicity is the ultimate sophistication”. Big data workload design patterns help simplify the decomposition of business use cases into workloads. The workloads can then be mapped methodically to the various building blocks of a big data solution architecture. Yes, there is a method to the madness :)


© 2019 Data Science Central ®
