Until the late 1990s, web search engines evaluated each web page as a standalone entity, ranking it on its content without regard to any other pages. Google changed that with PageRank, a graph-centered approach developed by co-founders Larry Page and Sergey Brin, which evaluates web pages in relationship to other pages. Users quickly recognized that ranking pages by their relationships to other pages produced much better results, and this may be the single factor that moved Google rapidly ahead of its competitors.
We are so used to thinking of databases as tables, or at least buckets of information, that it can be a little challenging to wrap your head around the concepts of graph databases. That said, Graph DBs can do things that the other types of NoSQL DBs and RDBMSs cannot do well, particularly queries that depend on how records relate to one another. Making the effort to understand and use this type can offer big returns.
Graph DBs are not organized around classical indexes. Rather, each stored object is represented with “nodes” and “edges”. A node is a single record that has at least one and potentially many named properties. Edges define the relationships among nodes, and edges, like nodes, can carry properties of their own. A node can have multiple edges, each defining a different kind of relationship with other nodes. Both nodes and relationships (edges) can be addressed by key values.
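To make the node and edge vocabulary concrete, here is a minimal in-memory sketch in Python. It is not how any particular Graph DB stores data; the class names (Node, Edge, PropertyGraph), the relationship labels, and the sample people and book are all hypothetical, chosen only to show keyed nodes, keyed edges, and properties on both.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    key: str                                        # key value used to address the node
    properties: dict = field(default_factory=dict)  # named properties


@dataclass
class Edge:
    key: str                                        # edges are addressable by key too
    source: str                                     # key of the node the edge starts at
    target: str                                     # key of the node the edge points to
    label: str                                      # kind of relationship
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical toy store: nodes and edges held in dictionaries."""

    def __init__(self):
        self.nodes = {}
        self.edges = {}
        self.outgoing = {}  # node key -> list of keys of edges leaving that node

    def add_node(self, key, **properties):
        self.nodes[key] = Node(key, properties)
        self.outgoing.setdefault(key, [])

    def add_edge(self, key, source, target, label, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges[key] = Edge(key, source, target, label, properties)
        self.outgoing[source].append(key)

    def neighbors(self, key, label):
        """Return keys of nodes reached from `key` over edges with this label."""
        return [
            self.edges[edge_key].target
            for edge_key in self.outgoing[key]
            if self.edges[edge_key].label == label
        ]


graph = PropertyGraph()
graph.add_node("alice", name="Alice")
graph.add_node("bob", name="Bob")
graph.add_node("dune", title="Dune")
graph.add_edge("e1", "alice", "bob", "FRIEND", since=2012)  # an edge with a property
graph.add_edge("e2", "bob", "dune", "READING")
graph.add_edge("e3", "alice", "dune", "HAS_READ")           # a second kind of edge from alice
```

Note that "alice" has two outgoing edges of different kinds, and that the "since" property lives on the relationship itself rather than on either person.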
Search or query with Graph DBs is called “traversal”. A traversal starts at a specific node and explores its connections to other nodes by following the kinds of relationships requested. A common example would be ‘what books are my friends reading that I haven’t yet read’. For this reason, Graph DBs are often associated with the ‘recommender’ engines widely used in social and ecommerce applications.
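The books example can be sketched as a two-step traversal in plain Python. This is an illustration under stated assumptions, not the query language of any real Graph DB: the relationship labels (FRIEND, READING, HAS_READ), the helper names, and the sample data are all made up for the example.

```python
from collections import defaultdict

# Hypothetical sample data: (source, relationship, target) triples.
EDGES = [
    ("me", "FRIEND", "ann"),
    ("me", "FRIEND", "raj"),
    ("ann", "READING", "Dune"),
    ("ann", "READING", "Emma"),
    ("raj", "READING", "Dune"),
    ("raj", "READING", "Ulysses"),
    ("me", "HAS_READ", "Emma"),
]


def build_adjacency(edges):
    """Group targets by source node and relationship label."""
    adjacency = defaultdict(lambda: defaultdict(set))
    for source, label, target in edges:
        adjacency[source][label].add(target)
    return adjacency


def follow(adjacency, node, label):
    """Return the nodes reached from `node` over edges with this label."""
    return adjacency.get(node, {}).get(label, set())


def books_to_recommend(adjacency, person):
    """Start at `person`, hop to friends, then hop to the books they are reading."""
    already_read = follow(adjacency, person, "HAS_READ")
    recommended = set()
    for friend in follow(adjacency, person, "FRIEND"):
        recommended |= follow(adjacency, friend, "READING")
    # Drop anything the starting person has already read.
    return sorted(recommended - already_read)


ADJACENCY = build_adjacency(EDGES)
print(books_to_recommend(ADJACENCY, "me"))  # prints ['Dune', 'Ulysses']
```

The query never scans every record. It begins at one node and touches only what is reachable through the requested relationships, which is the essential difference from a table scan or a chain of joins.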
As Graph DBs become more dense, a traversal may arrive at the same node several times by different paths, which can slow the search. As a result, Graph DBs learn and index these common relationships to speed up search.
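The sketch below illustrates both ideas in Python under heavy simplification. The visited set keeps a traversal from expanding the same node twice, and a small cache stands in for an index of commonly requested relationships. The names (FRIENDS, reachable_within, CachedTraversal) are hypothetical, and real Graph DBs use their own internal mechanisms that this does not attempt to reproduce.

```python
from collections import deque

# Hypothetical dense friendship graph: most people know most other people.
FRIENDS = {
    "a": ["b", "c", "d"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["a", "b", "c", "e"],
    "e": ["d"],
}


def reachable_within(graph, start, max_hops):
    """Breadth-first traversal that expands each node at most once."""
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not look beyond the requested distance
        for neighbor in graph.get(node, []):
            if neighbor not in visited:  # skip nodes already reached by another path
                visited.add(neighbor)
                queue.append((neighbor, hops + 1))
    visited.discard(start)
    return visited


class CachedTraversal:
    """Remembers answers to repeated traversals; a stand-in for a learned index."""

    def __init__(self, graph):
        self.graph = graph
        self.cache = {}
        self.misses = 0  # how many times a real traversal had to run

    def reachable_within(self, start, max_hops):
        cache_key = (start, max_hops)
        if cache_key not in self.cache:
            self.misses += 1
            result = reachable_within(self.graph, start, max_hops)
            self.cache[cache_key] = frozenset(result)
        return self.cache[cache_key]


traversal = CachedTraversal(FRIENDS)
first = traversal.reachable_within("a", 2)
second = traversal.reachable_within("a", 2)  # answered from the cache
```

A cache like this has to be invalidated whenever nodes or edges change, which is one reason a production system would need something more careful than this toy.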
Particular Opportunities and Project Characteristics
Some sample use cases:
July 23, 2014
Bill Vorhies, President & Chief Data Scientist – Data-Magnum - © 2014, all rights reserved.
About the author: Bill Vorhies is President & Chief Data Scientist of Data-Magnum and has practiced as a data scientist and commercial predictive modeler since 2001. He can be reached at:
This original blog can be viewed at:
All nine lessons can be downloaded as a White Paper at: