The internet has improved drastically since its origins as a network linking university computers for collaborative research. It has revolutionised human behaviour, changing the way we communicate, conduct business, and create and consume information. It has become very sophisticated and, at the same time, very convenient to use. Indeed, many modern apps and conveniences rely on the internet, often without us realising.
The internet is a global computer network in which information stored on one computer is reached from another through a URL, or Uniform Resource Locator. Often a user does not know the exact URL of the information they want. To find it we use search engines such as Google or Bing, which search indexes of the web for relevant information. These indexes rely on extracted keywords and other metrics such as links with other sites and visitor numbers. Although this is a great breakthrough in technology, computers and search engines still struggle with the subtleties of meaning and context that often determine whether something is relevant or not. Machine learning and semantic technologies are being developed to help machines understand context, meaning, relationships and other semantically challenging concepts.
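As a rough illustration of the keyword approach, and of its limits, a search index can be sketched as an inverted index mapping each term to the documents containing it. This is a toy sketch, not how any production engine works, and the sample documents are invented for the example:

```python
# Toy inverted index: maps each keyword to the documents containing it.
# A minimal sketch of keyword-based search; real engines add ranking
# signals such as link analysis and visitor metrics.

documents = {
    "doc1": "Robert wrote a book about graph databases",
    "doc2": "The company published its annual report",
    "doc3": "Book a flight to London",
}

# Build the index: term -> set of document ids.
index = {}
for doc_id, text in documents.items():
    for term in text.lower().split():
        index.setdefault(term, set()).add(doc_id)

# A keyword query for "book" matches both senses of the word --
# the index cannot tell a publication from a reservation.
print(sorted(index["book"]))  # ['doc1', 'doc3']
```

Both documents match the query even though only one is about a book as a thing you read, which is exactly the context problem semantic approaches aim to address.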
A cornerstone of the semantic web is its use of newer graph-based approaches and technologies, such as the RDF and SPARQL W3C initiatives. Given that the internet is a giant web of connected data, this model works well compared with traditional relational techniques, where it has been necessary to structure data in ways less suited to expressing complex relationships such as hierarchies.
RDF, or Resource Description Framework, is a way of describing data so that it can be queried using the SPARQL language. Using RDF is beneficial because, instead of relying on keywords alone, search engines and browsers can also read the RDF data and understand the concepts, references and context of a web page: for example, whether it is about a book, a product, a company or any of the many other things a user might be searching for. RDF works in a simple way, dividing information into triples of subject, predicate and object. Take "Robert was born in 1986": Robert is the subject, "was born in" is the predicate and 1986 is the object. These triples can then have relationships with each other, providing even more related information, or Linked Data as it is often known. This data can be easily accessed by using SPARQL to traverse the resulting knowledge graph: for example, Robert, who was born in 1986, lives in London and has three children; Robert knows John; John has one child; and so on. Flexibility is king with RDF, and adding new relationships or concepts is much easier than with traditional techniques such as adding rows and columns to a table.
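The triple model above can be sketched in plain Python. This is a toy in-memory store, not a real RDF library such as rdflib, and the facts mirror the illustrative Robert and John example from the text:

```python
# Toy in-memory triple store illustrating the subject-predicate-object
# model and a simple graph traversal over Linked Data.

triples = [
    ("Robert", "born_in", 1986),
    ("Robert", "lives_in", "London"),
    ("Robert", "has_children", 3),
    ("Robert", "knows", "John"),
    ("John", "has_children", 1),
]

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern (None = wildcard),
    loosely mimicking a SPARQL basic graph pattern."""
    return [
        (s, p, o) for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# Traverse the graph: who does Robert know, and how many children
# do they have?
for _, _, person in query("Robert", "knows"):
    for _, _, n in query(person, "has_children"):
        print(person, "has", n, "child(ren)")

# Adding a new relationship is just appending a triple -- no schema
# change, unlike adding columns to a relational table.
triples.append(("John", "lives_in", "Paris"))
```

The `query` function plays the role that a SPARQL pattern like `?person :has_children ?n` would play against a real triple store; the flexibility point is the final line, where a new fact is added without restructuring anything.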
Semantic technology is still in the early stages of its lifecycle. As the need for machine learning grows, it will change the way we organise, search and interact with data. It will drastically increase scale, efficiency and capability, providing users with an intuitive, tailored and insight-rich snapshot of a dataset.
For more interesting articles and innovative data consulting services please visit our website.