
Scene Graphs and Semantics

  • Kurt Cagle 
Image: a young woman drinks coffee on a railway station platform, with a train in the background. Copyright Adobe Stock.

It is nearly certain that, if you have ever played a 3D video game, watched a CGI-effects-laden movie, or seen increasingly hyperrealistic imagery, you have encountered a scene graph without realizing it. Scene graphs are pervasive in everything from media to medicine, from augmented reality to industrial digital twins, and they are increasingly playing an important role within the context of the metaverse.

At its simplest, a scene graph is a description of a scene that positions multiple digital objects within a given context. While called a graph, most scene graphs are, technically speaking, trees, with items positioned first relative to a root, and from there relative to one another. For instance, suppose that you had a scene in which a girl named Leila stands on a train platform, wearing a summer dress, a straw hat, and Mary Jane shoes over white socks, clutching a sketchpad in one hand. A train pulls up, the doors open, and the girl steps aboard even as the doors close. A moment later, the train pulls away from the station, the girl moving in sync with the train, leaving the station empty of her presence.

The scene graph in this particular case is broken up into nodes. The root node is the scene itself, which may be self-contained or may be part of a larger graph. Within this scene you have the train platform, the walls, the tracks, and the tunnels. Since the girl’s motion is constrained to either the top of the platform or the train itself, she may actually be a child node of the platform (she doesn’t have to be, but setting the platform’s surface to a height of 0 makes positioning her much easier).

The girl similarly consists of her figure, perhaps her hair (swapping out hair is a common technique), her dress, hat, socks, and shoes, the latter overlapping the former. You can even break her figure into individual limbs, including shoulder, arm, hand, and fingers, with the notebook clutched in the hand and sharing roughly the same coordinate system as the hand.

When the train pulls into the station, this represents another object (or more properly another instance of type Train) that has its own external appearance and internal coordinate system, moving relative to the platform.

  • Root
    • Central Station
      • Platform C
        • Train 512
        • Girl – Leila
          • Hair 1
          • Dress 1
          • Body
            • Neck
              • Head
                • Hat
            • Shoulder
              • Arm
                • Hand
                  • Notebook

Notice that the relationships that exist are not necessarily containment relationships, but rather reflect changes in coordinate systems, such that in moving one such node in the graph, you are in effect changing the coordinate system of the moved object relative to its parent object. This makes sense, of course – if the girl swivels her neck, her head twists, as does the hat upon her head. If the girl moves from one location on the platform to another, everything in her particular scene graph moves relative to her.

This recursive node graph is very useful, because it means that transformations can be calculated as matrices multiplied by one another. These are operations that GPUs perform very quickly, while CPUs, which work more sequentially, handle them far more slowly. It is part of the reason why extended reality (aka the metaverse and arguably Web 3.0) has as its underlying hardware platform the GPU, even as Web 1 and Web 2 were largely predicated upon the CPU. Rendering is mathematically intensive, and even rigging – the positioning and posing of 3D models within a scene – can be compute-critical.
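As a rough illustration of that matrix composition, here is a minimal pure-Python sketch. The SceneNode class, the helper functions, and the offsets are all hypothetical and chosen only to mirror the Leila example; a real engine would hand this arithmetic to the GPU or an optimized math library.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(x, y, z):
    """Homogeneous 4x4 matrix that moves a point by (x, y, z)."""
    return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]]

def rotation_y(degrees):
    """Homogeneous 4x4 matrix that rotates about the vertical (y) axis."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

class SceneNode:
    def __init__(self, label, local, parent=None):
        self.label = label
        self.local = local    # transform relative to the parent node
        self.parent = parent

    def world(self):
        """Compose local transforms from the root down to this node."""
        if self.parent is None:
            return self.local
        return mat_mul(self.parent.world(), self.local)

    def world_position(self):
        """Origin of this node expressed in root coordinates."""
        m = self.world()
        return (m[0][3], m[1][3], m[2][3])

# Hypothetical offsets, chosen only for illustration.
root = SceneNode("Root", translation(0, 0, 0))
platform = SceneNode("Platform C", translation(10, 0, 0), root)
girl = SceneNode("Girl - Leila", translation(2, 0, 3), platform)
head = SceneNode("Head", translation(0, 1.5, 0), girl)
hat = SceneNode("Hat", translation(0, 0.25, 0.5), head)

before = hat.world_position()

# Walking along the platform changes only the girl's local matrix...
girl.local = translation(5, 0, 3)
after_walk = hat.world_position()

# ...and turning the head changes only the head's local matrix.
head.local = mat_mul(translation(0, 1.5, 0), rotation_y(90))
after_turn = hat.world_position()
```

Only one local matrix is edited in each step, yet the hat follows along both times, because its world transform is recomputed as the product of every matrix on the path from the root.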

It’s also worth noting that each node is actually an instance – it is not just a Train pulling into the station, but the 512 Line pulling into Central Station. This means that each node is conceptually unique (even if a node is duplicated, the copy will have a different identifier and likely a different location in space). Even the Root node is unique, though it is analogous to a blank node in Turtle.

Girls on Train Platforms With Turtle Shaped Backpacks

Speaking of Turtle, the same scene graph can be rendered readily in the Terse RDF Triple Language (Turtle) format as follows:

prefix : <https://www.example.com/SceneNode>
prefix Class: <https://www.example.com/Class>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

:_Root a Class:_SceneNode;
      :hasLabel "Root"^^xsd:string;
      :hasChildNode ( :_CentralStation )
      .
:_CentralStation a Class:_SceneNode;
      :hasLabel "Central Station"^^xsd:string;
      :hasChildNode ( :_PlatformC )
      .
:_PlatformC a Class:_SceneNode;
      :hasLabel "Platform C"^^xsd:string;
      :hasChildNode (
             :_Train512
             :_Girl-Leila )
      .
:_Train512 a Class:_SceneNode;
      :hasLabel "Train #512"^^xsd:string;
      .
:_Girl-Leila a Class:_SceneNode;
      :hasLabel "Girl - Leila"^^xsd:string;
      .

where the default namespace here is the SceneNode namespace (e.g., :_Train512 is actually short for SceneNode:_Train512). One interesting implication of all this is that any item within the scene graph is a scene node, regardless of what else it is.

This is a decomposed tree. If we create a new property called SceneNode:isSameAs, declared as a subproperty of owl:sameAs, the tree can actually be built up using blank nodes:


prefix : <https://www.example.com/SceneNode>
prefix Class: <https://www.example.com/Class>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

[ a Class:_SceneNode;
  :isSameAs :_Root;
  :hasLabel "Root"^^xsd:string;
  :hasChildNode ([
          a Class:_SceneNode;
          :isSameAs :_CentralStation;
          :hasLabel "Central Station"^^xsd:string;
          :hasChildNode ([
                a Class:_SceneNode;
                :isSameAs :_PlatformC;
                :hasLabel "Platform C"^^xsd:string;
                :hasChildNode ([
                      a Class:_SceneNode;
                      :isSameAs :_Train512;
                      :hasLabel "Train #512"^^xsd:string;
                      ] [
                      a Class:_SceneNode;
                      :isSameAs :_Girl-Leila;
                      :hasLabel "Girl - Leila"^^xsd:string;
                ])
          ])
  ])
]
  .

Blank nodes are not directly addressable (that is to say, they don’t have a URL of their own), which is why you need to add the SceneNode:isSameAs property to make this useful as a structure. In point of fact, the nesting is largely syntactic sugar – the above Turtle will automatically convert into a flat set of assertion triples when loaded. This has a very useful implication: it is not necessary to tunnel into a tree structure to find a node. For instance, the following SPARQL query can find the identifier for “Girl - Leila” in a scene graph without walking the tree, even if that scene graph is several hundred levels deep:

prefix : <https://www.example.com/SceneNode>
prefix Class: <https://www.example.com/Class>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select ?girl where {
        ?bnode :isSameAs ?girl.
        ?bnode a Class:_SceneNode.
        ?bnode :hasLabel "Girl - Leila"^^xsd:string.
        }

One other aspect that makes this representation especially useful is that, when you have a tree, navigating along a particular relationship is simple. For instance, let’s say that you wanted to find the node that directly contains the girl. The relationship SceneNode:hasChildNode (shown above as :hasChildNode) can be traversed in reverse:



prefix : <https://www.example.com/SceneNode>
prefix Class: <https://www.example.com/Class>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

select ?parentNode ?parentLabel where {
        ?bnode :isSameAs ?girl.
        # :hasChildNode points at an RDF list, so step through its cells
        ?bParentNode :hasChildNode/rdf:rest*/rdf:first ?bnode.
        ?bParentNode :isSameAs ?parentNode.
        ?bParentNode :hasLabel ?parentLabel.
        values ?girl { :_Girl-Leila }
}

Thus, when given the value :_Girl-Leila this will return the tuple ( :_PlatformC, "Platform C" ). You can also extract the ancestor nodes for the tree that Leila is in:

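prefix : <https://www.example.com/SceneNode>
prefix Class: <https://www.example.com/Class>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

select ?ancestorNode ?ancestorLabel where {
        ?bnode :isSameAs ?girl.
        # zero or more parent-to-child hops, each one passing through an RDF list
        ?bAncestorNode (:hasChildNode/rdf:rest*/rdf:first)* ?bnode.
        ?bAncestorNode :isSameAs ?ancestorNode.
        ?bAncestorNode :hasLabel ?ancestorLabel.
        values ?girl { :_Girl-Leila }
}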

Many SPARQL engines will return this as a table with the first entry being the initial node (the girl) and the final entry the root node, although SPARQL itself guarantees no particular row order unless the query includes an ORDER BY clause.
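To make the mechanics of that ancestor walk concrete, here is a small pure-Python sketch over a set of triples. It assumes the child lists are stored the way Turtle collections expand, as chains of rdf:first and rdf:rest cells; the helper names and the string identifiers are hypothetical stand-ins, and a real triple store would use indexes rather than scanning.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"
HAS_CHILD = ":hasChildNode"

def add_children(triples, parent, children):
    """Encode `parent :hasChildNode ( c1 c2 ... )` as rdf:first/rdf:rest cells."""
    if not children:
        triples.add((parent, HAS_CHILD, RDF_NIL))
        return
    cells = [f"_:list_{parent}_{i}" for i in range(len(children))]
    triples.add((parent, HAS_CHILD, cells[0]))
    for i, child in enumerate(children):
        triples.add((cells[i], RDF_FIRST, child))
        nxt = cells[i + 1] if i + 1 < len(children) else RDF_NIL
        triples.add((cells[i], RDF_REST, nxt))

def objects(triples, subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def children_of(triples, parent):
    """Walk the list hanging off :hasChildNode and collect its members in order."""
    found = []
    for cell in objects(triples, parent, HAS_CHILD):
        while cell != RDF_NIL:
            found.extend(objects(triples, cell, RDF_FIRST))
            rest = objects(triples, cell, RDF_REST)
            cell = rest[0] if rest else RDF_NIL
    return found

def parent_of(triples, node):
    """Traverse :hasChildNode in reverse; None means the node is a root."""
    for candidate in {s for s, p, o in triples if p == HAS_CHILD}:
        if node in children_of(triples, candidate):
            return candidate
    return None

def ancestors(triples, node):
    """Return the node itself, then its parent, and so on up to the root."""
    chain = [node]
    parent = parent_of(triples, node)
    while parent is not None:
        chain.append(parent)
        parent = parent_of(triples, parent)
    return chain

triples = set()
add_children(triples, ":_Root", [":_CentralStation"])
add_children(triples, ":_CentralStation", [":_PlatformC"])
add_children(triples, ":_PlatformC", [":_Train512", ":_Girl-Leila"])

chain = ancestors(triples, ":_Girl-Leila")
print(chain)
```

Because this walk climbs one parent at a time, the order of the result is explicit here, which is not something a query result provides on its own.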

Taking a Train Trip

There is a great deal more that you can do here, especially if you want to work with metadata from additional namespaces. For instance, elsewhere there may be another entry about the train in question. This bit of metadata likely is NOT directly in the scene graph, but that’s okay – the data can reside anywhere within the federated space of the graph. For now, let’s assume that all of the data is in the same graph (named graphs are worth their own article).

A train is a particularly tricky thing to model, because there may be any number of physical trains that are used to go from point A to point B. Now we can spend a lot of time attempting to model the train and everything about it, but in reality what is more interesting here is that there is a train trip that is conducted by an otherwise anonymous train engine. Since I didn’t want to spend a lot of time modeling something that has only a minor impact, I went off to Schema.org and typed in Train, and TrainTrip and a few other related terms came up (https://schema.org/TrainTrip).

Rule #1 of ontologies – unless you MUST model, don’t. Someone has probably already done something like it, and you are as likely to find something on schema.org as anywhere.

From here, we can describe the train trip to the next station.

prefix : <https://www.example.com/SceneNode>
prefix schema: <https://schema.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

:_Train512 a schema:TrainTrip;
       schema:trainName "The Occident Express."^^xsd:string;
       schema:trainNumber "512"^^xsd:string;
       schema:departureStation :_CentralStation;
       schema:departurePlatform :_PlatformC;
       schema:arrivalStation :_WestStation;
       schema:arrivalPlatform :_PlatformE;
       schema:departureTime "08:15:00Z"^^schema:Time;
       schema:arrivalTime "11:25:00Z"^^schema:Time;
       .

There are several things of note. The first is that we’re mixing two ontologies – a SceneNode ontology that specifies the arrangement of things relative to one another, and schema.org, which specifies metadata. What’s more, the same identifier SceneNode:_Train512 is typed against classes from two different ontologies for two different things – a scene node and a train trip. This kind of multi-classing makes sense because each ontology represents a different model of the same thing (to a first approximation anyway). If you don’t like all the namespaces, you can also simplify it by changing the default namespace for the resource:

prefix SceneNode: <https://www.example.com/SceneNode>
prefix : <https://schema.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

SceneNode:_Train512 a :TrainTrip;
       :trainName "The Occident Express."^^xsd:string;
       :trainNumber "512"^^xsd:string;
       :departureStation SceneNode:_CentralStation;
       :departurePlatform SceneNode:_PlatformC;
       :arrivalStation SceneNode:_WestStation;
       :arrivalPlatform SceneNode:_PlatformE;
       :departureTime "08:15:00Z"^^:Time;
       :arrivalTime "11:25:00Z"^^:Time;
       .

This also showcases the segregation of information while still allowing for intermixing. The scene graph contains positional and orientation information (essentially transformation matrices) in a way that can be quickly accessed and transformed. This is not relevant information for the train traveler, however, even when she switches from one coordinate system to another (she walks from the platform into the arriving train).

  • Root
    • Central Station
      • Platform C
    • Train 512
      • Girl – Leila

In this particular case, the girl’s scene node is no longer parented to the Platform node, but rather is attached to the Train’s node. In essence, the girl has switched local scene graphs, and her movements are now circumscribed by the geometry of the train, not that of the station. When the train moves out, the compositor that controls the motion will move the girl relative to the inside of the train, and the coordinate system of the platform no longer makes as much sense.
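The reparenting step described above can be sketched in a few lines of Python. This is a simplified illustration that tracks only translation offsets (no rotation or scale), and the SceneNode class, its reparent method, and the coordinates are all hypothetical.

```python
class SceneNode:
    def __init__(self, label, offset=(0.0, 0.0, 0.0)):
        self.label = label
        self.offset = offset      # position relative to the parent node
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def world_position(self):
        """Sum the offsets from this node up to the root."""
        x, y, z = self.offset
        if self.parent is not None:
            px, py, pz = self.parent.world_position()
            x, y, z = px + x, py + y, pz + z
        return (x, y, z)

    def reparent(self, new_parent):
        """Move this node under new_parent without changing where it appears."""
        wx, wy, wz = self.world_position()
        if self.parent is not None:
            self.parent.children.remove(self)
        nx, ny, nz = new_parent.world_position()
        # Re-express the same world position in the new parent's coordinates.
        self.offset = (wx - nx, wy - ny, wz - nz)
        new_parent.add_child(self)

# Hypothetical layout: the train sits alongside the platform.
station = SceneNode("Central Station")
platform = station.add_child(SceneNode("Platform C", (0.0, 0.0, 0.0)))
train = station.add_child(SceneNode("Train 512", (0.0, 0.0, 4.0)))
girl = platform.add_child(SceneNode("Girl - Leila", (6.0, 0.0, 3.0)))

girl.reparent(train)                  # she steps through the doors
stepped_in = girl.world_position()    # unchanged at the moment of the switch

train.offset = (50.0, 0.0, 4.0)       # the train pulls away
carried = girl.world_position()       # she is carried along with it
```

At the instant of the switch the girl stays exactly where she was on screen; only the coordinate system she is measured against changes, which is why moving the train afterwards carries her with it.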

In real life, you don’t actually jump from one scene graph to another. However, in games and VR environments, such jumps happen all the time, because the more that you can reduce the computational overhead, the smoother the illusion of animation becomes. It’s also worth noting that you can similarly reduce computation by treating windows and similar decorative spaces as viewports through which you are seeing pre-rendered scenes. Note that the ability to disconnect a node graph from one parent node and attach it to another applies just as readily to exiting the train. In effect, you have moved from the Station A platform to the train to the Station B platform by using the same door to teleport between two disconnected scene graphs.

Scene graphs are an important, even crucial part of any extended reality scenario, including industrial IoT settings. Right now they aren’t very heavily tied into semantic systems, but they are a natural fit for them, especially when it comes to composing multiple IoT devices together. If, for instance, you’re designing a warehouse system with autonomous drones retrieving goods from specific boxes, the ability to model that space as individual components is critical, especially if you can then associate metadata to tie the actions of such components together. At an absolute minimum, this allows you to test specific configurations of shelving, robots, human access points, and containers to see which works and which is a recipe for disaster.

However, it’s not hard to go to the next step, in which you bind these digital twins to their physical counterparts through beacons, sensors, and actuators. It is this particular use of the “Metaverse” that may very well dominate over the next few years.