Subscribe to DSC Newsletter

Temporal Databases: Why you should care and how to get started (Part 3 of 3)

This is the third article in a series.  The first article is available here. The second article is available here.

Not every database requires a temporal database implementation, but some do. We can help you get started. As discussed in our previous two articles, the SQL-2011 standard included clauses for the definition of temporal tables as part of the SQL/Foundation. However, this standard is very new and not yet widely adopted. For now, most of you will need to extend your current database tables to incorporate temporal concepts. We have shown you that extending your current database tables is relatively easy.

In this article we’ll focus on the challenges in extending your current tables into temporal tables, and why implementing a true temporal database is easier.


The Challenges of Implementing a Temporal Table

  • Challenge #1. SQL lacks a time-range data type, so you have to use two different time columns to capture that information. As a consequence, both time columns must become part of the primary key, so they shouldn’t be allowed to be NULL. (You could use NULL, but in certain cases you may get in trouble. For example, read this blog: The Index You’ve Added is Useless. Why? .) To avoid a NULL value, the TO_DATE was given a non-null value of 9999-12-31 for rows that are currently valid in time.
    • Why a temporal database is easier -> Temporal databases support a time-range data type. The existence of this time range abstraction simplifies the design. Considerations, such as the possibility of a NULL in the from/to primary key column, are no longer relevant.
  • Challenge #2. Without the FROM_DATE and TO_DATE columns, the primary keys would be LOT_ID_NUM and PEN_Once you add the FROM_DATE and TO_DATE columns, the primary key must include four columns: FROM_DATE, TO_DATE, LOT_ID_NUM and PEN_ID. The same lot and pen could exist in different points in time; they are no longer unique for each row, and you need to include the time range in order to maintain uniqueness. Defining the FROM_DATE and TO_DATE columns as part of the primary key is a challenge with non-temporal databases. In a non-temporal database, the semantic meaning now is something like ”LOT_ID_NUM, PEN_ID, FROM_DATE and TO_DATE are the unique identifiers”.
    • Why a temporal database is easier -> A temporal database would allow you to define justLOT_ID_NUM and PEN_ID as the primary keys, which would have the semantic meaning of “LOT_ID_NUM and PEN_ID are the unique identifiers at any moment in time.”
  • Challenge #3. All queries must be time-aware. Even if you only want to query the last valid data in time, you still have to add a WHERE clause to make sure you get just the latest data. In general, all your queries will now be a little more complex (or a lot more complex).
    • Why a temporal database is easier ->
      • If you were using a temporal database, your tables would track time-varying validity by default. You would be querying the latest valid data (without a WHERE clause) by default.
      • You also would get access to other nice features such as “time breakpoints.” For example, you could set a breakpoint at 2012-12-01, and from that point on all your queries would be for data that was valid until 2012-12-01 only; for example, “select the pens that contained cattle from lot 55 before 2012-12-01”. Your breakpoint could also be a time range like TIME_RANGE(2012-11-01,2012-12-01); for example, “select the pens that contained cattle from lot 55 between 2012-11-01 and 2012-12-01”.
  • Challenge #4. All your updates are now more complex. Every update now involves changing one (or more) existing row(s) and creating one (or more) new row(s).
    • Why a temporal database is easier ->
      • A temporal database will automatically ensure that once a row is updated its prior information is not lost. In a temporal database you are not really updating rows but instead changing their time validity.
  • Challenge #5. In a temporal table, you don’t delete data. You change the time range in order to indicate that the data is not valid anymore. If different users and applications are accessing your tables, it is difficult to control and verify that everybody respects the assumptions for this table.
    • Why a temporal database is easier->
      • A temporal database will ensure that all users/applications executing queries against the “current valid time” will receive accurate results. Invalid/deleted data will not be returned.
      • A temporal database will automatically enforce the rules for a DELETE operation to ensure prior data is not lost, regardless of the client.

Real Temporal Database Solutions

Extending your current database tables to address temporal requirements is complex, but it will work. You would be able to re-wind your data to any point in the past accurately and completely.

However, using true temporal database tools will make your life much more simple. Vendor provided tools will enable you to take advantage of the real power and value of a temporal database. Many vendors provide extensions to their database products that allow you to have access to features that you would expect from a temporal database. In other cases, the latest commercially available version of the database provided by your database vendor may already include temporal features, and in that case, you don’t need any add-ons.

Oracle, PostgreSQL, Teradata, and IBM are among the few major database vendors that provide temporal support in one way or another. For example, DB2 and Oracle already include temporal features as part of the standard product version, but you need to read the documentation carefully; they are not necessarily compatible or 100% compliant with SQL:2011 because of syntax differences.

At Sullexis, we have experience with Oracle Workspace Manager. This particular Oracle extension lets you create “sessions.” These sessions represent data valid until a certain point of time, and they allow you to have many users working concurrently on the same data and making changes to their own copy, which later can be merged with other sessions or with the original tables. With Oracle Workspace Manager you can keep current, proposed, and historical versions of data in the same database. This is amazing for “what if” scenarios and other analysis tasks. Companies and application vendors that need to store session data in relational tables also heavily use this feature. For more information about Oracle Workspace Manager, read this white paper.

Summary

We have shown you the basic concepts and provided a design that would allow you to extend your database table to become temporal tables. The table design described in this article would have allowed Hudson Foods to “rewind” and query the data as it was at a certain point in time. The lack of this feature cost $25 million. In other industries or other kinds of incidents, this inability to rewind the data could result in even steeper penalties or even bankruptcy. Consider whether your organization needs this functionality. If it does, you can extend your current tables, or you can implement a true temporal database.

Views: 8421

Tags: Manager, Oracle, RDBMS, Snodgrass, Workspace, analytics, bitemporal, business, data, database, More…databases, intelligence, temporal, time, transactional, valid

Comment

You need to be a member of Data Science Central to add comments!

Join Data Science Central

Videos

  • Add Videos
  • View All

© 2019   Data Science Central ®   Powered by

Badges  |  Report an Issue  |  Privacy Policy  |  Terms of Service