Back to the Temporal Records (in Relational Databases)
I really don't like to leave things stewing in my mind for too long, so I've been trying to resolve some of the issues raised in my previous blog post. The primary issue resides in trying to model data generally.
Some tables in the database for Journey will have time values included: either zero or more single time values and zero or more pairs. Using a database as a repository of facts, the default assumption is that if a value is currently stored it is true now but no assertion is made as to whether that value was true in the past, will be true in the future, or is true over any interval of time other than this very instant.
The issue arises that we need to store historical data, so we can differentiate what is true and what was true. We also need to be able to enter data about what will be true in the future. The presence of temporal data makes the question of "when" clear for data that it is associated with, but muddies the water for any data that lacks the temporal quality.
One thing that we do keep is an audit of changes. The history of values for all fields is kept (including who made the change). This history may be truncated as space necessitates, but it is one way to track previous values. One application for this is user directed "undo". This is not related to atomic transactions, the database technology handles those. The issue with the audit table is it makes no distinction between two very different types of changes to a given record: correction of an error vs. updating facts.
The book I mentioned in my previous post, Temporal Data and the Relational Model (Date, 2003), covers temporal data as it relates to updating facts and keeping track of past and future facts. The assumption here too is that correction of error is as simple as updating the record. The relational model does not intrinsically handle the case to tracking all previous "incorrect" values, though it can certainly support a problem domain where that is needed. The approach for "true" data over time in the book is thorough.
There are several temporal properties about data that we care about for Journey, these have been somewhat revised since the last post:
- Current - The record represents a truth that is currently applicable. This condition is mutually exclusive with Past and Future. It is implied by Past Pending.
- Past Pending - The record represents a truth that is current and there is a known ending time.
- Past - The record represents something that was true at some point in the past, but is no longer true. Mutually exclusive with Current and Future.
- Future - The record represent something that will be true at some point in the future, but is not yet true. Mutually exclusive with Past and Current.
- Start Indefinite - The record has no start date.
- End Indefinite - The record has no end date.
- Ubiquitous - The record is always true.
To better answer the question of "how to differentiate between records with no temporal data and those with no start" I decided to redefine the use of "no temporal data" to be more in line with the default assumption in relational databases: currently true. More specifically, if no temporal data is associated with a record, is is assumed currently true recognizing that it could change at any time. In contrast, there is a specific case where data associated with both an start and end date can be considered ubiquitously true ,though it may change in the sense that it can of course be corrected.
The limitation of this method in expressing time is that it is not directly possible hold to both the tenant that Nulls should not be stored while also allowing the database to express that a record exists with while not being true at any particular time. This limitation is not a problem in Journey, and is generally possible to model indirectly in other ways.
The truth tables above include options for Null (n) in both the start and end time value. This does not indicate a stored Null, but rather the absence of a column for that purpose. Null is treated differently than "Unknown" (u) in that Unknown specifically means "always" in the past or future, i.e. until the beginning of time for start fields and until the end of time for end fields.
Unillustrated in the first post, the table on the right shows values returned when comparing two times. Null (n), Unknown (u), Past (p), Current (c), Future (f).
The chart below illustrates the results comparing a single point in time (presumably the current time) to a range. The results column "u" in the chart below is "Ubiquitous", not "Unknown".

The semantics have changed slightly. Start Indefinite (previously Past Indefinite), and End Indefinite (previously Future Indefinite) always indicate a null or unknown value in their respective columns. Like Past Pending, Ubiquitous is a special case of Current.
The implication of this design is that not specifying a start date results in storing the fact as having always been true at least to the present, similarly not specifying an end date implies that it will continue to be true in perpetuity. If a record is to be limited in time either the start or end date, if not both, must be set when the record is created to avoid creating misinformation. This can be best avoided by not allowing a default value when the intention is to be restrictive temporally. Conversely, using the default value for both start and end date results in a record that is explicitly current and semantically identical to a record from a table that keeps no temporal data.
This helps clarify some of the issues that were left hanging from my last post. I feel a lot more comfortable with this redefinition of the application of time (or lack there of) to database relations.