SQL2011Temporal

From PostgreSQL wiki
Jump to navigationJump to search

This page deals with a proposal for temporal features in PostgreSQL. It contains information about design and implementation.

(It is not fully compliant with SQL:2011)

System period temporal tables

System period temporal tables maintain historical versions of data. Modifying operations (UPDATE, DELETE, TRUNCATE) on these tables don't cause permanent changes to entries, but create new versions of them. Thus user can easily get to the past states of the table.

Following picture shows basic design of system period temporal tables. In next sections we will discuss each part in more details

Design.jpg

Current table

Current table stores current versions of rows. It has the same structure and name as original table before adding temporal support, except four technical columns (in this example begin with underscore):

  • _system_time – time range specifying when this version is valid. Versions in current table has no upper bound of validity.
  • _entry_id – id of entry. Groups multiple versions of same entry. Useful in cases when user doesn’t provide primary key for the table or in case of ID change (it will be impossible to find which versions belongs together).
  • _transaction_id – id of transaction that created this version. Serves for grouping of the versions created by the same transaction. It is also used for checking if row was previously modified by current transaction.
  • _user_id – id of user that executed modifying operation. Can serve for finding the originator of data changes.

Values for these columns are always generated by system and user cannot set them. To ensure backward compatibility, these columns can be marked as implicitly hidden. It means that they are not returned by the select * statement unless user explicitly list them and values are not required in case of the insert statement without column list (just like system columns).

Index on _system_time column is added for speed up queries for versions valid in specified time or time range.

History table

History table stores historical versions of rows, which are inserted by triggers on every modifying operation executed on current table. It has the same structure and indexes as current table, but it doesn’t have any constraints. History tables are insert only and creator should prevent other users from executing updates or deletes by correct user rights settings. Otherwise the history can be violated.

There is an internal inheritance-like relation between current and history table (current inherits history) to enable queries on whole history of rows (current and historical rows) by executing selects on history table.

History table cannot be dropped. It is dropped automatically when current relation is dropped.

Triggers

Triggers for storing old versions of rows to history table are inspired by referential integrity triggers. They are fired for each row after UPDATE and DELET and before each TRUNCATE statement. They use dynamically prepared statements which are stored in the hash table on first call and reused on subsequent calls in the session.

Operations

INSERT

Operation insert is simple. Row is inserted to current table with following values in system generated columns:

  • _system_time – tstzrange(clock_timestamp(), null, ‘[)’)
  • _entry_id – nextval from sequence
  • _transaction_id – txid_current()
  • _user_id – current_user

UPDATE

In case of update we need to ensure that old version of row is stored in history table only if it is first modifying operation on that row in current transaction. Otherwise the history may contain inconsistent states of rows (caused for example by use of deferred constraints) created inside the transaction, which will normally never be visible to other transactions.

These values are generated by system on update (_entry_id is not updated):

  • _system_time – case when _transaction_id = txid_current() then _system_time else tstzrange(clock_timestamp(), null, ‘[)’) end
  • _transaction_id – txid_current()
  • _user_id – current_user

Here is simplified example of trigger for storing old version of rows on update operation:

CREATE OR REPLACE FUNCTION update_history() RETURNS trigger AS
$$
BEGIN
    INSERT INTO person_history 
        VALUES(
            old.id,
            old.first_name,
            old.last_name,
            old.department_id
            tstzrange(lower(old._system_time), lower(new._system_time), ‘[)’)
            old._entry_id,
            old._transaction_id,
            old._user_id
        );
    RETURN null;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER update_history
    AFTER UPDATE ON person
    FOR EACH ROW
    WHEN _transaction_id <> txid_current()
    EXECUTE PROCEDURE update_history();

DELETE

On delete operation we only need to store rows being deleted into history table. Here is simplified example of trigger serving for this purpose:

CREATE OR REPLACE FUNCTION delete_history() RETURNS trigger AS
$$
BEGIN
    INSERT INTO person_history 
        VALUES(
            old.id,
            old.first_name,
            old.last_name,
            old.department_id
            tstzrange(lower(old._system_time), clock_timestamp(), ‘[)’)
            old._entry_id,
            old._transaction_id,
            old._user_id
        );
    RETURN null;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER delete_history
    AFTER DELETE ON person
    FOR EACH ROW
    EXECUTE PROCEDURE delete_history();

TRUNCATE

On truncate we only need to store all rows from table being truncated into history table. Here is simplified example of trigger serving for this purpose:

CREATE OR REPLACE FUNCTION truncate_history() RETURNS trigger AS
$$
BEGIN
    INSERT INTO person_history 
        (select
            id,
            first_name,
            last_name,
            department_id
            tstzrange(lower(_system_time), clock_timestamp(), ‘[)’)
            _entry_id,
            _transaction_id,
            _user_id,
         from person
        );
    RETURN null;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER truncate_history
    BEFORE TRUNCATE ON person
    FOR EACH STATEMENT
    EXECUTE PROCEDURE truncate_history();

DDL operations

DDL operations like ALTER TABLE, CREATE/DROP INDEX, GRANT, REVOKE etc. can be automatically propagated to history table using event triggers.

SQL Syntax extensions

Column options:

  • GENERATED ALWAYS AS – specifies that this column is generated by system and user cannot assign values to this column. Specific system generated columns:
    • SYSTEM TIME
    • ENTRY ID
    • TRANSACTION ID
    • USER ID
  • IMPLICITLY HIDDEN – implicitly hidden column is not returned by the select * statement unless user explicitly name it and values are not required in case of the insert statement without column list (just like system columns).

Command for associating history table with current table

  • ALTER TABLE <current table> ADD VERSIONING USE HISTORY TABLE <history table>

Example

CREATE TABLE person (
    id SERIAL PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department_id INTEGER REFERENCES department(id),
    _system_time TSTZRANGE GENERATED ALWAYS AS SYSTEM TIME IMPLICITLY HIDDEN,
    _entry_id BIGINT GENERATED ALWAYS AS ENTRY ID IMPLICITLY HIDDEN,
    _transaction_id BIGINT GENERATED ALWAYS AS TRANSACTION ID IMPLICITLY HIDDEN,
    _user_id INTEGER GENERATED ALWAYS AS USER ID IMPLICITLY HIDDEN 
);
CREATE TABLE person_history(LIKE person);
ALTER TABLE person ADD VERSIONING USE HISTORY TABLE person_history;