Event Triggers

From PostgreSQL wiki
Jump to navigationJump to search

Event Triggers

The Event Trigger patch series main goal is to allow for users to tweak DDL commands without having to hook in ProcessUtility(). The design has been made more general so that we can support important use cases.

Main advantages over ProcessUtility hooks

The first version of Event Triggers has very few technical capabilities to offer that we don't already have available in the form of ProcessUtility hooks, so, what's the deal here?

  • You can write event triggers functions in PL/pgSQL, no need to write them in C, build an extension, ship it to the server's file system and finally install it;
  • You can actually list the currently installed Event Triggers by doing \dy in psql;
  • In single user mode the Event Triggers are disabled, staying out of the way while you're busy fixing your production services;
  • It's possible to install more than one Event Trigger then decide as the DBA in which order to run the Event Triggers without having to decipher how the code is implemented and in which order the shared object librairies are going to be _Init()ialized.

Use Cases

We attempt at being able to solve those use cases:

  • Logical Replication
  • Auditing
  • DDL support for Extensions
  • Impact analysis/performance

Logical Replication

We want to record DDL changes in a way that allows it to be replayed elsewhere in a flexible way. So we need access to both the original text and the fully qualified text once all search paths are resolved, making both available to allow replication to decide which to use. We don't need to fire actions for every conceivable sub-statement, only enough that we can accurately reproduce what occurred. So for example, a DROP CASCADE might just need to be replayed as a DROP CASCADE. Query text is available from utility hooks, but fully cooked SQL event info is required.

Auditing

We need to be able to record Who did What to Whom/WhichObject, When they did it and whether it succeeded. So we need access to userid, username, actioncategory, actiontext, subjectcategory, subjectid, action time, rc

Performance

How long did we hold locks for? How long did the action take?

Expected Features

The first release of Event Triggers is landing in PostgreSQL 9.3 and we will have a restricted set of features in that release.

Features

So we want to provide those features:

  • Event Triggers that run either before or after a DDL command
  • User Functions should be provided with detailed information:
    • event name
    • command tag
    • operation (CREATE, ALTER or DROP)
    • object kind (TABLE, FUNCTION, VIEW, etc)
    • object OID
    • object name
    • schema name where the object lives
  • support for DROP CASCADE
  • support for dropping multiple objects
  • support for generated commands
Essentials versus Desirables

Some of these items are essential; others may not be.

For instance, the operation, object kind, and object OID seem pretty essential; it will be mighty difficult to do interesting things with the event if you do not know the nature of the event.

On the other hand object name and schema name are of more ambiguous necessity, as they may be queryable from the object's tables in pg_catalog.

  1. If an object is being dropped, then one may get its name/schema from pg_catalog in the ddl_command_start event, but once the process reaches the ddl_command_end event, that data will have been removed from pg_catalog.
  2. If an object's name is being altered, then the data in pg_catalog for name/schema will be different at ddl_command_start than it later is when the process reaches ddl_command_end. And in that case, the name/schema values that are to be returned may differ between the ddl_command_(start|end) events; what values are to be provided in each event represent a semantic that should get nailed down.

Commited patches from the series

Here's the list of what we already have:

Features still in the work for 9.3

The main features we need and which are not yet commited in 9.3 are:

  • No specific information about the objects that the command target
  • No access to the parse tree or the command string, raw or normalized
  • No support for multiple targets per command
    • only DROP can target more than one object at a time
    • and DROP CASCADE is another special case of that
  • No support for generated commands

The next point is detailing where the discussion is at about those features that we didn't commit yet.

Discussion

Some of the features are still under discussion in order to find the best design we can implement them.

As a general note it's useful to recall that most of those features have already been coded at least once (some of them received one or more redesign and refactoring following review in the 9.2 and 9.3 cycles). So it's not about adding new features that we didn't see coming before, it's about finishing a patch series to match a subset of the features proposed a couple of years ago.

The goal is to refine which subset we are ready to accept and to evaluate if we can ship with that subset as something that makes sense on its own.

For example, with no information available at all about the specifics of the objects that a command is targeting, it's hard to see which use case you can actually solve with Event Triggers.

generated and internal commands

For context, the following command will actually run several operations:

CREATE TABLE foo(id serial primary key, f1 text);

We will have CREATE SEQUENCE then a new CREATE TABLE statement using the name of the sequence we just created, then ALTER SEQUENCE and finally a CREATE INDEX command. The current implementation of PostgreSQL backend is to form a parse node from scratch and send that to ProcessUtility() again, in a recursive way.

The current patch proposal is to expose those with a CONTEXT of GENERATED that is not matched by default by any Event Trigger, and allow the user to opt-in when he's interested into that level of implementation detail.

It might not be very wise to make that implementation detail visible to users because that would make us unable to clean the mess later, should any developer has round tuits or motivation to go about that some day.

One idea is to only expose users who are willing to code their event trigger in C, and another idea is then to only expose them via a hook. The problem with the hook approach is that you need to build and expose the exact same amount of information.

DROP CASCADE / DROP OWNED support

A patch has been sent to allow to call an Event Trigger for each object that is dropped as part of a command: http://www.postgresql.org/message-id/m2fw1ieq5x.fsf@2ndQuadrant.fr

How to expose Information to Event Triggers Functions

The current proposal is to expose some magic PL variables. The already commited code exposes TG_EVENT and TG_TAG.

We're proposing to add to that the most common pieces of information as variables, adding to them those that are really cheap to build (it's a constant string):

  • Variables
    • TG_OBJECTID
    • TG_OBJECTNAME
    • TG_SCHEMANAME
    • TG_OPERATION
    • TG_OBTYPENAME
    • TG_CONTEXT
  • Accessors
    • pg_get_event_command_string()

About TG_CONTEXT, see previous point.

Command String Normalisation

The main use case for the command string is Logical Replication. In that use case, it's very important to know off hand in which schemas the objects are created (resp. dropped, altered) and to some extend which names are given to automatically named objects (indexes, constraints, sequences).

See

Features For Next Releases

Some features we intended to provide already in 9.2 will have to wait until we are ready, which will not be the case in the 9.3 timeframe.

INSTEAD OF

The idea is to be able to install an Event Trigger that takes control over an existing command, to be able to re-define it, maybe in terms of the command itself:

create event trigger my_create_extension instead of 'create extension'
   execute procedure my_create_extension();
create function my_create_extension()
        returns event_trigger
       language plpgsql
as $$
begin
   alter event trigger my_create_extension disable;
   -- do some stuff here
   create extension tg_objectid;
   -- do some more stuff here, presumably
end;
$$;

Unfortunately it's already clear that the INSTEAD OF feature implementation is too hard to get right from the first release of Event Triggers, because the call points in the backend code must be really carefully placed.

table_rewrite

The idea is to provide an event called table_rewrite that will fire any time a command that will rewrite the whole of the table is to be executed, so that the DBA can install a local policy about that (accept the rewrite only at night when it's not a full moon, say).

create table on insert

Some developers are getting used to schema less databases nowadays, and want to be have the behaviour that when they do

INSERT INTO look_me_i_dont_exist(key, value) VALUES(1, 'foo');

The table look_me_i_dont_exist is automatically created by PostgreSQL with two columns key of type integer (presumably) and value of type text here.