Logging transactions that dropped tables

In a previous post I discussed a way to find out which transaction dropped a table by examining the transaction log, in order to set a restore point to right before the table was dropped.

But what if we have the luxury of planning ahead (right? Well, let's call it the second time it happens?). Shouldn't we be able to log which transaction dropped a table, and use that? Of course we should.

The first thing one tries is then of course something like this in postgresql.conf:

log_statement='ddl'
log_line_prefix = '%t [%u@%d] <%x> '

to include the transaction id of the table. Unfortunately:

2017-02-12 12:16:39 CET [mha@postgres] <0> LOG:  statement: drop table testtable;

The 0 as a transaction id indicates that this command was run in a virtual transaction, and did not have a real transaction id. The reason for this is that the statement logging happens before the statement has actually acquired a transaction. For example, if I instead drop two tables, and do so in a transaction:

postgres=# BEGIN;
BEGIN
postgres=# DROP TABLE test1;
DROP TABLE
postgres=# DROP TABLE test2;
DROP TABLE
postgres=# COMMIT;
COMMIT

I get this interesting output:

2017-02-12 12:17:43 CET [mha@postgres] <0> LOG:  statement: DROP TABLE test1;
2017-02-12 12:17:45 CET [mha@postgres] <156960> LOG:  statement: DROP TABLE test2;

Which shows two different transaction ids (one real and one not) for statements in the same transaction. That's obviously not true - they were both dropped by transaction 156960. The transaction id just wasn't available at the time of logging.

So what can we do about that? Event triggers to the rescue!

Event triggers can fire on a number of events, one of them being a DROP TABLE. So let's see how we can leverage that.

First a quick reminder -- event triggers are per database, so if you have multiple databases you have to create your event triggers once in each database!

With event triggers we can capture the actual DROP TABLE statement, and log it either to the regular postgresql log (see caveat below) or even to a table. As usual with PostgreSQL, a trigger always calls a function (no in-line triggers), so let's start with a function:

CREATE OR REPLACE FUNCTION public.trg_log_drop()
 RETURNS event_trigger
 LANGUAGE plpgsql
AS $function$
DECLARE
  obj record;
BEGIN
  FOR obj in SELECT * FROM pg_event_trigger_dropped_objects() LOOP
    IF obj.object_type = 'table' THEN
      RAISE NOTICE 'TABLE % DROPPED by transaction % (pre-commit)',
                   obj.object_identity, txid_current();
      INSERT INTO public.droplog (tablename, dropxid)
             VALUES (obj.object_identity, txid_current());
    END IF;
  END LOOP;
END;
$function$

This trigger function will log all drop tables in two ways - it both to the system log (using RAISE NOTICE) and to a table called public.droplog. It logs only the most basic information, but it's of course also possible to add things like which user issued the drop etc. The table used for logging is just a standard table:

CREATE TABLE droplog (
  ts timestamptz NOT NULL DEFAULT CURRENT_TIMESTAMP,
  tablename text NOT NULL,
  dropxid bigint
)

And finally we have to create the event trigger itself:

CREATE EVENT TRIGGER table_drop_logger
  ON sql_drop
  WHEN TAG IN ('DROP TABLE')
  EXECUTE PROCEDURE public.trg_log_drop()

When rerunning the drop above after the event trigger has been created, it produces the following log:

2017-02-12 12:47:02 CET [mha@postgres] <156975> LOG:  TABLE public.testtable DROPPED by transaction 156975 (pre-commit)

And the following contents of the table:

postgres=# SELECT * FROM droplog;
              ts               │    tablename     │ dropxid 
───────────────────────────────┼──────────────────┼─────────
 2017-02-12 12:47:02.445561+01 │ public.testtable │  156975

Notice that little text saying (pre-commit). That is because the event trigger runs as part of the same transaction as the DROP TABLE, so if that transaction is rolled back the results of the event trigger will also be rolled back. Except for things like RAISE which log things outside the transaction context. So if I instead run:

postgres=# BEGIN;
BEGIN
postgres=# DROP TABLE test3;
DROP TABLE
postgres=# ROLLBACK;
ROLLBACK

I still get a lot of:

2017-02-12 12:52:30 CET [mha@postgres] <156978> LOG:  TABLE public.test3 DROPPED by transaction 156978 (pre-commit)

even though the transaction rolled back. But I get no entry for test3 in the table:

postgres=# SELECT * FROM droplog where tablename='public.test3';
 ts │ tablename │ dropxid 
────┼───────────┼─────────
(0 rows)

Obviously this example can be extended to cover other objects than tables if necessary, but it's usually the dropping of tables that people ask for in relation to "what transaction do I set my recovery_target_xid to". But it's simple to track multiple event tags and then just store which one it was in another column in the droplog table.


Conferences

I speak at and organize conferences around Open Source in general and PostgreSQL in particular.

Upcoming

PGConf.Asia
Dec 4-6, 2017
Tokyo, Japan
FOSDEM PGDay 2018
Feb 2-4, 2018
Brussels, Belgium
ConFoo 2018
Mar 7-9, 2018
Montreal, Canada
Nordic PGDay 2018
Mar 13, 2018
Oslo, Norway
PGDay.paris 2018
Mar 15, 2018
Paris, France

Past

2Q PGconf
Nov 6-7, 2017
New York, USA
PGConf.EU 2017
Oct 24-27, 2017
Warsaw, Poland
Inagural Oslo PUG meetup
Sep 12, 2017
Oslo, Norway
Postgres Open 2017
Sep 6-8, 2017
San Francisco, USA
PGDay.RU
Jul 5-7, 2017
St Petersburg, Russia
More past conferences