In a previous post I discussed a way to find out which transaction dropped a table by examining the transaction log, in order to set a restore point to right before the table was dropped.
But what if we have the luxury of planning ahead (right? Well, let's call it the second time it happens?). Shouldn't we be able to log which transaction dropped a table, and use that? Of course we should.
The first thing one tries is then of course something like this in postgresql.conf:
log_statement='ddl'
log_line_prefix = '%t [%u@%d] <%x> '
to include the transaction id in the log. Unfortunately:
2017-02-12 12:16:39 CET [mha@postgres] <0> LOG: statement: drop table testtable;
The 0 as a transaction id indicates that this command was run in a virtual transaction, and did not have a real transaction id. The reason for this is that the statement logging happens before the statement has actually acquired a real transaction id. For example, if I instead drop two tables, and do so in a transaction:
postgres=# BEGIN;
BEGIN
postgres=# DROP TABLE test1;
DROP TABLE
postgres=# DROP TABLE test2;
DROP TABLE
postgres=# COMMIT;
COMMIT
I get this interesting output:
2017-02-12 12:17:43 CET [mha@postgres] <0> LOG: statement: DROP TABLE test1;
2017-02-12 12:17:45 CET [mha@postgres] <156960> LOG: statement: DROP TABLE test2;
Which shows two different transaction ids (one real and one not) for statements in the same transaction. That's obviously misleading - they were both dropped by transaction 156960; the transaction id just wasn't available yet when the first statement was logged.
So what can we do about that? Event triggers to the rescue!
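A minimal sketch of that idea (function and trigger names here are just examples): an event trigger on sql_drop fires after the drop has already done its catalog work, so at that point txid_current() returns the real transaction id and we can log it together with every object that was dropped.

CREATE OR REPLACE FUNCTION log_dropped_objects()
RETURNS event_trigger
LANGUAGE plpgsql
AS $$
DECLARE
    obj record;
BEGIN
    -- pg_event_trigger_dropped_objects() lists everything dropped by the
    -- current command; log each object with the real transaction id
    FOR obj IN SELECT * FROM pg_event_trigger_dropped_objects()
    LOOP
        RAISE LOG 'dropped % % in transaction %',
            obj.object_type, obj.object_identity, txid_current();
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER log_drops
    ON sql_drop
    EXECUTE PROCEDURE log_dropped_objects();

With something like that in place, both drops in the transaction above end up in the log tagged with the same real transaction id - which is exactly what the plain statement logging couldn't give us.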
PostgreSQL can, as many other products, use SSL to secure client/server communications. It can be configured to be optional or required, to require a client certificate, or even to use the SSL client certificate to log in. However, the DBA tools are currently a bit lacking in this regard. In particular, there is no way for a DBA to see what SSL parameters are in force (or indeed if SSL is enabled at all) for an already existing connection.
There are multiple ways to see the status of a specific connection (such as the libpq PQgetssl() function, the psql startup message, or the sslinfo module). Unfortunately, all these methods have one thing in common - they are only available to the process making the connection to the database, not to a DBA inspecting the system from the outside.
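The sslinfo module, for example, can only answer for the session it is called from - fine for the connecting application, but useless for looking at somebody else's connection:

-- run from within a connection; reports on this session only
CREATE EXTENSION sslinfo;
SELECT ssl_is_used(), ssl_version(), ssl_cipher(), ssl_client_dn();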
9.4 will make it a little bit better, because log_connections now includes SSL information when the user connects, similar to:
LOG: connection authorized: user=mha database=postgres SSL enabled (protocol=TLSv1.1, cipher=ECDHE-RSA-AES256-SHA)
But this can still be a bit of a pain to figure out for existing connections, of course.
To deal with this problem, I've written a small PostgreSQL extension called pg_sslstatus. When loaded using shared_preload_libraries it will keep track of the SSL status for each connection, and expose it through a system view named pg_sslstatus like this:
postgres=# SELECT * FROM pg_sslstatus;
  pid  | ssl | bits | compression | version |        cipher        |                         clientdn
-------+-----+------+-------------+---------+----------------------+----------------------------------------------------------
 27286 | t   |  256 | f           | TLSv1   | ECDHE-RSA-AES256-SHA |
 26682 | t   |  256 | t           | TLSv1   | ECDHE-RSA-AES256-SHA | /C=AU/ST=Some-State/O=Internet Widgits Pty Ltd/CN=magnus
 26693 | f   |      |             |         |                      |
(3 rows)
It will include one row for each connection, and can then be joined with either pg_stat_activity or pg_stat_replication to view the SSL data in a bigger context:
postgres=# SELECT datname, usename, client_addr, ssl, cipher FROM pg_stat_activity INNER JOIN pg_sslstatus ON pg_stat_activity.pid=pg_sslstatus.pid;
 datname  | usename | client_addr | ssl |        cipher
----------+---------+-------------+-----+----------------------
 postgres | mha     | 127.0.0.1   | t   | ECDHE-RSA-AES256-SHA
 postgres | mha     |             | f   |
 pgweb    | pgweb   |             | f   |
(3 rows)
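Loading it is a matter of adding it to shared_preload_libraries and restarting - roughly like this, assuming the shared library is named the same as the extension (check the README in the repository for the exact steps):

# postgresql.conf
shared_preload_libraries = 'pg_sslstatus'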
The extension is available from my github account today for existing versions of PostgreSQL, under the standard PostgreSQL license. My plan is to work on having a similar view included in PostgreSQL 9.5 by default, but it's unfortunately too late to include it in 9.4 at this point.
PostgreSQL 9.1 beta1 is now available - this is a great time to start testing it and trying out all the great new features.
There have always been a number of ways to read your PostgreSQL logs remotely, over a libpq connection. For example, you can use the pg_read_file() function - which is what pgadmin does. PostgreSQL 9.1 adds a new and more convenient way (in some ways) to do this - using SQL/MED.
PostgreSQL 9.1 comes with SQL standard SQL/MED functionality. The MED is short for "Management of External Data", and as the name suggests, it's about accessing data that's external to the PostgreSQL server. The SQL/MED functionality is not (yet) complete, but it's already very useful in its current state.
In SQL/MED, there is something called a Foreign Data Wrapper, which can be compared to a driver. Using this FDW, we can create one or more Foreign Servers, each defining how to connect to a specific instance of the service - if there is one. Finally, we can create one or more Foreign Tables on each of the Foreign Servers, giving us direct access to the remote data using SQL.
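For the logs, the interesting wrapper is file_fdw from contrib, pointed at a CSV format log. A rough sketch of the setup, assuming logging_collector is on and log_destination includes csvlog - the column list has to match the csvlog format of your exact server version (this one follows the file_fdw documentation example), and the filename is just an example:

CREATE EXTENSION file_fdw;

CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE pglog (
    log_time timestamp(3) with time zone,
    user_name text,
    database_name text,
    process_id integer,
    connection_from text,
    session_id text,
    session_line_num bigint,
    command_tag text,
    session_start_time timestamp with time zone,
    virtual_transaction_id text,
    transaction_id bigint,
    error_severity text,
    sql_state_code text,
    message text,
    detail text,
    hint text,
    internal_query text,
    internal_query_pos integer,
    context text,
    query text,
    query_pos integer,
    location text,
    application_name text
) SERVER pglog
OPTIONS ( filename 'pg_log/postgresql.csv', format 'csv' );

After that, the log is just another table to query:

SELECT log_time, error_severity, message FROM pglog WHERE error_severity = 'ERROR';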
Last week I had yet another customer issue where "someone" had been issuing DDL statements in the database. And nobody knew who. Or why. But (surprise!) it broke things (and they weren't even running Slony!). There are two simple lessons to be learned from this:
In a production environment, arbitrary DDL statements are normally not run. If they are, you really need to look over your application design, because it's broken. Note that this does not include temporary tables. Things like automating the creation of new partitions are also pretty normal. But the important thing there is that it's controlled and scheduled work, not arbitrary statements.
So, you'll want to keep track of your DDL. PostgreSQL provides a very simple and good way to do this: set the configuration parameter log_statement='ddl'. The default value for this parameter is 'none', and there are also options for logging all DML and all statements, period. But for a production environment, I find the ddl option to be very useful. So useful, in fact, that I'd consider it an installation bug in most environments if it's not set. So if this parameter is not set in your production environment, now is a good time to reconsider that decision.
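Checking what a running system actually has is a one-liner:

SHOW log_statement;

If that comes back as none, set log_statement = 'ddl' in postgresql.conf and reload.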
The second thing to learn comes from the fact that once we tracked it down, it turned out that the DDL was issued from the application server. Which was running with superuser privileges. Now that's a much larger bug in the deployment, and a failure waiting to happen. There's a very simple lesson to learn from this: the application server should never run with superuser privileges. It should also not run with a user that has permissions to issue any DDL. This is simply the principle of least privilege - or at least the principle of not insanely high privileges.
Yes, there are a number of application servers and frameworks that issue their own DDL as part of their ORM. The best way to handle them is, IMHO, to have them generate the SQL output and then manually apply that using a high privilege account. Because DDL should only be issued as part of upgrades and similar things, this should not be an issue. If the application server does not support this, a workaround is to give the application server DDL permissions during the upgrade only, and then take them away as soon as the upgrade is completed.
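As a rough sketch of that workaround (appuser is just a made-up role name, and the schema is simply public here), it can be as little as:

-- before the upgrade: allow the application role to create objects
GRANT CREATE ON SCHEMA public TO appuser;
-- ... run the ORM-generated DDL as appuser ...
-- after the upgrade: take the privilege away again
REVOKE CREATE ON SCHEMA public FROM appuser;

Note that altering or dropping existing objects also requires ownership of them, so exactly what to grant depends on what DDL the ORM generates.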
And yes, you should do this on your developer systems as well, and not just in production. Because if you only do it in production, you won't notice your bugs until you have deployed. It may seem like a lot of extra work to begin with, but it really is only a little extra work once you have got the procedures in place. And it can save you a lot of forensics work once something has happened.