Anybody following this blog will know I do this every year. PGConf.EU wrapped up several weeks ago, we have now collected the statistics, and I'd like to share some numbers.
Let me start with some statistics that are not based on the feedback, but instead based on the core contents of our registration database. I've had several people ask exactly how we count our attendees when we say it's the largest PGConf.EU ever, so here are the numbers:
Our total number of attendees registered was 437. This includes all regular attendees, speakers, training attendees, sponsor tickets and exhibitor only tickets. Of these 437 people, 12 never showed up. This was a mix of a couple of sponsor tickets and regular attendees, and 3 training attendees. This means we had 425 people actually present.
We don't take attendance each day. Right after the keynote on the first day there were just over 20 people who had not yet shown up, and by the end of the conference that number was down to 12. Definitely fewer than 400 people remained late on Friday afternoon for the closing sessions, but at lunchtime the crowd was approximately the same size.
On top of the 437 registered attendees, we also had 5 further sponsor tickets that were never claimed. And we had another 59 people still on the waitlist, since we were unfortunately up against venue limits and were not able to sell all the requested tickets.
As Joe just announced, all ftp services at ftp.postgresql.org have been shut down.
That of course doesn't mean we're not serving files anymore. All the same things as before are still available through https. This change also affects any user still accessing the repositories (yum and apt) using ftp.
There are multiple reasons for doing this. One is that ftp is an old protocol and in a lot of ways a pain to deal with when it comes to firewalling (both on the client and server side).
The bigger one is the general move towards an encrypted internet. We stopped serving plaintext http some time ago for postgresql.org, moving everything to https. Closing down ftp and moving that over to https as well is another step in that plan.
There are still some other plaintext services around, and our plan is to get rid of all of them, replacing them with secure equivalents.
When you create a table in PostgreSQL, it gets assigned default permissions and a default owner. We can alter the default privileges using the very useful ALTER DEFAULT PRIVILEGES command (a PostgreSQL extension to the standard). However, there isn't much we can do about the owner, which will get set to the role that is currently active. That is, it's the main login role, or another role if the user has run the SET ROLE command before creating the table.
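For example, here is a minimal sketch of what that can look like (the appadmin role, the webapp schema and the webapp_users role are made-up names, purely for illustration):

-- Any table that appadmin creates in the webapp schema automatically
-- becomes readable and writable by the group role webapp_users.
ALTER DEFAULT PRIVILEGES FOR ROLE appadmin IN SCHEMA webapp
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO webapp_users;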
A fairly common scenario that is not well handled here is when a number of end users are expected to cooperate fully on the tables in a schema, including being able to create and drop them. Due to the ownership handling, this is not something the built-in role support deals with very well. Fortunately, it is something where we can once again use an event trigger to make the system do what we need.
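Here is a minimal sketch of that idea (not necessarily the exact code from the full post): an event trigger on ddl_command_end that hands ownership of every new table in a given schema over to a shared group role. The webapp schema and webapp_owner role are made-up names, and the creating role is assumed to be a member of webapp_owner so that the ALTER TABLE is allowed.

CREATE FUNCTION fix_table_owner()
RETURNS event_trigger
LANGUAGE plpgsql
AS $$
DECLARE
    obj record;
BEGIN
    -- Look at the commands that just completed, and re-own any new
    -- table in the webapp schema to the shared role webapp_owner.
    FOR obj IN
        SELECT * FROM pg_event_trigger_ddl_commands()
        WHERE command_tag = 'CREATE TABLE' AND schema_name = 'webapp'
    LOOP
        EXECUTE format('ALTER TABLE %s OWNER TO webapp_owner',
                       obj.object_identity);
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER fix_table_owner
    ON ddl_command_end
    WHEN TAG IN ('CREATE TABLE')
    EXECUTE PROCEDURE fix_table_owner();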
In a previous post I discussed a way to find out which transaction dropped a table by examining the transaction log, in order to set a restore point to right before the table was dropped.
But what if we have the luxury of planning ahead (right? Well, let's call it the second time it happens)? Shouldn't we be able to log which transaction dropped a table, and use that? Of course we should.
The first thing one tries is then of course something like this in postgresql.conf:
log_statement='ddl'
log_line_prefix = '%t [%u@%d] <%x> '
to include the transaction id in the log line. Unfortunately:
2017-02-12 12:16:39 CET [mha@postgres] <0> LOG: statement: drop table testtable;
The 0 as a transaction id indicates that this command was run in a virtual transaction, and did not have a real transaction id. The reason for this is that the statement logging happens before the statement has actually acquired a transaction id. For example, if I instead drop two tables, and do so in a transaction:
postgres=# BEGIN;
BEGIN
postgres=# DROP TABLE test1;
DROP TABLE
postgres=# DROP TABLE test2;
DROP TABLE
postgres=# COMMIT;
COMMIT
I get this interesting output:
2017-02-12 12:17:43 CET [mha@postgres] <0> LOG: statement: DROP TABLE test1;
2017-02-12 12:17:45 CET [mha@postgres] <156960> LOG: statement: DROP TABLE test2;
Which shows two different transaction ids (one real and one not) for statements in the same transaction. That's obviously not true - they were both dropped by transaction 156960. The transaction id just wasn't available at the time of logging.
So what can we do about that? Event triggers to the rescue!
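An event trigger on sql_drop fires after the drop has actually been performed, at which point the transaction has acquired a real transaction id, so we can log it ourselves. Here is a minimal sketch of the idea (the function and trigger names are made up, and this is not necessarily the exact code from the full post):

CREATE FUNCTION log_drop_transaction()
RETURNS event_trigger
LANGUAGE plpgsql
AS $$
DECLARE
    obj record;
BEGIN
    -- By the time sql_drop fires, the drop has been performed, so
    -- txid_current() returns the real transaction id of the drop.
    FOR obj IN
        SELECT * FROM pg_event_trigger_dropped_objects()
        WHERE object_type = 'table'
    LOOP
        RAISE LOG 'table % dropped in transaction %',
            obj.object_identity, txid_current();
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER log_drop_transaction
    ON sql_drop
    EXECUTE PROCEDURE log_drop_transaction();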
For a while now, I've been annoyed with how difficult it is to set up good backups in PostgreSQL. The difficulty of doing this "right" has pushed people to use things like pg_dump for backups, which is not really a great option once your database reaches any non-toy size. And when visiting customers over the years I've seen a large number of home-written scripts for PITR backups, most of them broken, with most of that breakage caused by APIs that were too difficult to use.
Over some time, I've worked on a number of ways to improve this situation, alone or with others. The bigger steps are:
For 10.0, there are a couple of new things that have been done in the past couple of weeks:
As we welcome a new year, we have a couple of updates to the finances and payment handling in PostgreSQL Europe that will affect our members and the attendees of our events.
First of all, PostgreSQL Europe has unfortunately been forced to register for VAT. This means that most of our invoices (details below) will now include VAT.
Second, we have enabled a new payment provider for those of you who can't, or prefer not to, use credit cards, while still allowing for fast payments.
A few weeks back, I noticed the following tweet from Michael Paquier:
And my first thought was "that can't be right" (spoiler: Turns out it wasn't. But almost.)
The second thought was "hmm, I wonder how that has actually changed over time". And of course, with today being a day off and generally "slow pace" (ahem), what better way to spend it than to analyze the data that we have? The PostgreSQL mailing list archives are all stored in a PostgreSQL database, of course, so running the analytics is a quick job.
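Just to give an idea of what that looks like, the analysis boils down to queries along these lines. The messages table and its sent_at column are assumed names for illustration; the actual archives schema is not shown here.

-- Count the number of list messages per year (hypothetical schema).
SELECT date_trunc('year', sent_at) AS year,
       count(*) AS messages
FROM messages
GROUP BY 1
ORDER BY 1;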
Today, Planet PostgreSQL was switched over from http to https. Previously, https was only used for the logged in portions for blog owners, but now the whole site uses it. If you access the page with the http protocol, you will automatically be redirected to https.
As part of this, the RSS feeds have also changed address from http to https (the path part of the URLs remains unchanged). If your feed reader does not automatically follow redirects, this will unfortunately make it stop updating until you have changed the URL.
In a couple of days we will enable HTTP Strict Transport Security on the site as well.
We apologize for the inconvenience to those of you who have to reconfigure your feeds, but we hope you agree the change is for the better.
It is now about a week since PGConf.EU 2016, and things are slowly returning to normal :) You'll have to wait a while longer for the traditional summary of the feedback post that I make every year, but there's another piece of statistics I'd like to share.
As always, Dave put the attendees-per-country statistics into the closing session slides, and we shared some of the top countries. Unsurprisingly, countries like Estonia (the host country), Germany (one of Europe's largest countries), Sweden and Russia (countries nearby) were at the top.
For those looking for more details, here are the actual statistics for all countries, not just the top ones.
A common example when talking about why it's a good thing to be able to do PITR (Point In Time Recovery) is the scenario where somebody or something (an operator or a buggy application) dropped a table, and we want to recover to right before the table was dropped, to keep as much valid data as possible.
PostgreSQL comes with nice functionality to decide exactly what point to perform a recovery to, which can be specified at millisecond granularity, or as an individual transaction. But what if we don't know exactly when the table was dropped? ("Exactly" here meaning a specific transaction, or at least a specific millisecond.)
One way to handle that is to "step forward" through the log one transaction at a time until the table is gone. This is obviously very time-consuming.
Assuming that DROP TABLE is not something we do very frequently in our system, we can also use the pg_xlogdump tool to help us find the exact spot to perform the recovery to, in much less time. Unfortunately, the dropping of temporary tables (implicit or explicit) is included in this, so if your application uses a lot of temporary tables this approach will not work out of the box. But for applications without them, it can save a lot of time.
Let's show an example. This assumes you have already set up the system for log archiving, you have a base backup that you have restored, and you have a log archive.
The first thing we do is try to determine the point where a DROP TABLE happened. We can do this by scanning for entries where rows have been deleted from the pg_class table, as this will always happen as part of the drop.
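As a small helper for that step (my own illustration, not a required part of the procedure), the object id and current filenode of pg_class can be looked up with a quick query, which makes it easier to recognize the right records in the pg_xlogdump output:

-- pg_class has oid 1259; the filenode is normally the same unless the
-- table has been rewritten (for example by VACUUM FULL or CLUSTER).
SELECT 'pg_class'::regclass::oid AS table_oid,
       pg_relation_filenode('pg_class') AS filenode;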