Why are you not logging your DDL?

Last week I had yet another customer issue where "someone" had been issuing DDL statements in the database. And nobody knew who. Or why. But (surprise!) it broke things (and they weren't even running Slony!). There are two simple lessons to be learned from this:

In a production environment, arbitrary DDL statements are normally not run. If they are, you really need to look over your application design, because it's broken. Note that this does not include temporary tables. Also things like automating the creation of new partitions are also pretty normal. But the important thing there is that it's controlled and scheduled work, not arbitrary statements.

So, you'll want to keep track of your DDL. PostgreSQL provides a very simple and good way to do this. Set the configuration parameter log_statement='ddl'. The default value for this parameter is none, and there are also options for logging all DML and all statements period. But for a production environment, I find the ddl option to be very useful. So useful, in fact, that I'd consider it an installation bug in most environments if it's not set. So if this parameter is not set in your production environment, now is a good time to reconsider that decision.

The second thing to learn comes from the fact that once we tracked it down, it turned out that the DDL was issued from the application server. Which was running with superuser privileges. Now that's a much larger bug in the deployment, and a failure waiting to happen. There's a very simple lesson to learn from this: the application server should never run with superuser privileges. It should also not run with a user that has permissions to issue any DDL. This is simply the principle of least privilege - or at least principle of not insanely high privileges.

Yes, there are a number of application servers and frameworks that issue their own DDL as part of their ORM. The best way to handle them is, IMHO, to have them generate the SQL output and then manually apply that using a high privilege account. Because DDL should only be issued as part of upgrades and similar things, this should not be an issue. If the application server does not support this, a workaround is to give the application server DDL permissions during the upgrade only, and then take them away as soon as the upgrade is completed.

And yes, you should do this on your developer systems as well, and not just in production. Because if you only do it in production, you won't notice your bugs until you have deployed. It may seem like a lot of extra work to begin with, but it really is only a little extra work once you have got the procedures in place. And it can save you a lot of forensics work once something has happened.

What's your favorite 8.4 feature?

Over the past month or so, I have been informally polling those of our customers I've been meeting with about which features in the upcoming 8.4 version of PostgreSQL they are most excited about, and most likely to have use for in the short-to-medium term. Part of the results were exactly what I expected, other parts were a bit more surprising.

First of all, the one feature that every single one said they were very much looking forward to is the updated Free Space Map implementation, which will remove two very annoying and difficult-to-get-right configuration parameters (max_fsm_pages and max_fsm_relations). So kudos to Heikki for that. Closely related to this feature is the Visibility Map, and it's reduction of VACUUM requirements. This is, however, not something a lot of our customers will have a direct use for - other than one or two, they don't actually have any issues with VACUUM (once they learned to remove VACUUM FULL from the nightly cronjobs).

When looking closer at the SQL level functionality that people have been excited above, I expected to see Common Table Expressions (CTEs), AKA Recursive Queries at the top of the list, based on requests on our lists in the future. However, it seems that at least to our customers, Window Aggregates are a lot more interesting than CTEs. I'm sure there are a lot of use-cases for recursive queries - there just seem to be even more of them for window aggregates.

Parallel restoring of dumps is also fairly high on peoples list, though most of the people that can really use it are deploying different ways to deal with it already. But they will definitely be using it once it's there. As well as the speedups of the PITR recovery and warm standby.

There are a lot of other features in the upcoming version as well. Some are small, some are large. There are still some smaller features (for example \ef in psql) that I keep missing a lot every time I end up working on an earlier version. They're all (well, most of them) great, but maybe not so obvious at first.

What's your favorite 8.4 feature?

Updates to the postgresql.org git service

As Peter has already noted, we have been planning for a while to update the service on git.postgresql.org, and as of now we just flipped the switch to the new version. We are still waiting for the DNS zone to update (yes, we could've lowered the TTL. No, we didn't think it was that important). So within 5 hours, anybody accessing git.postgresql.org through http, git or ssh will be accessing the new server.

First a note to all those people who use the git clone of the main cvs repository to do PostgreSQL development: nothing has changed for you. The URL is the same, the repository is the same, just keep pulling.

The only real difference there should be that the new server the service is hosted on has a little bit more bandwidth available, but it should not normally be noticeable.

The changes are all around the repositories that are originally hosted at the site. The changes have been made to make it easier to collaborate on projects, and to require less manual work to set things up. It is our hope that this will lower the bar of entry for those who are interested in developing PostgreSQL-related projects using git. The main changes for these users are: Login is now integrated with the postgresql.org community account system. This means unified usernames across PostgreSQL services, such as for example the wiki. You just need to upload your SSH key to enable access. Users no longer have shell accounts on the server. Instead, all users use git@git.postgresql.org as their ssh login. This makes the management and securing of the server much easier. It is now possible for the owner of a repository to delegate permissions to other users directly from a web interface. As long as the other user has also uploaded his ssh key, this will be completely automatic. It is now possible to request a new repository for your code using the same web interface. Requests for new repositories will still have to be approved before the repository is created.

Per an inventory that was made before the switch was made, several inactive repositories were not migrated to the new server. If you are missing a repository that was for some reason not migrated, feel free to contact us at [mailto:pgsql-www@postgresql.org the pgsql-www mailinglist] for a recovery - we will keep all the old files around for a while just in case.

As a final note, the source code for the git management tool itself is of course available in the git repository (when your DNS has updated).

PostgreSQL security updates are out

PostgreSQL 8.3.7, 8.2.13, 8.1.17, 8.0.21 and 7.4.25 have been released and are available for download from www.postgresql.org. They contain the fix for a low-risk security issue as well as several other minor updates. All users are advised to upgrade when possible.

Continue reading

Summary: Prague PostgreSQL Developers Day

It's been a couple of days since I got back from Prague and the local PostgreSQL Developer Day held there last week. The conference itself was on Thursday, and I arrived Wednesday evening. David Fetter had already arrived directly from FOSDEM in Brussels, but I had to go home and do some actual work in between. I met up with the group in the restaurant/bar at the bottom floor of our hotel - which happens to be an old brewery.

The conference itself started with a quick welcome from the conference organizers, and then I delivered an even shorter version of my What's new in PostgreSQL 8.4 talk from FOSDEM as the keynote. This was followed by Kristo Kaiv talking about PostgreSQL at Skype - happily enough for me (and David I assume) in English.

After Kristo, I gave a repeat of my Deploying PostgreSQL in a Windows Enterprise talk. It's a hard talk to give to an audience like this - it's pretty clear that the large majority were users of Linux/BSD/Solaris and didn't care much for windows. On the other hand, I noticed several people who were very interested in the integration aspects, so I think it was fairly successful anyway.

This was followed by a number of talks in Czech, which I will freely admit to not understanding much more than scattered words like "fsm_pages", "recovery" and so on. Simon Riggs arrived during these talks, and gave his Replication, Replication, Replication talk (the same talk he gave at FOSDEM with some minor modifications/additions). After this there was a nother couple of czech talks, and some interesting hallway discussions.

After the conference, the organizers held an auction of the blue elephants (small size - David brought three from FOSDEM) and a couple of the stressballs. Amazingly, they brought in about €60 for one of the elephants (€25 being the base price) and an impressive €15 for a stressball (sure, signed by me and David Fetter, but the base price for that one is €1..). Finally they put a nice bottle of wine in the hands of each speaker as a speaker gift - unfortunately I wasn't able to bring mine home on the plane due to liquid regulations for traveling without checked luggage.

In the evening, we met back at the same restaurant for dinner and beer. After dinner, the Czechoslovakian PostgreSQL user group was formally formed! Clearly this process was also in Czech, so I can't comment much on that - other than it's good to see the user group formally formed!

That concludes the conferences in February for me, unless something unexpected come up. Thanks to all the people involved in organizing these conferences, the social activities around them, and generally in making sure a good time was had!

FOSDEM is done

That's it - FOSDEM is over for this year. I'm sitting at the airport waiting for my flight home, which should be boarding in about an hour.

I'd say this year was at least as good as last year. The general idea was the same - we had a booth where we sold merchandise and answered questions, and a devroom where we had a full track of talks. We sold slightly less than last year, mainly because we sold almost no t-shirts at all. We ran out of the blue elephants (as usual) and the mugs long before the conference was over. We had a lot of visitors to the booth, so I think we have to declare it a big success. And a great job by all our volunteers who spend the time there instead of browsing what else the conference had to offer. An extra kudos to Gevik who remembered to buy and bring snacks for all the booth people!

The devroom was much larger than last year, which was very good - on the saturday, we filled the room for almost every talk, and there is no way we could've put that many people in the room we had last year. The downside was that it was at the very other end of the conference area, so it was a pretty long walk from the booth (and from the main area of FOSDEM) to the devroom. I think this actually reduced the number of "spontaneous" people we got in there, so if t was closer we would've had even more.

We also had a major FAIL in that the projector didn't work. This was fixed fairly quickly by the FOSDEM staff (thanks!) with a mobile projector, but it left us with a very awkward keynote since we were late and not really ready. We also had a projector with too little light for the room, so we had to have the blinds down all day making the room dark and making it hard to open the windows for ventilation. but in the grand scheme of things, those are trivial issues!

The talk track was very good. My personal favorite (and I know I'm not alone in this) was Simon Riggs talk on replication. Simon's a very good speaker, and the topic is highly interesting. I also liked Heikkis talk about FSM and Visibility Map and Selenas usergroups talk. Well, I realliy liked all talks, but if I have to choose... I think the talk track this year was a clear step up from last year. Even though Greg Stark forgot one of his talks and didn't show up until it was time to take questions.. (sorry, Greg, but you're not getting away with that unpublished)

As for the social track - well, I have yet to be to a PostgreSQL conference (and I consider "our part of FOSDEM" to be a PostgreSQL conference) that did not have a good one, and this was certainly no exception. Great people to hang out with, good food, and good beer - what more can one ask for?

A big thanks to everybody who helped out with tihs conference - both the PostgreSQL specific parts, and the big thing! Already looking forward to coming back next year!

Gearing up for FOSDEM

I'm waiting at the airport to being my travel to this years FOSDEM. Looking at who's coming, and the talks we have lined up for the PostgreSQL/BSD devroom, it looks like it will be at least as good as last year.

So if you're near Brussels, get there! And if you're at FOSDEM, be sure to stop by both our booth and our devroom!

PostgreSQL Prague Developers Day

I've been invited to speak at the PostgreSQL Prague Developers Day. I will be doing one of my "usual" talks about PostgreSQL in a Windows/Active Directory environment, as well as a short keynote around news in 8.4. As far as I know most of the talks will be in Czech, but at least mine and Simons (whom I think is confirmed to go as well) will be in English.

So if you're in or near the Czech Republic, you should be there! And if you're not - here's your excuse to travel to Prague!

What's in a version number?

The other day I came across yet another company (who shall remain nameless, but are certainly not alone) who support using PostgreSQL as a backend for their product. And the support only version 8.2.3. While I prefer using 8.3 on all production servers today, I find it perfectly normal and acceptable that they don't support that, and only 8.2. What is not acceptable to me is only supporting 8.2.3. The 8.2 branch is currently at 8.2.11, and there are several critical bugs - including security issues - patched in these versions. I would not recommend any customer of mine to use 8.2.3 today.

So why is this? The only reason I can think of is that these company simply haven't bothered to figure out how the version numbering of the products actually works. For those who don't know, a quick intro to the PostgreSQL system: Major versions are 8.0, 8.1, 8.2, 8.3 (no, you cannot reasonably claim to support "PostgreSQL version 8", which is also not uncommon, because that includes potentially incompatible future version) Minor versions are 8.2.1, 8.2.2, 8.2.3 etc. Minor versions contain only critical fixes. There are never new features introduced. There are almost never any kind of incompatible changes introduced (I say almost never, because it has* happened when required to plug a security issue)

This should make it fairly obvious that it is counterproductive to support only 8.2.3. Because you are basically forcing the customers not to apply critical security fixes.

The fact that different databases follow different patterns here obviously makes it a little bit harder for the vendor. For example, it would not make sense to say you support MySQL 5.1. Because that can include both beta and release candidate versions. And they do regularly introduce new features and change behavior between versions like 5.1.x and 5.1.y. That's just how their numbering works. I don't personally like that way of numbering releases, but if I was putting out a product saying which versions of MySQL I support, I'd certainly read up on exactly how the version numbers work first.

If you cannot take the time to figure out how the release process of your database product works, why does it matter that you put in the version number at all? It's not likely it's properly supported anyway. That statement should be about "once tested with version x", not actually supported...

And to finish off: what I'd like to see is "we support version 8.2 starting at version 8.2.3" for example. That clearly identifies which major release is supported (8.2). And it also indicates that vendor takes it upon themselves to make sure their product continue working if the customer applies relevant security patches (AKA minor releases).

Ok, that's enough ranting for today...

Todays trivial git tip

If you're like me, you are using the git.postgresql.org repository to do your PostgreSQL development, because it's much nicer to work with than CVS.

If you're also like me, you like git diff with it's nice coloring and trailing-whitespace-warnings and such features. But you're a little bit annoyed that your tabs come out as 8-characters, when the PostgreSQL source uses 4-character tabs, making diffs a bit hard to read. But if you just pipe the output to less or something, the coloring goes away.

I finally got around to looking for a way to fix that today. And it took me all of 2 minutes to find it - I really should've done this before. Put the following in your .git/config file:

[core]
  pager = less -x4

and it'll show you the diffs with 4-space tabs.

Trivial, yes. But it took me this long to even look at fixing it, so hopefully this can help someone...

Conferences

I speak at and organize conferences around Open Source in general and PostgreSQL in particular.

Upcoming

FOSDEM+PGDay 2019
Feb 1-3, 2019
Brussels, Belgium
Nordic PGDay 2019
Mar 19, 2019
Copenhagen, Denmark

Past

PGConf.Asia 2018
Dec 10-12, 2018
Tokyo, Japan
DC PostgreSQL Users Group
Nov 14, 2018
Washington DC, USA
New York City PostgreSQL User Group
Nov 13, 2018
New York City, NY, USA
Driving IT 2018
Nov 2, 2018
Copenhagen, Denmark
PGConf.EU 2018
Oct 23-26, 2018
Lisbon, Portugal
More past conferences