FOSDEM is done

That's it - FOSDEM is over for this year. I'm sitting at the airport waiting for my flight home, which should be boarding in about an hour.

I'd say this year was at least as good as last year. The general idea was the same - we had a booth where we sold merchandise and answered questions, and a devroom where we had a full track of talks. We sold slightly less than last year, mainly because we sold almost no t-shirts at all. We ran out of the blue elephants (as usual) and the mugs long before the conference was over. We had a lot of visitors to the booth, so I think we have to declare it a big success. And a great job by all our volunteers who spend the time there instead of browsing what else the conference had to offer. An extra kudos to Gevik who remembered to buy and bring snacks for all the booth people!

The devroom was much larger than last year, which was very good - on the saturday, we filled the room for almost every talk, and there is no way we could've put that many people in the room we had last year. The downside was that it was at the very other end of the conference area, so it was a pretty long walk from the booth (and from the main area of FOSDEM) to the devroom. I think this actually reduced the number of "spontaneous" people we got in there, so if t was closer we would've had even more.

We also had a major FAIL in that the projector didn't work. This was fixed fairly quickly by the FOSDEM staff (thanks!) with a mobile projector, but it left us with a very awkward keynote since we were late and not really ready. We also had a projector with too little light for the room, so we had to have the blinds down all day making the room dark and making it hard to open the windows for ventilation. but in the grand scheme of things, those are trivial issues!

The talk track was very good. My personal favorite (and I know I'm not alone in this) was Simon Riggs talk on replication. Simon's a very good speaker, and the topic is highly interesting. I also liked Heikkis talk about FSM and Visibility Map and Selenas usergroups talk. Well, I realliy liked all talks, but if I have to choose... I think the talk track this year was a clear step up from last year. Even though Greg Stark forgot one of his talks and didn't show up until it was time to take questions.. (sorry, Greg, but you're not getting away with that unpublished)

As for the social track - well, I have yet to be to a PostgreSQL conference (and I consider "our part of FOSDEM" to be a PostgreSQL conference) that did not have a good one, and this was certainly no exception. Great people to hang out with, good food, and good beer - what more can one ask for?

A big thanks to everybody who helped out with tihs conference - both the PostgreSQL specific parts, and the big thing! Already looking forward to coming back next year!

Gearing up for FOSDEM

I'm waiting at the airport to being my travel to this years FOSDEM. Looking at who's coming, and the talks we have lined up for the PostgreSQL/BSD devroom, it looks like it will be at least as good as last year.

So if you're near Brussels, get there! And if you're at FOSDEM, be sure to stop by both our booth and our devroom!

PostgreSQL Prague Developers Day

I've been invited to speak at the PostgreSQL Prague Developers Day. I will be doing one of my "usual" talks about PostgreSQL in a Windows/Active Directory environment, as well as a short keynote around news in 8.4. As far as I know most of the talks will be in Czech, but at least mine and Simons (whom I think is confirmed to go as well) will be in English.

So if you're in or near the Czech Republic, you should be there! And if you're not - here's your excuse to travel to Prague!

What's in a version number?

The other day I came across yet another company (who shall remain nameless, but are certainly not alone) who support using PostgreSQL as a backend for their product. And the support only version 8.2.3. While I prefer using 8.3 on all production servers today, I find it perfectly normal and acceptable that they don't support that, and only 8.2. What is not acceptable to me is only supporting 8.2.3. The 8.2 branch is currently at 8.2.11, and there are several critical bugs - including security issues - patched in these versions. I would not recommend any customer of mine to use 8.2.3 today.

So why is this? The only reason I can think of is that these company simply haven't bothered to figure out how the version numbering of the products actually works. For those who don't know, a quick intro to the PostgreSQL system: Major versions are 8.0, 8.1, 8.2, 8.3 (no, you cannot reasonably claim to support "PostgreSQL version 8", which is also not uncommon, because that includes potentially incompatible future version) Minor versions are 8.2.1, 8.2.2, 8.2.3 etc. Minor versions contain only critical fixes. There are never new features introduced. There are almost never any kind of incompatible changes introduced (I say almost never, because it has* happened when required to plug a security issue)

This should make it fairly obvious that it is counterproductive to support only 8.2.3. Because you are basically forcing the customers not to apply critical security fixes.

The fact that different databases follow different patterns here obviously makes it a little bit harder for the vendor. For example, it would not make sense to say you support MySQL 5.1. Because that can include both beta and release candidate versions. And they do regularly introduce new features and change behavior between versions like 5.1.x and 5.1.y. That's just how their numbering works. I don't personally like that way of numbering releases, but if I was putting out a product saying which versions of MySQL I support, I'd certainly read up on exactly how the version numbers work first.

If you cannot take the time to figure out how the release process of your database product works, why does it matter that you put in the version number at all? It's not likely it's properly supported anyway. That statement should be about "once tested with version x", not actually supported...

And to finish off: what I'd like to see is "we support version 8.2 starting at version 8.2.3" for example. That clearly identifies which major release is supported (8.2). And it also indicates that vendor takes it upon themselves to make sure their product continue working if the customer applies relevant security patches (AKA minor releases).

Ok, that's enough ranting for today...

Todays trivial git tip

If you're like me, you are using the git.postgresql.org repository to do your PostgreSQL development, because it's much nicer to work with than CVS.

If you're also like me, you like git diff with it's nice coloring and trailing-whitespace-warnings and such features. But you're a little bit annoyed that your tabs come out as 8-characters, when the PostgreSQL source uses 4-character tabs, making diffs a bit hard to read. But if you just pipe the output to less or something, the coloring goes away.

I finally got around to looking for a way to fix that today. And it took me all of 2 minutes to find it - I really should've done this before. Put the following in your .git/config file:

[core]
  pager = less -x4

and it'll show you the diffs with 4-space tabs.

Trivial, yes. But it took me this long to even look at fixing it, so hopefully this can help someone...

Last chance to speak at FOSDEM devroom!

The CFP for our FOSDEM devroom ends in a couple of days!

If you haven't submitted your talk yet, now is the time to do so! We still have a few slots open. We are interested in both full-length, half-length and lightning talks!

And if you weren't planning to submit a talk - now is the time to rethink! We want a good mix of different types of talks, ranging from hacker talks to user experiences.

So, head over to the wiki and read up on the details, and submit your talk!

Automatically dropping and creating constraints

I do this fairly often, but after talking to some other people I realized it might be a good idea to share a couple of quick SQL scripts. The idea is: you have some large operations you are doing on your database. Of the kind where you load or update lots of data - which means that it will run a lot faster with the constraints in the database turned off. All of them, or maybe just the FOREIGN KEYs, depending on exactly what you are doing. The following simple queries will generate SQL scripts that drop all your constraints, and then re-create them. Adding filters for just a single table/namespace/constrainttype is trivial, but left as an exercise for the reader...

To generate a script to drop constraints:


SELECT 'ALTER TABLE "'||nspname||'"."'||relname||'" DROP CONSTRAINT "'||conname||'";'
 FROM pg_constraint 
 INNER JOIN pg_class ON conrelid=pg_class.oid 
 INNER JOIN pg_namespace ON pg_namespace.oid=pg_class.relnamespace 
 ORDER BY CASE WHEN contype='f' THEN 0 ELSE 1 END,contype,nspname,relname,conname

Note that the order is important - we must drop FOREIGN KEYs before we drop PRIMARY KEYs and UNIQUE constraints, since they depend on each other.

To generate a script to reload the constraints:


SELECT 'ALTER TABLE "'||nspname||'"."'||relname||'" ADD CONSTRAINT "'||conname||'" '||
   pg_get_constraintdef(pg_constraint.oid)||';'
 FROM pg_constraint
 INNER JOIN pg_class ON conrelid=pg_class.oid
 INNER JOIN pg_namespace ON pg_namespace.oid=pg_class.relnamespace
 ORDER BY CASE WHEN contype='f' THEN 0 ELSE 1 END DESC,contype DESC,nspname DESC,relname DESC,conname DESC;

Again, note the reversed order that is necessary so we create PRIMARY KEYs and UNIQUE constraints before we create the FOREIGN KEYs.

The easiest way to turn these commands into proper scripts is to just use the psql output feature:


mydb=# \t
Showing only tuples.
mydb=# \o drop_constraints.sql
mydb=# SELECT ...
mydb=# \o create_constraints.sql
mydb=# SELECT ...

You can then just load those files as include files in psql. Obviously, you need to script both the DROP and CREATE statements before you actually drop something.

And the second obvious note - always run these things in a transaction. Take advantage of the transactional DDL in PostgreSQL - as always. I often run the whole thing - drop, bulk work, recreat - inside a single transaction. That way, applications may not even need to know there is service happening...

Finally, in some databases you may need to add further restrictions, and always verify the scripts before you run them :)

A Thunderbird extension for the archives

I repeatedly find myself wanting to reference an email in the PostgreSQL mailinglist archives either in a different email or in an IM or IRC conversation. The usual two methods for this have been to either look up the messageid from view-source in Thunderbird, or to go find the message by fulltext search either on our archives or one of the other big sites. I got tired of that, and decided it was good time to make my first try ever at writing a Thunderbird extension. It wasn't particularly hard, though finding the level of newbie documentation I required proved to be a bit of a challenge.

The current version is hardcoded to run on Unix - it does require an /etc/alternatives/x-www-browser. That just means there is room for improvement :-)

Here's a screenshot of how to use it:

http://www.smugmug.com/photos/432794002_w2JdG-X3.jpg border="2"

Should be fairly obvious what to do...

You can get the extension itself at http://www.hagander.net/extensions/

Visibility map arrives

Since Heikki doesn't have a blog of his own, I will take it upon myself to blog about this one: he has just committed the initial version of the visibility map feature for PostgreSQL 8.4.

In short, this means that VACUUM will now only need to visit pages in a table that has actually changed. So if you have a large table where only a very few rows change, VACUUM will require significantly less CPU and I/O to run. With this lower cost, it's also possible to run VACUUM more often on large tables - if nothing at all has changed, it's more or less a no-op!

In the future, the hope is to be able to build on this patch to implement things like index-only scans - but that's for another version.

Yet another great feature for 8.4 - this is shaping up to be a very good release I think.

On the topic of release quality

This post was inspired by Montys post about MySQL release 5.1, and the many discussions it has created both around the web and offline. The question mainly being - why has nobody made such a post about PostgreSQL yet? Is it because it hasn't happened, or just because nobody has posted about it.

We all know that there is no such thing as bug-free software. This obviously includes both PostgreSQL and MySQL, as well as all the commercial competitors - claiming anything else is clearly untrue. But what I find remarkable from Montys post are mainly:

  • MySQL 5.1 has been released with known critical bugs (crash/wrong result). As far as I know, this has never been done with PostgreSQL (at least if we're talking "modern times", since after the product became reasonably stable). And certainly not the number of issues that Monty has listed for MySQL 5.1 - it's not just one or two!
  • There are also critical bugs that were present in 5.0, that still haven't been fixed in 5.1.
  • We already know that MySQL 5.0 was released "too early". We already know that MySQL 5.1 was declared RC "too early". It's remarkable that 5.1 was also released "too early" in that case - and for non-technical reasons again. In theory, this "can't happen" with the PostgreSQL release model, since it's based only on when the features are "ready", not when you need a new release for some other reasons. That's in theory. I know in the past we have tried to schedule releases around certain conferences and such, for better announcement effects. In practice, though, I think this has only led to a release being postponed, never being rushed.
  • MySQL apparantly keep some bug reports hidden from the public (the referenced bug 37936 for example). How's that for open... I approve of keeping them hidden for security bugs only - but if that bug is a security bug, it's clearly taken way too long to fix, given the dates on bugs around it.
  • I still think they've designed their version numbering system to deliberately confuse the customers about what is a beta, what is a release candidate and what is a release. What is so easy with either labeling them as PostgreSQL does (8.3beta, 8.3RC, 8.3.0), or using the in open source popular system of using even-numbered releases for stable releases and odd numbers for beta/testing releases?
  • It took them over a year to get from Release Candidate to release.

Now, there are several posts I found that are questioning Montys post, saying that the quality is just fine - and backing this up with actual experiences in deploying 5.1. I do think both sides are right here - it's perfectly possible to deploy 5.1 without hitting these bugs, as they are "corner-case" issues. But that does not decrease the importance of having releases without known bugs in them. And if there are known bugs, they should at least be listed very clearly in the release notes/announcement. Not doing this is, IMHO, simply irresponsible. Especially least for a database server which is supposed to safeguard all your work...

So how does PostgreSQL measure up

Continue reading

Conferences

I speak at and organize conferences around Open Source in general and PostgreSQL in particular.

Upcoming

PGConf.EU 2017
Oct 24-27, 2017
Warsaw, Poland
PGConf.Asia
Dec 4-6, 2017
Tokyo, Japan

Past

Postgres Open 2017
Sep 6-8, 2017
San Francisco, USA
PGDay.RU
Jul 5-7, 2017
St Petersburg, Russia
PGDay.UK
Jul 4, 2017
London, UK
Amsterdam PUG
Jun 29, 2017
Amsterdam, Netherlands
PGCon 2017
May 23-26, 2017
Ottawa, Canada
More past conferences