Magnus Hagander's blog

On the topic of release quality

Posted on Dec 1, 2008 at 13:54. Tags: mysql, postgresql.

This post was inspired by Monty's post about MySQL release 5.1, and the many discussions it has created both around the web and offline. The question mainly being - why has nobody made such a post about PostgreSQL yet? Is it because it hasn't happened, or just because nobody has posted about it?

We all know that there is no such thing as bug-free software. This obviously includes both PostgreSQL and MySQL, as well as all the commercial competitors - claiming anything else is clearly untrue. But what I find remarkable in Monty's post is mainly:

MySQL 5.1 has been released with known critical bugs (crash/wrong result). As far as I know, this has never been done with PostgreSQL (at least if we're talking "modern times", since after the product became reasonably stable). And certainly not the number of issues that Monty has listed for MySQL 5.1 - it's not just one or two!
There are also critical bugs that were present in 5.0, that still haven't been fixed in 5.1.
We already know that MySQL 5.0 was released "too early". We already know that MySQL 5.1 was declared RC "too early". It's remarkable that 5.1 was also released "too early" in that case - and for non-technical reasons again. In theory, this "can't happen" with the PostgreSQL release model, since it's based only on when the features are "ready", not when you need a new release for some other reasons. That's in theory. I know in the past we have tried to schedule releases around certain conferences and such, for better announcement effects. In practice, though, I think this has only led to a release being postponed, never being rushed.
MySQL apparently keeps some bug reports hidden from the public (the referenced bug 37936 for example). How's that for open... I approve of keeping them hidden for security bugs only - but if that bug is a security bug, it's clearly taken way too long to fix, given the dates on bugs around it.
I still think they've designed their version numbering system to deliberately confuse the customers about what is a beta, what is a release candidate and what is a release. What is so hard about either labeling them as PostgreSQL does (8.3beta, 8.3RC, 8.3.0), or using the system popular in open source, with even-numbered releases for stable releases and odd numbers for beta/testing releases?
It took them over a year to get from Release Candidate to release.

Now, there are several posts I found that are questioning Monty's post, saying that the quality is just fine - and backing this up with actual experiences in deploying 5.1. I do think both sides are right here - it's perfectly possible to deploy 5.1 without hitting these bugs, as they are "corner-case" issues. But that does not decrease the importance of having releases without known bugs in them. And if there are known bugs, they should at least be listed very clearly in the release notes/announcement. Not doing this is, IMHO, simply irresponsible. Especially for a database server which is supposed to safeguard all your work...

So how does PostgreSQL measure up

New planet administration interface

Posted on Nov 18, 2008 at 13:54. Tags: postgresql.

As of a couple of minutes ago, it is now possible for people who have their blogs aggregated on Planet PostgreSQL to administer their registration information online - no more need to send an email to us every time (but you can still do that if you want - planet(at)postgresql.org is the address to use!). There's a link at the bottom right-hand side of the front page of Planet PostgreSQL to get there, and it uses your pre-existing community account to log in.

The functionality so far is fairly limited, but it's currently possible to:
* Register a new blog for aggregation (needs approval)
* Remove a blog from aggregation
* Delete individual posts (will cause them to reload if they're still in the RSS feed)
* Hide individual posts (that way they won't get reloaded)

In addition, registered users will automatically get added to (and removed from) a mailing list to receive announcements, discuss policy etc.

That's it for now. We're happy to hear more suggestions for things to add to this interface.

When installing this, I have mapped all users I could to their existing community accounts. However, there are a number of blogs I failed to map either because they don't have an account, or because I couldn't figure out which one it is. I would ask these people to either go into the administration interface and register an account (you can attach an existing blog there - will need approval before it goes through) or just email me your community login (if you don't have one yet, please sign up for one). This way you will end up on the mailing list for announce messages. The following blogs are currently without a userid:
* Aurynn Shaw
* Benjamin Reed
* Chris Smith
* Christopher Kings-Lynne
* Dave Cramer
* Enver Altn
* Frank Wiles
* Gavin M. Roy
* Gavin Sherry
* Ian Barwick
* Jon Jensen
* Kenneth Downs
* Kenny Gorman
* Leif B. Kristensen
* Liam O'Duibhir
* Ow Mun Heng
* Paul Silveria
* Robert Hodges
* Robert Lor
* Satoshi Nagayasu
* Tom Copeland
* Usama Munir Dar

I realize a number of these blogs are currently broken, due to the crash of people.planetpostgresql.org. If your blog was there, and you want a new one there, contact Devrim (devrim@commandprompt.com). Otherwise, if you want to move it to another hosting like Blogger or Wordpress, like a number of people have done, just let us know and we'll update the address of your subscription (but we still want your community login!). Finally, if you want your blog removed instead of updated, send us an email (in this case you don't need to sign up for a community account, of course).

The interface is fairly ugly at this time - someone promised to work on those templates though, so stay tuned for something prettier...

Recovering blog data

Posted on Nov 17, 2008 at 15:53. Tags: postgresql.

Per indications from Devrim, I have given up on getting my blog entries back from the old planet machine. If they do show up, that would be a happy surprise, but I now consider it a very remote chance that they will.

Instead, I have now recovered some of my blog posts using things like google cache, the wayback machine and such methods. I will be cleaning up the old posts and turning them visible one by one over the next couple of days... If you happen to have some of my older posts saved away somewhere, please let me know :) And if you spot formatting errors in something you find, then let me know that as well.

Finally, if you're coming to this post from a redirect: sorry, all the old URLs are "gone", and I don't know how to get a redirect for them. Please use the archive to browse to the post you were looking for.

Update: A bit quicker than I thought, my script did a fairly good job. I've turned everything I managed to recover visible by now...

PostgreSQL Europe website finally up

Posted on Nov 17, 2008 at 13:55. Tags: postgresql.

The website for PostgreSQL Europe is finally up!

It's long overdue - we really should've had this up before the summer. But better late than never.

And please - your contributions towards it are much appreciated! Right now it's all static content handled through a Django template framework, and adding such content is as simple as adding a static HTML file. So if you're interested in contributing content, please look it over and see what you can do (the whole source is of course available in our git repository). And if you're not comfortable with HTML - just send us the text you think should go on there and we'll find someone to do the markup!

PostgreSQL SSL code updates

Posted on Nov 13, 2008 at 09:47. Tags: postgresql.

I am currently working on several updates to the PostgreSQL SSL code, to make it more secure and add some functionality. I'd be interested to hear from people who are either using this today, or are interested in using the new functionality - there is still room to make further adjustments to the code before the release.

Certificate validation in libpq

This patch was applied today. The idea is to be able to control how the certificate validation is done in libpq. Previously, libpq would verify the server certificate if a root certificate file was found, and otherwise never do it. This made the system very fragile. And it would never attempt to verify that the certificate actually matched the server.

With this patch, there is now a new connection parameter sslverify, that controls this behavior. It's all controlled by this parameter, and never by just checking the existence of a file. It can have the following values:
* cn: Default. Verify that the certificate chains to a trusted root, and that the server name matches.
* cert: Verify that the certificate chains to a trusted root, but ignore the name.
* none: Disable certificate verification completely.

The version that is committed does not support subject alternate names or wildcard certificates. It's something I am hoping to have the time to add before the release. Feel free to send me a patch ;-)
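
As a rough sketch of how the three sslverify values differ, the decision can be modelled in a few lines of Python. This is an illustrative model only, not libpq code: the function name, its arguments and the case-insensitive comparison are assumptions made for the example. The name check is an exact match, in line with the committed version not handling wildcards or subject alternate names.

```python
# Illustrative model of the sslverify decision; this is not libpq code.
VALID_MODES = ("cn", "cert", "none")


def accept_server_certificate(sslverify, chains_to_trusted_root, cert_cn, server_name):
    """Return True if the connection would be allowed to proceed."""
    if sslverify not in VALID_MODES:
        raise ValueError("unknown sslverify value: %r" % (sslverify,))
    if sslverify == "none":
        # Certificate verification is disabled completely.
        return True
    if not chains_to_trusted_root:
        # Both "cert" and "cn" require a chain to a trusted root.
        return False
    if sslverify == "cert":
        # The chain is trusted and the name is ignored.
        return True
    # "cn": the server name has to match as well (exact match only;
    # comparing case-insensitively is an assumption of this sketch).
    return cert_cn.lower() == server_name.lower()


# A trusted certificate issued for another host passes "cert" but not "cn".
print(accept_server_certificate("cert", True, "other.example.com", "db.example.com"))
print(accept_server_certificate("cn", True, "other.example.com", "db.example.com"))
```

The point of the explicit parameter is visible in the first branch: nothing here depends on whether a root certificate file happens to exist.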

Requiring a client certificate

This patch is currently pending review in this commitfest. The idea here is to stop deciding whether to request a client certificate based on whether the root certificate file exists, and instead make it an explicit configuration variable. This makes it much more secure against "admin mistakes" - explicit configuration is always better when it comes to security.

This patch builds on the changes to the pg_hba.conf file, and therefore just adds a connection option to the hostssl rows (obviously you can only require client certificates on SSL connections). Set it to 1 to require client certificates. Of course, it also needs the root certificate file to be present.

Having this in pg_hba.conf also makes it possible to configure this value differently depending on which addresses your clients are connecting from, if required.
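
To illustrate the per-address idea, here is a small Python model of first-match rule lookup. It is a sketch under assumptions: the networks and the rule table are made up, the option itself is reduced to a plain boolean, and none of this is PostgreSQL's actual parsing code.

```python
import ipaddress

# Hypothetical hostssl rows, reduced to (client network, require client cert?).
# The networks are invented for illustration.
HOSTSSL_RULES = [
    (ipaddress.ip_network("10.0.0.0/8"), False),  # internal network
    (ipaddress.ip_network("0.0.0.0/0"), True),    # every other IPv4 client
]


def client_cert_required(client_address, rules=HOSTSSL_RULES):
    """Return the setting of the first row whose network contains the address.

    Returns None when no row matches, in which case the connection
    would be rejected anyway.
    """
    addr = ipaddress.ip_address(client_address)
    for network, required in rules:
        if addr in network:
            return required
    return None


print(client_cert_required("10.1.2.3"))    # internal client
print(client_cert_required("192.0.2.10"))  # external client
```

Rows are checked in order and the first match wins, so the more specific internal network has to be listed before the catch-all row.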

Client certificate authentication

This patch is pending some final cleanups before I post it. The idea here is, obviously, to be able to use your SSL client certificate to perform the actual authentication, thus doing away with the need to have a password as well. Given that our client certificate code already supports for example smartcards (through OpenSSL), this can be a high security option for remote logins. I'm sure there are other use cases as well - it's a feature that has been asked for more than once.

I plan to make this code just use the cn attribute of the certificate to authenticate. This can then be passed through a pg_ident.conf map to map to the "real" username, in case the syntax is not identical. In a lot of cases it can probably be very useful to combine this with regexp entries in the ident maps, which is another patch that's in the queue for this commitfest.

One thing I'm unsure about here is - will it be enough to be able to use the cn attribute for authentication, or will it be required to use other attributes as well? How do the enterprise PKI solutions that you'd use this together with work?
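
A minimal sketch of what mapping a certificate cn to a database username could look like, including a regexp entry. Everything here is hypothetical: the map entries, the names, and the conventions that a leading "/" marks a regular expression and that "\1" is replaced by the first captured group are assumptions made for this example, not a description of the patch.

```python
import re

# Hypothetical map entries: (cn pattern, database username).
IDENT_MAP = [
    ("Alice Example", "alice"),
    (r"/^(.*)@example\.com$", r"\1"),
]


def map_cn_to_user(cn, ident_map=IDENT_MAP):
    """Return the database username for a certificate cn, or None if unmapped."""
    for pattern, db_user in ident_map:
        if pattern.startswith("/"):
            # Regexp entry: everything after the slash is the expression.
            match = re.search(pattern[1:], cn)
            if match:
                if r"\1" in db_user and match.re.groups >= 1:
                    return db_user.replace(r"\1", match.group(1) or "")
                return db_user
        elif pattern == cn:
            # Plain entry: the cn has to match exactly.
            return db_user
    return None


print(map_cn_to_user("Alice Example"))
print(map_cn_to_user("bob@example.com"))
```

A cn that matches no entry maps to nothing, so authentication would fail rather than fall back to some default user.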

Recovering from planetary disaster

Posted on Oct 28, 2008 at 09:29. Tags: postgresql.

So, as Devrim has already posted, there was a major disaster with Planet PostgreSQL a while ago. The result was that both the aggregator (www.planetpostgresql.org) and the blog-hosting-for-many-PostgreSQL-community-people-including-me (people.planetpostgresql.org) went down. This was not so good, but it happens. Also, there were no backups. This is a lot worse. This is a resource with a lot of high-value information, and it's now been offline for a long time. We still do not know exactly what happened, but Devrim has now indicated that we may be able to recover the data somehow at some point, but we don't know when - hopefully soon.

There were two parts to this:

The aggregator

The aggregator, Planet PostgreSQL, contained no actual data (that's in its nature) other than the list of blogs it was pulling from. And since we had already been experimenting with some new software running on a community server to do this, we could rapidly bring this server and software into production when we realized this issue wouldn't be resolved quickly. Moving the planet over to a community managed server was discussed and agreed on a long time ago, but I was too lazy to finish off the last pieces of the software. This was now done in a hurry, during pgday.eu, to get something up. Since we could not reach Devrim (his email was also on the server that was down), we set up http://planet.postgresql.org in the official postgresql.org namespace to point to this server. When we got hold of Devrim, he also changed www.planetpostgresql.org to point to this new, community managed, planet.

The day after this, when Devrim had a few more things under control, he came back to us saying that he was not comfortable having Planet PostgreSQL under community control, co-managed by him and the rest of the team that manages our infrastructure. At this point we pushed for the point that had been made a long time ago - the web team is not comfortable having such an important service with such a prominent location on www.postgresql.org not managed by the community team (with Devrim still being the head maintainer, just with the rest of the team as backup in case something happened - and of course with the standard community requirements on backups etc). Devrim's choice in this case was to repoint the planetpostgresql.org domain to his server (even though it at the time had nothing on it - though he did get the aggregator back up not too long after that), and ask us to remove it from the front page of the website if we would not accept that. This is when the decision was made to keep http://planet.postgresql.org running as a community managed service and as the official PostgreSQL blog aggregator service that is linked from the main website.

The conclusion of this was a fork of the aggregator service. There is now the PostgreSQL community official aggregator, at http://planet.postgresql.org, and there is Devrim's aggregator at http://www.planetpostgresql.org. They both provide similar service to the end user, through different software and different policies. Only the first one feeds to www.postgresql.org.

This has exactly nothing to do with the blog hosting, this only deals with the aggregator.

The blog hosting

The blog-hosting service at people.planetpostgresql.org is the one that contained all the data. This is the part that we are still hoping we will be able to recover some data from. This is a second service provided by Devrim, that is unrelated to the aggregator - other than that they were running on the same, crashed box.

There are no plans by the PostgreSQL web/infrastructure team to provide this service. There are a lot of services out there on the net that provide blog hosting, Devrim's included (once he gets the system back online). Both commercial and free. The aggregation service will be equally happy to work with both. So if you are looking to set up a PostgreSQL blog, either talk to Devrim or look at one of the external offerings.

I've personally decided to move my blog to my own hosting. It's now available at http://blog.hagander.net. I will try to recover the old data as soon as Devrim makes it available, either into this blog or into the old location, depending on what's possible. I know others, for example Robert, have done the same. AFAIK, we were both considering this beforehand as well but found the existing service convenient. The feeds have been updated on the main planet site, but if you were using the direct feeds, you need to update the link (see sidebar for feed links). And a big thanks to Devrim for hosting my blog there as long as he did.

I give no recommendations to other people who had their blogs on people.planetpostgresql.org about what to do with their blogs, and there will be no statement from the web or infrastructure team about it. It's an unrelated service, that everybody needs to decide on their own about.

The conclusion of this part is that my blog now lives at a new URL. Update your links. Sorry for the inconvenience.

Lightning talk @west

Posted on Oct 12, 2008 at 23:00. Tags: conferences, postgresql.

Selena tricked me into doing a Lightning Talk here at west today. We almost missed it because lunch dragged out (oops), but we made it just in time. My talk was titled "Creating a debian compatible random number generator in 5 simple slides", and just to make JD happy I have to post the final summary slide here. There needs to be one from each conference... Currently in Jeff Davies' talk about streaming queries, I'll probably write up a more complete summary of the conference later on. Should pay attention now...

Npgsql 1.0

Posted on Oct 11, 2008 at 22:30. Tags: postgresql.

Good news today - Npgsql 1.0 final has been released. It's been a long wait, but the 0.x and beta versions have certainly been very stable. But I'm looking forward to upgrading my systems to 1.0 soon. Great job Francisco and his helpers.

OpenSource Days - roundup

Posted on Oct 5, 2008 at 23:00. Tags: conferences, postgresql.

I got back from OpenSource Days in Copenhagen yesterday, after two and a half fairly intense days. As usual (while up until last year the conference was named LinuxForum, it's still the same conference) the conference itself was great. Lots of very good talks to listen to, and very nice arrangements for us speakers. And a whole lot of interesting people to talk to.

It was the first time I've been both manning a "commercial booth" (for Redpill Linpro) and been a speaker/participant at the same time. I think it worked reasonably well - though my booth colleagues might think differently due to my absence from the booth particularly on the Saturday. In my talk, I specifically tried to avoid mixing in our company services (unlike some other speakers, who shall remain nameless..), because I was there to talk about PostgreSQL. I think that also worked out fairly well.

My own talk went pretty well - got some interesting discussion going afterwards, along with a couple of suggestions for making it better next time. It's nice with an audience that's involved enough to come with those. There are no speaker eval forms at the conference, but I got the impression it was fairly well received.

As a result of the talk, which had a section about how to use pgcrypto to build a secure authentication system, several people asked me what can be done about getting pgcrypto out of contrib, to make it "safer" to use this in a production application. Given the number of people who mentioned it, it's pretty clear to me that we need to do something about this.

Speaking of things that were mentioned a lot - several people asked me during the conference about the state of the CTE-patch for PostgreSQL 8.4. Unfortunately I couldn't say much more than "probably" at the time. Since then, Tom Lane has committed the patch. So for those of you who asked then, and don't follow the list - the answer has now changed from "probably" to "yes".

Obviously, I listened to Jan's keynote talk about Slony. While I did not learn anything new about Slony, Jan did a very good job of explaining some of the more advanced things Slony is capable of doing, which is the reason it's fairly complex to configure. Good talk!

I'd also like to second what Troels writes in his blog - Jan did a good job of not hiding the weaknesses with Slony. Which is something that non-open(source) vendors have a tendency not to do. (And I'll venture as far as to say that there were certainly other speakers at this conference who were not so forthcoming - hopefully myself excluded, but I'll leave it to others to judge that)

I'll certainly be back next year!

The world's smallest Slony cluster?

Posted on Sep 29, 2008 at 23:00. Tags: postgresql.

We recently updated one of our Slony clusters. I think it at least used to qualify as one of the world's smallest ones:

Two nodes (obvious minimum)
one database (that's a given)
with one table
with one column
with one row

Now, we recently doubled the size of this cluster. It now has a whopping two columns in the table. The second column being updated by a trigger, so it's still only one column updated by the end user (well, application). But it's two columns to be replicated!

So what does this prove? Really, not much, but at least:
* Slony certainly scales "downward" just fine. It feels like a bit of overhead to set it up for something like this, but it works just fine.
* Even in a small database, triggers can be very useful - regardless of what the documentation of a certain other database used to say before...
* And even in a trivial case like this, statement based replication simply does not work reliably. You need something that's data based - something that the same other database is actually recognizing now and will be including in the next version...
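
The last point can be shown with a toy model. The sketch below uses Python's built-in sqlite3 module with in-memory databases standing in for the nodes, so it is neither Slony nor PostgreSQL. The table t, the node_clock table and the stamp column are all invented for the example. The trigger fills the second column from a node-local value, which is what trips up statement replay.

```python
import sqlite3


def make_node(local_clock):
    """Create an in-memory 'node' whose trigger stamps rows with a local value."""
    db = sqlite3.connect(":memory:")
    # node_clock stands in for any node-local input (clock, sequence, random).
    db.execute("CREATE TABLE node_clock (ts INTEGER)")
    db.execute("INSERT INTO node_clock VALUES (?)", (local_clock,))
    db.execute("CREATE TABLE t (val INTEGER, stamp INTEGER)")
    db.execute("INSERT INTO t VALUES (0, 0)")
    db.execute(
        "CREATE TRIGGER t_stamp AFTER UPDATE OF val ON t "
        "BEGIN UPDATE t SET stamp = (SELECT ts FROM node_clock); END"
    )
    return db


def rows(db):
    return db.execute("SELECT val, stamp FROM t").fetchall()


origin = make_node(local_clock=100)
statement_replica = make_node(local_clock=105)  # local state differs here
data_replica = make_node(local_clock=105)

statement = "UPDATE t SET val = 42"
origin.execute(statement)

# Statement based: replay the SQL text. The replica's own trigger fires
# and stamps the row with the replica's local value.
statement_replica.execute(statement)

# Data based: ship the resulting row values and apply them without
# re-running the trigger on the replica.
val, stamp = rows(origin)[0]
data_replica.execute("DROP TRIGGER t_stamp")
data_replica.execute("UPDATE t SET val = ?, stamp = ?", (val, stamp))

print("origin:           ", rows(origin))
print("statement replica:", rows(statement_replica))
print("data replica:     ", rows(data_replica))
```

Replaying the statement leaves the replica with a different stamp than the origin, while shipping the row data keeps both columns identical.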