Yesterday was the first meeting for the FOSS STHLM "group" - a (very) loose group of FOSS-interested people in the Stockholm region. We met in a lecture hall at Stockholm University in Kista, north of Stockholm, for a couple of hours of short presentations. The lineup was very nice: cool embedded stuff? Yup, Rockbox. General Linux? Yup, upstart. Debian-specific? Of course. Open Source Sweden? Yeah. curl? What else did you expect with Daniel as one of the organizers? PostgreSQL? Yeah, why else would I be blogging this?
Our allocated time was short, so the setup was many short talks. It actually worked a lot better than I thought it would, but it's still very hard to convey something useful in just 20 minutes.
I had a hard time figuring out what I should focus on, so I split it into two parts (the perfect strategy - if you have too little time to do one thing well, do two things in the same time instead...). I started with a section about "things to think about if you're switching to PostgreSQL or trying it out for the first time" - things like the most basic config parameters that you always have to touch. And of course the classic "ident authentication failed" issue that hits everybody on Red Hat or Debian platforms at least (which is most of the users - definitely in this crowd). It's hard to do much in 12 minutes, but hopefully it got some people interested.
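For reference, the usual way out of the "ident authentication failed" problem is to switch the authentication method in pg_hba.conf. A sketch - the exact default lines and file location vary between distributions and versions:

```
# pg_hba.conf - distribution defaults often look something like:
#   local   all   all                  ident
#   host    all   all   127.0.0.1/32   ident
# Switching to password-based authentication instead:
local   all   all                  md5
host    all   all   127.0.0.1/32   md5
```

Remember to reload the server afterwards (for example with pg_ctl reload) for the change to take effect.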
I followed that with a very, very short version of "what's coming up in 9.0". Once again, I focused on one of my personal favorite features: Exclusion Constraints. While this is often listed as one of the cool things in 9.0, I sometimes feel that too much of the focus is on streaming replication and hot standby. Don't get me wrong, these are very good and much-needed features. But Exclusion Constraints are a real eye-opener. All databases (including PostgreSQL, of course) have replication - that is "just another way to do it". Yes, a very important and good way to do it, but it's still not something brand new. Exclusion Constraints are something fundamentally new, and a brilliant example of how PostgreSQL keeps pushing the boundaries. Oh, and they're really useful and cool, of course! (And they'll be even better when we have the period datatype, or something similar, in 9.1!)
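To illustrate what an exclusion constraint looks like, here is the classic example from the 9.0 documentation - a table in which no two circles are allowed to overlap, something a UNIQUE constraint cannot express:

```sql
-- No two rows may contain overlapping circles; enforced via a GiST index
CREATE TABLE circles (
    c circle,
    EXCLUDE USING gist (c WITH &&)
);

INSERT INTO circles VALUES ('<(0,0), 5>');
INSERT INTO circles VALUES ('<(20,20), 5>'); -- fine, does not overlap
INSERT INTO circles VALUES ('<(1,1), 5>');   -- rejected: conflicts with the first row
```

With a period-style datatype, the same mechanism could express "no double-booking of the same room" with something along the lines of EXCLUDE USING gist (room WITH =, booked_period WITH &&).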
In summary, I think it was a great event. Big thanks to all those who helped make it happen! Hopefully we can follow it up with many more in similar ways - perhaps mixing these many-short-talks with some more focused discussions on specific areas or technologies. Time will tell...
Update: For those who asked, there were somewhere around 150 people in the sessions.
I forgot completely to close up the feedback system for our talks at FOSDEM and post the feedback to the speakers. My apologies to all speakers and others who were interested! Thankfully, Marc Balmer reminded me today and the individual ratings for different speakers have been sent off to them.
When it comes to the aggregated statistics, they look fairly similar to what we saw at PGDay. The number of speakers was much lower, and unfortunately so was the number of people giving feedback. But in the end, the results were fairly clear. I am especially happy with the level of knowledge of the speakers we attract. Let's start with the graphs:
Breaking down the details, I think this shows a great quality of our speakers:
Name | Speaker Quality | Votes | Deviation
Simon Riggs | 4.8 | 6 | 0.4
Magnus Hagander | 4.6 | 5 | 0.5
David Fetter | 4.5 | 4 | 1.0
Heikki Linnakangas | 4.3 | 9 | 0.9
Dave Page | 4.2 | 10 | 1.0
Marc Balmer | 3.4 | 5 | 1.7
Peter Eisentraut | 3.3 | 7 | 1.0
And the level of their knowledge is even higher:
Name | Speaker Knowledge | Votes | Deviation
Simon Riggs | 5.0 | 6 | 0.0
Magnus Hagander | 5.0 | 5 | 0.0
Heikki Linnakangas | 5.0 | 9 | 0.0
David Fetter | 4.8 | 4 | 0.5
Dave Page | 4.3 | 10 | 1.1
Peter Eisentraut | 4.3 | 7 | 1.1
Marc Balmer | 3.6 | 5 | 1.7
Even at the bottom of this list, the numbers are very good! Unfortunately, the fairly low number of votes gives a high deviation at the bottom of the chart.
For next time, we'd really like to get more people to leave their feedback. Does anybody have a good idea on how to make that happen?
Those of you who were at FOSDEM heard my bold announcement that we will finally get off the ground and hold proper elections for the board of PostgreSQL Europe, per a decision made the day before. When we got home and read up on the statutes that govern the organization, it turned out we actually can't do that. The way forward we've found around this is to change the statutes to make these things possible.
To change the statutes, we need a majority of the General Assembly, made up of the active members, to agree on a change. It turns out that given how the statutes are worded, there are currently only four members in PostgreSQL Europe, and these four members are the original board. Because of this, it was pretty easy to change the statutes (we only needed three of these people to agree, which we all did). And this we have done. The reason for changing them is to make sure we can get a working system going where more people are involved - but in order to make that work, we first had to make this small-group decision.
The changes have just been published to www.postgresql.eu, and should show up any minute there. The changes that have been made are:
As you can see, the main change is the creation of a membership fee. This is the most controversial change. The idea is that it will be a nominal fee (€10 for two years of membership - though the exact amount isn't part of the statutes, to keep flexibility). This will make it possible for us to have an actual membership list, which will be the basis for determining who can vote.
The current method for handling membership requires the signing and sending around of forms on paper (yes! Actual dead trees!). Given how our organization works, this simply does not work. That is the main driver behind us changing this.
We did evaluate several other ways to deal with membership, ranging from things like requiring physical meetings (not really a good idea for a pan-European organization) to fancy PGP signing schemes (which really is way too complex for a lot of people). Our conclusion was that this was at least the least bad way to do it.
We are going to build on this shortly with more information about exactly how the elections are going to be held. The plan is still to have completed this year's election before the summer, and we are well on track to make that goal.
The changes to the English statutes are of course available in our git repository. The binding French changes will be made shortly.
If you have any comments or questions about the changes, please direct them to the firstname.lastname@example.org mailing list, or contact the board directly.
A while back I submitted a couple of talks for PG-East 2010 in Philly, and over the past couple of weeks I've been nagging the organizers semi-frequently for some pre-info on whether I've been accepted or not, since flight prices started to climb fairly rapidly. The site clearly says that the information will be available on Feb 15th, so I can't really complain that the answer kept being "don't know yet".
A couple of days ago, Dave pinged me with a message asking if I had been approved. Turns out this press release had been posted (by his company, no less), which explicitly names me as a speaker at the conference.
Took me two more days of chasing down JD, but I now have confirmation I'll be there. I don't actually know what I'll be speaking about, but it's a pretty safe bet it will be PostgreSQL related.
I call this Time management by press releases. If I could only get it to apply to all meetings, I would no longer need to keep my own calendar up to date.
So, I'll see you in Philly!
If you've been to any of my pgcrypto talks, you know I'm a fan of exposing a login interface in the database. It's a great way to allow multiple systems to use the same authentication system, without being limited to a single framework. However, it is often convenient to be able to use the account management system of such a framework, because it's already there.
When it comes to frameworks, django is my current favorite, and I do like the ability to use its integrated admin system to edit users. For this reason, I wanted to add the ability to use a database function to authenticate other systems against the django user tables. Thanks to the fact that django chose a nicely readable text format for its hashes, and that PostgreSQL has the awesome pgcrypto library, this is pretty simple. Here's a function that will simply return true if authentication was successful, and false if not.
CREATE OR REPLACE FUNCTION django_login(_username text, _password text)
RETURNS boolean AS $$
DECLARE
   _hash text;
BEGIN
   -- Fetch the stored hash, in django's sha1$<salt>$<hexdigest> format
   SELECT password INTO _hash FROM auth_user WHERE username = _username;
   IF NOT FOUND THEN
      RETURN false;
   END IF;
   -- Recompute sha1(salt || password) and compare to the stored digest
   RETURN encode(pgcrypto.digest(split_part(_hash, '$', 2) || _password, 'sha1'), 'hex')
          = split_part(_hash, '$', 3);
END;
$$ LANGUAGE plpgsql;
This assumes you have installed pgcrypto in its own schema, something I always recommend. If you haven't, just remove the schema qualifier in the function.
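Once the function is in place, any system that can issue a query can validate a login with a single call (hypothetical credentials, of course):

```sql
-- Returns true if the password matches the stored django hash, false otherwise
SELECT django_login('alice', 'supersecret');
```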
Finally, if you use this type of authentication, remember to use SSL. And don't enable query logging...
Let me first say that I still stand by my previous comments that PostgreSQL is in much less need of a native 64-bit version on Windows than other databases - and less than PostgreSQL on other platforms. (Both this and the previous post talk about server-side support; client-side support in libpq and ODBC is obviously important.)
That said, I am now happy to say that with the effort of several people, it looks like we now have basic 64-bit support for PostgreSQL on Windows. During the past couple of weeks I have cleaned up and extended the patch from Tsutomu Yamada, which in itself builds on the work of several others, and applied it to CVS. As of today, you can build and run PostgreSQL on 64-bit Windows from the official repository.
This is obviously just the beginning. Unfortunately, Windows in general seems to be lagging behind quite severely on the 64-bit front, and this appears to be particularly true of the open source libraries. I've so far been unable to find 64-bit builds of many of the libraries we rely on for full functionality (gettext, zlib, libxml, libxslt, libiconv, tcl, uuid). While in theory support for these things should show up as soon as they are available, that's obviously not tested. If you know where I can download complete 64-bit binaries (including .lib and .h files for development) of any of these projects, please let me know.
Also, this is not very well tested yet. It passes all the PostgreSQL regression tests - but that's pretty much all the testing I've done on it. This is obviously a point where a lot of people can help - so please do. I have made a set of pre-alpha binaries available here. This is a bare-bones build of CVS head from earlier today - even things like OpenSSL that do have 64-bit binaries available are disabled. It comes with no installer - just unzip it into a directory, and use the initdb and pg_ctl commands directly. So if you have a 64-bit environment, please download and test this against your application!
As you can see, PostgreSQL has just released new updated versions, which include security fixes. They also contain other critical bug fixes, so even if you are not directly affected by the security issues, plan an upgrade as soon as possible.
One of the security issues that have been patched deals with NULL prefixes in SSL certificate names, a vulnerability that is basically the same one that has surfaced in a lot of different products this autumn, for example in the Mozilla suite of products. There isn't really enough space in the release notes to properly discuss the implications this has in a PostgreSQL environment, so I'll try to elaborate a bit here - given that I wrote the fix for it.
First of all, a quick explanation of what the problem is. PostgreSQL uses OpenSSL to deal with certificates. Prior to the fixed versions, we just asked OpenSSL for the name of the certificate, got back a string, and used it. Now, if you know C coding, you know that a string is terminated by a NULL character. The bug in PostgreSQL is that we did not check that the length reported by this function matched the length of the returned string. This means that somebody could embed a NULL character in the certificate name, and we would incorrectly parse and validate only the part before the NULL. For example, if someone managed to get a certificate with the common name set to "postgresql.bank.com\0attacker.com", PostgreSQL would match this certificate against "postgresql.bank.com" (or "*.bank.com"), which is not correct. With the fix, the certificate will be rejected completely.
It is important to know that in order to make use of this vulnerability, the attacker needs to convince a trusted CA to sign such a certificate - which is quite obviously malicious. If the attacker cannot get the CA to hand this out, PostgreSQL will reject the certificate before we even get this far. It is arguably also a bug in the CA handling (technical or procedural) to even hand out such a certificate, and that bug needs to be exploited before the one in PostgreSQL can be.
In the vast majority of cases, if not all, where PostgreSQL is deployed and actually using certificate validation, the certificates will be handed out by a trusted local CA, in which case exploiting this vulnerability becomes much harder. This scenario is significantly different from the one this bug was originally discovered in, which is the web browser. In the web browser case, the browser already trusts a large number of external CAs by default. PostgreSQL trusts no CAs by default (unless you are doing a Debian install, in which case some default CAs are put in there - another reason why that is a really bad idea from a security perspective). PostgreSQL also does not prompt the user with a potentially incorrect name field on the certificate asking if this is ok or not - it will just reject the certificate if it doesn't match (correctly or incorrectly), closing another attack vector. So the bug is really only significant if you can't trust your CA - but the whole point of the CA is that it is a trusted entity...
PostgreSQL 8.4 is the first version to properly support certificate name validation, and also the first version to support client certificate authentication, both of which are vulnerable to this bug, neither of which is enabled by default. However, previous versions are also indirectly vulnerable, because they exposed the CN field of the certificate to the application for further validation. So you could have a stored procedure checking the client certificate, or just the libpq application checking the server certificate, even in earlier versions. And given the API structure, there was no way for these outside processes to know if they were being fooled or not. So if you are using an application that makes use of this on previous versions of PostgreSQL, you still need the patch - there is no way to fix the bug from the application.
The summary of this post is that this vulnerability is a lot less serious in PostgreSQL than in many other systems that had the issue. That doesn't mean it isn't there, or that it shouldn't be (and hasn't been) fixed. But it means that this vulnerability alone is likely not reason enough to rush an upgrade on your production systems - most likely you're not affected by it. On the PostgreSQL security page it is tagged with classification A, which is the highest. This is more an indication that the system we're using for classification really doesn't take these things into consideration - something we will look into for the future.
If you back up your Slony database with pg_dump, and try to reload it on a different machine (say transfer from a production system to a testing or benchmarking system), you've probably come across this problem more than once.
The dump will include all the Slony objects - functions and triggers - which you simply cannot reload on a different machine, unless that machine also has Slony installed, and in the same way. A common way to deal with this is to just do a restore and ignore the errors - but if your database has lots of objects in it, that makes it very hard to spot actual errors - I always prefer to run with -1 and/or -e.
The first step to fix the problem is to exclude the Slony schema when dumping or restoring. That gets rid of most of the problem, but not all. There are still triggers in the main schemas that reference functions in the Slony schema, and now they will fail. Luckily, pg_restore has functionality to generate a table of contents from a dump, and then you can edit this table of contents file to exclude the triggers specifically. If your database isn't too complicated, you can easily script this.
Which brings me to the point of this post. It's actually very simple to script this, as long as the name of your Slony schema doesn't conflict with other objects in your database (including the underscore). This is something that I know a lot of people keep doing manually (given the number of questions I hear about it when I say you should always use -e when restoring, for example). So here is a small semi-generic script that will do it for you - sed to the rescue. It simply comments out all the references to your Slony schema.
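A sketch of the approach - the cluster name _myslonycluster and the sample TOC lines here are made up for illustration; in real use the TOC would of course come from your own dump:

```shell
# Build a small fake table of contents to demonstrate the edit;
# a real one is produced with: pg_restore -l yourdump.dump > toc.lst
printf '%s\n' \
  '123; 1259 16386 TRIGGER public _myslonycluster_logtrigger_1 postgres' \
  '456; 1259 16387 TABLE public mytable postgres' > toc.lst

# Comment out (prefix with ';') every TOC line that references the Slony
# schema, so that pg_restore -L skips those entries during the restore
sed 's/^\(.*_myslonycluster.*\)$/;\1/' toc.lst > toc.edited

cat toc.edited
```

The real restore would then run against the edited TOC, along the lines of pg_restore -L toc.edited -d targetdb yourdump.dump.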
I've finally had the time to summarize the feedback we received from pgday.eu.
We received feedback from about 35 people, which is obviously way less than we were hoping for. Ideas for how to improve this for next time are very welcome! This also means that the figures we have are not very exact - but they should give a general hint about what our attendees thought.
I just sent out the individual session feedback summaries to each individual speaker. These will not be published - it's of course fine for each speaker to publish his own feedback if he wants to, but the conference organizers will not publish the detailed per-session data.
The statistics we do have show that most of our speakers did a very good job, and that the attendees were in general very happy with the sessions. We have also received a fairly large amount of comments - both to the conference and the speakers - which will help us improve specific points for next year!
I'll show a couple of graphs here with the total across all sessions and speakers. In these graphs, 5 is the highest score and 1 is the lowest.
The attendees also seemed to be very happy with our speakers, which is something I'm very happy to hear about. It's also good to see that almost nobody felt the speakers didn't know very well what they were talking about - always a worry with a conference that has so many experienced community people attending.
Actually trying to figure out which speaker is best using this data is very difficult. But here's a list of the top speakers based on speaker quality, who had more than 5 ratings on their talks. The list includes all speakers with an average score of at least 3.5. There are a lot more hovering around that line, but there has to be a cutoff somewhere... Again note that there are still not that many ratings to consider, so values are pretty unstable. I've included the standard deviation as well to make sure this is visible.
Place | Speaker | Score | Stddev | Num
1 | Gavin M. Roy | 4.9 | 0.5 | 18
2 | Guillaume Lelarge | 4.9 | 0.4 | 7
3 | Robert Hodges | 4.8 | 0.4 | 13
4 | Magnus Hagander | 4.8 | 0.4 | 20
5 | Jean-Paul Argudo | 4.8 | 0.5 | 8
6 | Joshua D. Drake | 4.6 | 0.7 | 9
7 | Simon Riggs | 4.6 | 0.6 | 17
8 | Dimitri Fontaine | 4.5 | 0.5 | 14
9 | Greg Stark | 4.3 | 0.5 | 8
10 | Vincent Moreau | 4.1 | 0.6 | 8
11 | Mark Cave-Ayland | 4.0 | 0.6 | 11
12 | David Fetter | 3.9 | 1.1 | 9
13 | Gabriele Bartolini | 3.7 | 1.0 | 15
14 | Heikki Linnakangas | 3.6 | 0.7 | 9
All of these are clearly very good numbers.
So once again, a big thanks to our speakers for their good work. And also a very big thanks to those who did fill out the session feedback forms - your input is very valuable!
Update: Yes, these graphs were made with a python script calling the Google Charts API. Does anybody know of a native python library that will generate good-looking charts without having to call a remote web service?
I've finally gotten around to uploading my pictures from PGDay.EU 2009 to my smugmug gallery.
Clearly the conference was tiring, and we all needed a rest... (yes, this was during JD's talk)
And as the picture says, don't forget to submit your feedback - the site is still open for that!