Remote log reading in PostgreSQL 9.1

PostgreSQL 9.1 beta1 is now available - now is a great time to start testing it, and trying out all the great new features.

There have always been a number of ways to read your PostgreSQL logs remotely, over a libpq connection. For example, you can use the pg_read_file() function - which is what pgAdmin does. PostgreSQL 9.1 adds a new and (in some ways) more convenient way to do this - using SQL/MED.

PostgreSQL 9.1 comes with SQL standard SQL/MED functionality. The MED is short for "Management of External Data", and as the name suggests, it's about accessing data that's external to the PostgreSQL server. The SQL/MED functionality is not (yet) complete, but it's already very useful in its current state.

In SQL/MED, there is something called a Foreign Data Wrapper, which can be compared to a driver. Using this FDW, we can create one or more Foreign Servers, each of which is a definition of how to connect to a specific instance of the service - if any. Finally, we can create one or more Foreign Tables on each of the Foreign Servers, giving us direct access to the remote data using SQL.
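As a rough illustration of the wrapper / server / table layering, here is a minimal sketch that prints the SQL for exposing one CSV log file as a foreign table. It assumes the file_fdw contrib module is installed, that the server writes csvlog output, and that the column list matches the 9.1 csvlog format. The names log_fdw_sql, local_logs and postgres_log, and the log file path in the usage comment, are made up for the example.

```shell
#!/bin/sh
# Sketch only: print the SQL that maps one csvlog file to a foreign table.
# Assumes file_fdw is available and the file is in the 9.1 csvlog format.
# The path is pasted into the SQL as-is, so it must not contain quotes.
log_fdw_sql() {
    logfile="$1"
    cat <<EOF
CREATE EXTENSION file_fdw;
-- file_fdw has nothing to connect to, so the server takes no options
CREATE SERVER local_logs FOREIGN DATA WRAPPER file_fdw;
CREATE FOREIGN TABLE postgres_log (
  log_time timestamp(3) with time zone,
  user_name text,
  database_name text,
  process_id integer,
  connection_from text,
  session_id text,
  session_line_num bigint,
  command_tag text,
  session_start_time timestamp with time zone,
  virtual_transaction_id text,
  transaction_id bigint,
  error_severity text,
  sql_state_code text,
  message text,
  detail text,
  hint text,
  internal_query text,
  internal_query_pos integer,
  context text,
  query text,
  query_pos integer,
  location text,
  application_name text
) SERVER local_logs
  OPTIONS (filename '${logfile}', format 'csv');
EOF
}

# Hypothetical usage, run as a superuser:
#   log_fdw_sql /var/lib/pgsql/data/pg_log/postgresql.csv | psql -U postgres
```

Once the table exists, any libpq client can read the log with an ordinary query, for example SELECT log_time, error_severity, message FROM postgres_log.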


Joining the PostgreSQL Core Team

As has just been announced here, I was recently invited to join pgsql-core, and have accepted.

I guess the guys currently on it finally got tired of all my complaints, and figured out the way to make me stop was to suck me into the organization. The future will tell if their strategy will be successful or not...

For those who don't know (this hopefully doesn't include my readers on Planet PostgreSQL), pgsql-core is the "steering committee" for the PostgreSQL project. Exactly what they do seems to be somewhat up for debate both outside and inside of the group itself, but it at least has something to do with the leadership of the project...

Anyway, I'd like to thank the guys in the group for showing this trust in me, and shall do my best not to screw it up!

PGConf.EU 2011 will be held in Amsterdam in October

It's time to mark your calendars: PostgreSQL Conference Europe 2011 (formerly known as PGDay.EU) will be held on October 18-21 at the Casa400 Hotel in Amsterdam, The Netherlands.

Like last year, the conference will be held in a hotel venue, combining both the conference rooms and guest rooms, so you don't have to waste any time finding your way around the city. As in previous years, the conference will include fully catered coffee breaks and lunches, to make the most of the time. The first day of the conference will be a training day, and the following three days will be regular conference tracks. The conference will accept talks in English, Dutch, German and French, to benefit those attendees who prefer talks in their native language.

We are just starting our search for sponsors - if you are interested in sponsoring the conference, or know someone who is, please take a look at our sponsorship opportunities and don't hesitate to contact us if you have any questions or would like to propose an alternative arrangement.

We will also follow up with a call for papers later, and in due course open for registration and post a conference schedule. For now, mark the dates, and follow the news on our website and on our twitter stream @pgconfeu.

Training at the increasingly misnamed PgEast

Next week it's time for PgEast: 2011, this time in New York City.

I've already outlined why the East part of "PostgreSQL Conference East" (as it was called at the time) is incorrect: as is obvious to anybody with a basic knowledge of geography, the conference is to the west. From what I can tell, it's approximately 74 degrees west of zero, which means it's more than 20% of the world to the west.

In expanding this scope, it seems JD has this year decided to get the rest of the name wrong as well, in a bid to get more people. Just like it's 20% of the world wrong in location, it's no longer a PostgreSQL conference. Instead it's more of a cross-database conference, with an entire track dedicated to MongoDB (incidentally, approximately 20% of the tracks, it seems). Is that bad? Absolutely not - I'm looking forward to sneaking in on one or two of those MongoDB talks. But I think it means we have to go back to the proper name for the conference - JDCon-East!

And I'm sorry JD, but whatever numbers you get, you will not be the biggest PostgreSQL conference around. We are going to have to leave that title where it belongs - with the Brazilians (for now).

This year, the conference is also running a full 7 parallel training sessions the day before the actual conference. As part of this, I'm giving a half-day training on Streaming Replication and Hot Standby. If you haven't registered for it already, there are still seats open! And tell your friends - since this is how my trip there gets funded, I'd really like to get a full session...

I will also be giving a talk during the regular conference, Data Driven Cache Invalidation.

There's plenty of PostgreSQL - and MongoDB - around for everybody at this conference, so if you're anywhere near New York City, there is no reason not to be there!

New host for planet.postgresql.org

This post is to confirm that planet.postgresql.org is now running off a new host.

If you clicked a link a while ago and got an error, and can now see this, that just means your DNS has now refreshed...

Yes, the mailinglists are down

and along with them, a few other services.

From what we can tell, what has happened is that the datacenter that hub.org hosts most of their servers in, in Panama, dropped completely off the Internet several hours back. The PostgreSQL mailinglists are managed by hub.org, and are tied into their main infrastructure. For this reason, there is nothing the rest of the sysadmin team can do other than wait for the situation to resolve, and we unfortunately have no chance to bring up any backup servers anywhere.

As an added unfortunate bonus, it seems at least one of the hub.org nameservers is still running an incorrectly configured DNS zone file. This means that while this server is geographically hosted elsewhere, like it should be, email will get delivered to that host and then bounce saying that the postgresql.org domain does not exist. This is incorrect - the domain itself exists and works perfectly well, and if it wasn't for this incorrect zone file mail would be queued up and delivered once the main datacenter is back up.

Along with the lists, a few other services hosted with hub.org are currently unavailable - pgfoundry.org, pugs.postgresql.org, the developer documentation, jdbc.postgresql.org and possibly some other minor services.

All other infrastructure services are operating properly, including the website and the download mirrors.

Please be patient as we wait for hub.org to resolve this issue. For any up-to-date status information your best bet is the #postgresql IRC channel on FreeNode - but people are unlikely to be able to provide any information beyond "it's down, and we're waiting for hub.org".

Another step towards easier backups

Today I committed the first version of a new PostgreSQL tool, pg_basebackup. The backend support was committed a couple of weeks back, but this is the first actual frontend.

The goal of this tool is to make base backups easier to create, because they are unnecessarily complex in a lot of cases. Base backups are also used as the foundation for setting up streaming replication slaves in PostgreSQL, so the tool will be quite useful there as well. The most common way of taking a base backup today is something like (don't run this straight off, it's not tested, there are likely typos):

psql -U postgres -c "SELECT pg_start_backup('base backup')"
if [ "$?" != "0" ]; then
   echo Broken
   exit 1
fi
tar cfz /some/where/base.tar.gz --exclude "*pg_xlog*" /var/lib/pgsql/data
if [ "$?" != "0" ]; then
   echo Broken
   psql -U postgres -c "SELECT pg_stop_backup()"
   exit 1
fi
psql -U postgres -c "SELECT pg_stop_backup()"
if [ "$?" != "0" ]; then
   echo Broken
   exit 1
fi

And when you're setting up a replication slave, it might look something like this:

psql -U postgres -h masterserver -c "SELECT pg_start_backup('replication base', 't')"
if [ "$?" != "0" ]; then
   echo Broken
   exit 1
fi
rsync -avz --delete --progress postgres@masterserver:/var/lib/pgsql/data /var/lib/pgsql
if [ "$?" != "0" ]; then
   echo Broken
   psql -U postgres -h masterserver -c "SELECT pg_stop_backup()"
   exit 1
fi
psql -U postgres -h masterserver -c "SELECT pg_stop_backup()"
if [ "$?" != "0" ]; then
   echo Broken
   exit 1
fi

There are obvious variations - for example, I come across a lot of cases where people don't bother checking exit codes. Particularly for the backups, this is really dangerous.
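To make the exit code handling a bit harder to forget, the tar variant can be wrapped in a single function. This is only a sketch under the same assumptions as the scripts above (local server, postgres superuser, GNU tar); the function name take_base_backup and its two arguments are made up for the example, and it is just as untested against a real cluster.

```shell
#!/bin/sh
# Sketch: tar-based base backup where every step's exit code is checked
# and pg_stop_backup() always runs once pg_start_backup() has succeeded.
take_base_backup() {
    target="$1"    # tarball to write
    datadir="$2"   # data directory to back up

    if ! psql -U postgres -c "SELECT pg_start_backup('base backup')"; then
        echo "pg_start_backup failed" >&2
        return 1
    fi

    status=0
    # --exclude is given before the path so it applies to it
    if ! tar cfz "$target" --exclude "*pg_xlog*" "$datadir"; then
        echo "tar failed" >&2
        status=1
    fi

    # Leave backup mode even when tar failed
    if ! psql -U postgres -c "SELECT pg_stop_backup()"; then
        echo "pg_stop_backup failed" >&2
        status=1
    fi

    return $status
}

# Hypothetical usage:
#   take_base_backup /some/where/base.tar.gz /var/lib/pgsql/data || exit 1
```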

Now, with the new tool, both these cases become a lot simpler:

pg_basebackup -U postgres -D /some/where -Ft -Z9

That simple. -Ft makes the system write the output as a tarfile (actually, multiple tar files if you have multiple tablespaces, something the "old style" examples up top don't take into account). -Z enables gzip compression. The rest should be obvious...

In the second example - replication - you don't want a tarfile, and you don't want it on the same machine. Again, both are easily handled:

pg_basebackup -U postgres -h masterserver -D /var/lib/pgsql/data

That's it. You can also add -P to get a progress report (which you can normally not get out of tar or rsync, except on an individual file basis), and a host of other options.
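For the replication case, the copy is only the first half of the job; the new data directory also needs a recovery.conf before it will start as a standby. A minimal sketch of the two steps together, assuming the master already accepts replication connections for the postgres user (max_wal_senders and a replication entry in pg_hba.conf) and that the target directory is empty or missing. The function name clone_standby is made up for the example.

```shell
#!/bin/sh
# Sketch: clone a master with pg_basebackup, then write a minimal
# recovery.conf so the copy starts up as a streaming replication standby.
clone_standby() {
    master="$1"    # host name of the master
    datadir="$2"   # empty or nonexistent target data directory

    # -P prints a progress report while the copy runs
    pg_basebackup -U postgres -h "$master" -D "$datadir" -P || return 1

    cat > "$datadir/recovery.conf" <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$master user=postgres'
EOF
}

# Hypothetical usage:
#   clone_standby masterserver /var/lib/pgsql/data || exit 1
```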

This is not going to be a tool that suits everybody. The current method is complex, but it is also fantastically flexible, letting you set things up in very environment-specific ways. That is why we are absolutely not removing any of the old ways; this is just an additional way to do it.

If you grab a current snapshot, you will have the tool available in the bin directory, and it will of course also be included in the next alpha version of 9.1. Testing and feedback are much appreciated!

There are obviously things left to do to make this even better. A few of the things being worked on are:

  • Ability to run multiple parallel base backups. Currently, only one is allowed, but this is mainly a restriction based on the old method. Heikki Linnakangas has already written a patch that does this, that's just pending some more review.
  • Ability to include all the required xlog files in the dump, in order to create a complete "full backup". Currently, you still need to set up log archiving for full Point In Time Recovery, even if you don't really need it. We hope to get rid of this requirement before 9.1. Another option is to stream the required transaction logs during the backup, not needing to include them in the archive at all. This is less likely to hit until 9.2.
  • The ability to switch WAL level as necessary. For PITR or replication to work, wal_level must be set to archive or hot_standby, and changing this requires a restart of the server. The hope is to eventually be able to bump this from the default (minimal) at the start of the backup, and turn it back down when the backup is done. This is definitely not on the radar until 9.2 though.
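Until the WAL level can be switched on the fly, it can be worth checking it before starting a backup. A minimal sketch, assuming a local server and the postgres superuser; the function name wal_level_ok is made up for the example.

```shell
#!/bin/sh
# Sketch: succeed only if wal_level allows PITR or replication.
wal_level_ok() {
    # -At gives unaligned, tuples-only output, so only the value is printed
    level=$(psql -U postgres -At -c "SHOW wal_level") || return 2
    case "$level" in
        archive|hot_standby)
            return 0
            ;;
        *)
            echo "wal_level is '$level', need archive or hot_standby" >&2
            return 1
            ;;
    esac
}

# Hypothetical usage:
#   wal_level_ok || exit 1
```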

pgindent vs dash

For those who don't know, pgindent is the tool used to indent the source code of PostgreSQL. dash is the shell that ships as /bin/sh on at least Ubuntu.

pgindent requires indent from BSD (we use a patched version from NetBSD, the source is available on the PostgreSQL ftp site), and specifically does not work with GNU indent. Guess what Ubuntu ships with.

The solution is of course a small script that runs BSD indent from a different directory, and also passes in the typedefs.list file from the PostgreSQL git repo. Something like this:

#!/bin/sh

export PATH=src/tools/pgindent:$PATH
src/tools/pgindent/pgindent src/tools/pgindent/typedefs.list "$@"

Spot the error? Yeah, that calls /bin/sh, which is dash. Which gives some really interesting results with pgindent, none of which are what you expect.

So if you run pgindent through a script like this, be sure to use /bin/bash and not /bin/sh!

Feedback from PGDay.EU the final part - the venue and registration

The big change for PGDay.EU this year really was the switch from a university venue (first Monash University in Prato, then ParisTech in Paris) to a hotel venue (The Millennium Hotel in Stuttgart). We believe that much of the rest of the conference was an improvement over previous years - but it was an incremental improvement, whereas the change of venue was rather drastic. Looking at the feedback on this, I think we can conclude that this change was in general a positive one:

We're seeing a total of 75% who rate the venue as a 4 or a 5. Looking at the freetext comments, a large majority of them are very positive, but there are a few that stand out:

  • Several people mentioned it was bad that the two sets of rooms (Berlin vs non-Berlin rooms) were very far apart. This is definitely something that we noted, and will attempt to avoid next year.
  • A few people mentioned that it would be nice if the hotel was closer to the city center. This is definitely true - unfortunately, closer to the city center means higher prices. We hope to find something closer to a city center at a reasonable price next year - by making sure we start to look and book early enough.
  • A few people commented that we shouldn't hold this in northern/central Europe in December due to weather (snow anyone?). Our goal is to move the conference back to an earlier date during the autumn - again, the main reason we ended up in December this year was that we started looking for a venue too late.
  • A couple of people commented that the hotel room rates were too high at the Millennium. There were cheaper hotels around to use - but of course, those aren't as convenient. This wasn't helped by the fact that the hotel group rate dropped off the hotel website twice, causing some people to get their reservations at a higher rate.
  • Isolated people commented that they did not like the hotel - "too big, unpersonal" and "feels like a prison".

Amongst the positive ones we find a large number of comments saying that the "integrated venue" or "all inclusive" venue was a great step up.

Closely related to the venue is the food. Unlike the big North American conferences PGCon and PG-East/West, we have for the past two years tried to provide proper lunches and not just sandwiches/boxed lunches. This obviously costs more money, but we believe it's worth it, and we think our visitors do too. Last year we had a catering firm bring us assorted food, mainly cold cuts, at the conference venue, and this year we got proper lunch buffets (including multiple choices for dessert, of course...) at one of the hotel restaurants. I think the ratings speak for themselves - I would encourage those other conferences to look into improving their lunches as well!

A full 82% rated the food as 4 or 5. In the end, paying for lunch "on one's own bill" would probably have cost more than half the conference fee - so we think we managed to provide some very good value. In fact, several people rated the food as being the best part of the conference(!)

There was, however, one person who said the food was one of the worst things about the conference - if you recognize that was you, we would very much like to know exactly why (no details were included) - please send me an email or write a comment here!

A few people commented on the large amount of food left over from lunch on at least one of the days - it is up to the hotel to decide what to do about that, but it is our belief that they do something "reasonable" with it - and not just throw it away. We know that the caterers last year delivered all leftovers to a nearby homeless shelter, for example. For next year, we will attempt to again get a specification from the catering/restaurant as to what happens to leftovers.

We feel that the overwhelming majority of our visitors found the changes an improvement, and we will therefore pursue something similar as our primary option for next year. We are always interested in improving further, of course, so if you have any other ideas - let us know! The final question we asked about the venue was where to hold the conference next year. Many were quite ambiguous in their suggestions ("big city in Europe" is in, "Hawaii" is out because we want to stick to Europe). Summarizing what we could gave us the following:

  • Obviously, we see a bias towards Germany - since we were in Germany this time. However, we are only going back to Germany next year as a last resort - we want to move around. We will eventually come back to Germany of course - but not next year.
  • Some people commented that they will not be able to attend in a country other than Germany because they wouldn't understand the language of the talks. To deal with this, we are considering adding talks in languages other than the local one and English for next year, independent of where it is - so German talks (along with French and maybe Spanish) would be included even if the conference isn't in Germany.
  • Our German community is also looking into creating a specific PGDay Germany next year, which will be a smaller event focused on the local market - something we as PostgreSQL Europe will help and encourage.
  • I'm surprised to find Stockholm so high up on the list - I promise I didn't put any of those votes in there myself!
  • It's good to note that all the cities having 2 or more suggestions were already on our list of places to look at for next year.
  • We will consider this input and start looking for venues. This time we will not attempt to decide and announce a city first and find a venue later, we'll do it in the other order.

The final part of our evaluation was considering the conference website and registration:

In general these are very good ratings. I'm happy to see that more than 50% rate the website overall experience as 4 or 5 - that's a much better rating than it's being given by the people who edit the content on it! Same for registration, with very few people rating it really low. There's clearly some room for improvement though:

  • A few people commented they wanted non-PayPal registration options. While the PayPal system we use actually allows you to do a credit card payment without the need to sign up for PayPal (which some people did not realize and thus sent us an email before registering asking about it), not everybody has a credit card (this is not America - or Sweden). We'd be very happy to hear suggestions for what to do here though - we've looked at many different options, and PayPal turned out to be by far the best one. We need something that supports automation and is reasonably fast. We did also support bank transfer in extraordinary cases - but that's not something that can be automated (unless you are a much bigger customer to the bank than we are), and it takes a long time for some payments, since they have to cross borders. So - any suggestions are welcome, and our core registration system is designed to support multiple payment methods.
  • Nobody actually wrote in the conference feedback that we lack a good interface for bulk registration, but we are aware of this - we had a few (fewer than 10 in total) entities wanting to register more than 2-3 persons at the same time for a single invoice, and our current system does not provide a reasonable way of dealing with this. This is definitely something we need to work on for next year.
  • It's been suggested we add a "skill level" entry to each talk, to make it easier for an attendee to know if it's a beginner or advanced talk. This is definitely something we'll look at doing for next year.
  • One suggestion is we include a full list of all attendees including their email address in the conference handouts, to make it easier to contact each other. This is not something we're going to do as a general thing, since we don't want to go distributing such lists. But we may consider adding it as an opt-in feature, where you can choose on registration if you want to be included in such a list.
  • Several people suggested adding videos of the talks - either as realtime streaming or as downloadables. We're not likely to add a real-time streaming, but we are considering doing talk recording. It does add a fairly large amount of work though, so we'll be needing more volunteers to cope with it...
  • We need to make it clearer that 5 is the best and 1 is the worst on the feedback forms. We know a few people filled them in wrong (we hope it meant they gave us bad ratings when they meant good, but we don't know that), and it was also mentioned in the feedback.

In summary, here are some reasons, in graphical and textual form, why you should already put attendance at next year's PostgreSQL Conference Europe in your budget:

Freetext comments:

  • "The overall organization of that event was excellent."
  • "Very good organization, great people, interesting talks, vibrant community in general. Lots of core dev presents, high level of knowledge."
  • "Great organization from beginning (registration at the website, information prior to the event), arriving and registering (internet access already available, great t-shirt and backpack) to the conference itself (sessions, warning speakers about how much time is left), good food and drinks at the breaks and at lunch. Kudos to the organizers and everyone who helped make this happen."
  • "I think the organisation was perfect. There where many people and all know where they had to go to."
  • "The huge amount of information, inspiration and positive energy. Actually I hacked my first patch on the way back."
  • "The people especially the staff :) Both keynotes were stimulating good dsicussions with my peers"
  • "Very good conference. I felt really cosy there. As a noob to PG, I got a lot of information and I lost the fear of asking the experts (either on the mailing list or on IRC)."
  • "The organization was really great. Maybe the best PostgreSQL conference I've attended so far."

That concludes my summaries of the feedback from this year's PGDay.EU conference. If your specific comments haven't been called out here, don't worry - we still read them all and will consider them all for next year!

Finally, thanks again to all who helped make this conference great!

See you again next year!

Feedback from PGDay.EU - the speakers

The next issue of my "pie-chart-overflow blog posts about PGDay feedback" is about our speakers. The speakers are, if that's not obvious, the reason that people come to the conference. Having good speakers is an absolute requirement if we want to keep up the quality of the conference. Other things like venue and price are certainly important, but nothing compares to the actual content of the conference - which is provided by our speakers.

I'm very happy to say that we seem to have managed to keep the very high numbers for Speaker Quality that we had last year (differing by less than 3%, which is well within the margin of error). The same goes for the scores our speakers got on their knowledge of the topic - indicating that we've managed to attract some of the most skilled speakers in the world. Which is not surprising given that in many cases, the person speaking about a feature is actually the guy who wrote it. What is more surprising is that these same people are rated as very good speakers - which we all know isn't always true about your stereotypical developer.

Just like last year, we're not going to post the complete list of speaker ratings, given that they are easy to misread. But here is a list of our top speakers, excluding any who had fewer than 5 ratings. Scores based on fewer than 10 ratings should be considered very uncertain, and I've again included the standard deviation as an indication of the uncertainty. We had a lot more speakers this year, so I have only included those scoring 4 or above this time around. Each speaker has received his own detailed score, of course.

Place | Speaker | Quality Score | Standard deviation | Number of votes
1 | Dimitri Fontaine | 4.8 | 0.5 | 8
2 | Mason Sharp | 4.7 | 0.9 | 11
2 | Magnus Hagander | 4.7 | 0.7 | 29
4 | Simon Riggs | 4.6 | 0.7 | 52
4 | Simon Phipps | 4.6 | 0.9 | 45
6 | Andreas Scherbaum | 4.5 | 0.7 | 34
6 | Ed Boyajian | 4.5 | 1.1 | 33
8 | Bruce Momjian | 4.4 | 0.9 | 54
8 | Gianni Ciolli | 4.4 | 0.8 | 38
8 | Tim Bunce | 4.4 | 1.0 | 10
11 | Jan Aleman | 4.2 | 1.0 | 11
12 | Tim Child | 4.1 | 0.8 | 9
12 | Michael Meskes | 4.1 | 1.2 | 10
14 | Bernd Helmle | 4.0 | 0.6 | 6
14 | Heikki Linnakangas | 4.0 | 0.8 | 30
14 | Linas Virbalas | 4.0 | 0.9 | 10

The list based on Speaker Knowledge looks slightly different, but not very much. Given that our speaker knowledge has been rated even higher than speaker quality, I've only included those who scored 4.6 or higher (which is a fantastically high cutoff).

Place | Speaker | Knowledge Score | Standard deviation | Number of votes
1 | Tim Child | 5 | 0 | 9
2 | Joe Conway | 4.9 | 0.3 | 10
3 | Simon Riggs | 4.8 | 0.7 | 52
3 | Linas Virbalas | 4.8 | 0.4 | 9
3 | Magnus Hagander | 4.8 | 0.8 | 29
3 | Dimitri Fontaine | 4.8 | 0.5 | 8
7 | Andreas Scherbaum | 4.7 | 0.8 | 34
7 | Bruce Momjian | 4.7 | 1.0 | 53
9 | Mason Sharp | 4.6 | 1.2 | 11
9 | Heikki Linnakangas | 4.6 | 0.8 | 30
9 | Simon Phipps | 4.6 | 1.0 | 45
9 | Gianni Ciolli | 4.6 | 0.8 | 38
9 | Tim Bunce | 4.6 | 1.3 | 10
9 | David Fetter | 4.6 | 0.6 | 16

A great big thanks to all our speakers - you did a fantastic job.

We will need to work hard to keep up our recruiting of speakers for next year. If you were considering but decided not to submit a talk for some reason - please let us know why, so we can improve! The same goes if you have any ideas in general on our processes around this. For example, we had no female speakers at all this year - we know you're out there, and we certainly want you there, so what do we need to change to make this more interesting for you as a potential speaker? The same goes for other groups that we were missing of course: now is the time to let us know so we have the time to change things before next year!
