Viewing entries tagged with postgresql. Return to full view.

Todays trivial git tip

If you're like me, you are using the git.postgresql.org repository to do your PostgreSQL development, because it's much nicer to work with than CVS.

If you're also like me, you like git diff with it's nice coloring and trailing-whitespace-warnings and such features. But you're a little bit annoyed that your tabs come out as 8-characters, when the PostgreSQL source uses 4-character tabs, making diffs a bit hard to read. But if you just pipe the output to less or something, the coloring goes away.

I finally got around to looking for a way to fix that today. And it took me all of 2 minutes to find it - I really should've done this before. Put the following in your .git/config file:

[core]
  pager = less -x4

and it'll show you the diffs with 4-space tabs.

Trivial, yes. But it took me this long to even look at fixing it, so hopefully this can help someone...

Last chance to speak at FOSDEM devroom!

The CFP for our FOSDEM devroom ends in a couple of days!

If you haven't submitted your talk yet, now is the time to do so! We still have a few slots open. We are interested in both full-length, half-length and lightning talks!

And if you weren't planning to submit a talk - now is the time to rethink! We want a good mix of different types of talks, ranging from hacker talks to user experiences.

So, head over to the wiki and read up on the details, and submit your talk!

Automatically dropping and creating constraints

I do this fairly often, but after talking to some other people I realized it might be a good idea to share a couple of quick SQL scripts. The idea is: you have some large operations you are doing on your database. Of the kind where you load or update lots of data - which means that it will run a lot faster with the constraints in the database turned off. All of them, or maybe just the FOREIGN KEYs, depending on exactly what you are doing. The following simple queries will generate SQL scripts that drop all your constraints, and then re-create them. Adding filters for just a single table/namespace/constrainttype is trivial, but left as an exercise for the reader...

To generate a script to drop constraints:


SELECT 'ALTER TABLE "'||nspname||'"."'||relname||'" DROP CONSTRAINT "'||conname||'";'
 FROM pg_constraint 
 INNER JOIN pg_class ON conrelid=pg_class.oid 
 INNER JOIN pg_namespace ON pg_namespace.oid=pg_class.relnamespace 
 ORDER BY CASE WHEN contype='f' THEN 0 ELSE 1 END,contype,nspname,relname,conname

Note that the order is important - we must drop FOREIGN KEYs before we drop PRIMARY KEYs and UNIQUE constraints, since they depend on each other.

To generate a script to reload the constraints:


SELECT 'ALTER TABLE "'||nspname||'"."'||relname||'" ADD CONSTRAINT "'||conname||'" '||
   pg_get_constraintdef(pg_constraint.oid)||';'
 FROM pg_constraint
 INNER JOIN pg_class ON conrelid=pg_class.oid
 INNER JOIN pg_namespace ON pg_namespace.oid=pg_class.relnamespace
 ORDER BY CASE WHEN contype='f' THEN 0 ELSE 1 END DESC,contype DESC,nspname DESC,relname DESC,conname DESC;

Again, note the reversed order that is necessary so we create PRIMARY KEYs and UNIQUE constraints before we create the FOREIGN KEYs.

The easiest way to turn these commands into proper scripts is to just use the psql output feature:


mydb=# \t
Showing only tuples.
mydb=# \o drop_constraints.sql
mydb=# SELECT ...
mydb=# \o create_constraints.sql
mydb=# SELECT ...

You can then just load those files as include files in psql. Obviously, you need to script both the DROP and CREATE statements before you actually drop something.

And the second obvious note - always run these things in a transaction. Take advantage of the transactional DDL in PostgreSQL - as always. I often run the whole thing - drop, bulk work, recreat - inside a single transaction. That way, applications may not even need to know there is service happening...

Finally, in some databases you may need to add further restrictions, and always verify the scripts before you run them :)

A Thunderbird extension for the archives

I repeatedly find myself wanting to reference an email in the PostgreSQL mailinglist archives either in a different email or in an IM or IRC conversation. The usual two methods for this have been to either look up the messageid from view-source in Thunderbird, or to go find the message by fulltext search either on our archives or one of the other big sites. I got tired of that, and decided it was good time to make my first try ever at writing a Thunderbird extension. It wasn't particularly hard, though finding the level of newbie documentation I required proved to be a bit of a challenge.

The current version is hardcoded to run on Unix - it does require an /etc/alternatives/x-www-browser. That just means there is room for improvement :-)

Here's a screenshot of how to use it:

http://www.smugmug.com/photos/432794002_w2JdG-X3.jpg border="2"

Should be fairly obvious what to do...

You can get the extension itself at http://www.hagander.net/extensions/

Visibility map arrives

Since Heikki doesn't have a blog of his own, I will take it upon myself to blog about this one: he has just committed the initial version of the visibility map feature for PostgreSQL 8.4.

In short, this means that VACUUM will now only need to visit pages in a table that has actually changed. So if you have a large table where only a very few rows change, VACUUM will require significantly less CPU and I/O to run. With this lower cost, it's also possible to run VACUUM more often on large tables - if nothing at all has changed, it's more or less a no-op!

In the future, the hope is to be able to build on this patch to implement things like index-only scans - but that's for another version.

Yet another great feature for 8.4 - this is shaping up to be a very good release I think.

On the topic of release quality

This post was inspired by Montys post about MySQL release 5.1, and the many discussions it has created both around the web and offline. The question mainly being - why has nobody made such a post about PostgreSQL yet? Is it because it hasn't happened, or just because nobody has posted about it.

We all know that there is no such thing as bug-free software. This obviously includes both PostgreSQL and MySQL, as well as all the commercial competitors - claiming anything else is clearly untrue. But what I find remarkable from Montys post are mainly:

  • MySQL 5.1 has been released with known critical bugs (crash/wrong result). As far as I know, this has never been done with PostgreSQL (at least if we're talking "modern times", since after the product became reasonably stable). And certainly not the number of issues that Monty has listed for MySQL 5.1 - it's not just one or two!
  • There are also critical bugs that were present in 5.0, that still haven't been fixed in 5.1.
  • We already know that MySQL 5.0 was released "too early". We already know that MySQL 5.1 was declared RC "too early". It's remarkable that 5.1 was also released "too early" in that case - and for non-technical reasons again. In theory, this "can't happen" with the PostgreSQL release model, since it's based only on when the features are "ready", not when you need a new release for some other reasons. That's in theory. I know in the past we have tried to schedule releases around certain conferences and such, for better announcement effects. In practice, though, I think this has only led to a release being postponed, never being rushed.
  • MySQL apparantly keep some bug reports hidden from the public (the referenced bug 37936 for example). How's that for open... I approve of keeping them hidden for security bugs only - but if that bug is a security bug, it's clearly taken way too long to fix, given the dates on bugs around it.
  • I still think they've designed their version numbering system to deliberately confuse the customers about what is a beta, what is a release candidate and what is a release. What is so easy with either labeling them as PostgreSQL does (8.3beta, 8.3RC, 8.3.0), or using the in open source popular system of using even-numbered releases for stable releases and odd numbers for beta/testing releases?
  • It took them over a year to get from Release Candidate to release.

Now, there are several posts I found that are questioning Montys post, saying that the quality is just fine - and backing this up with actual experiences in deploying 5.1. I do think both sides are right here - it's perfectly possible to deploy 5.1 without hitting these bugs, as they are "corner-case" issues. But that does not decrease the importance of having releases without known bugs in them. And if there are known bugs, they should at least be listed very clearly in the release notes/announcement. Not doing this is, IMHO, simply irresponsible. Especially least for a database server which is supposed to safeguard all your work...

So how does PostgreSQL measure up

Continue reading

New planet administration interface

As of a couple of minutes ago, it is now possible for people who have their blogs aggregated on Planet PostgreSQL to administer their registration information online - no more need to send an email to us every time (but you can still do that if you want - planet(at)postgresql.org is the address to use!). There's a link at the bottom righthand side of the frontpage of Planet PostgreSQL to get there, and it uses your pre-existing community account to log in.

The functionality so far is fairly limited, but it's currently possible to: * Register a new blog for aggregation (needs approval) * Remove a blog from aggregation * Delete individual posts (will cause them to reload if they're still in the RSS feed) * Hide individual posts (that way they won't get reloaded) * Registered users will automatically get added to (and removed from) a mailinglist to receive announcements, discuss policy etc

That's it for now. We're happy to hear more suggestions for things to add to this interface.

When installing this, I have mapped all users I could to their existing community accounts. However, there are a number of blogs I failed to map either because they don't have an account, or because I couldn't figure out which one it is. I would ask these people to either go into the administration interface and register an account (you can attach an existing blog there - will need approval before it goes through) or just email me your community login (if you don't have one yet, please sign up for one). This way you will end up on the mailinglist for announce messages. The following blogs are currently without a userid: * Aurynn Shaw * Benjamin Reed * Chris Smith * Christopher Kings-Lynne * Dave Cramer * Enver Altn * Frank Wiles * Gavin M. Roy * Gavin Sherry * Ian Barwick * Jon Jensen * Kenneth Downs * Kenny Gorman * Leif B. Kristensen * Liam O'Duibhir * Ow Mun Heng * Paul Silveria * Robert Hodges * Robert Lor * Satoshi Nagayasu * Tom Copeland * Usama Munir Dar

I realize a number of these blogs are currently broken, due to the crash of people.planetpostgresql.org. If your blog was there, and you want a new one there, contact [mailto:devrim@commandprompt.com Devrim]. Otherwise, if you want to move it to another hosting like Blogger or Wordpress, like a number of people have done, just let us know and we'll update the address of your subscription (but we still want your community login!). Finally, if you want to blog removed instead of updated, send us an email (in this case you don't need to sign up for a community account, of course).

The interface is fairly ugly at this time - someone promised to work on those templates though, so stay tuned for something prettier...

Recovering blog data

Per indications from Devrim, I have given up on getting my blog entries back from the old planet machine. If they do show up, that would be a happy surprise, but I now consider it a very remote chance that it will.

Instead, I have now recovered some of my blog posts using things like google cache, the wayback machine and such methods. I will be cleaning up the old posts and turning them visible one by one over the next couple of days... If you happen to have some of my older posts saved away somewhere, please let me know :) And if you spot formatting errors in something you find, then let me know that as well.

Finally, if you're coming to this post from a redirect. Sorry, all the old URLs are "gone", and I don't know how to get a redirect for them. Please use the archive to browse to the post you were looking for.

Update: A bit quicker than I thought, my script did a fairly good job. I've turned everything I managed to recover visible by now...

PostgreSQL Europe website finally up

The website for PostgreSQL Europe is finally up!

It's long overdue - we really should've had this up before the summer. But better late than never.

And please - your contributions towards it are much appreciated! Right now it's all static content handled through a django template framework, and adding such content is as simple as adding a static HTML file. So if you're interested in contributing content, please look it over and see what you can do (the whole source is of course available in our git repository). And if you're not comfortable in HTML - just send us the text you think should go on there and we'll find someone to do the markup!

PostgreSQL SSL code updates

I am currently working on several updates to the POstgreSQL SSL code, to make it more secure and add some functionality. I'd be interested to hear from people who are either using this today, or are interested in using the new functionality - there is still room to make further adjustments to the code before the release.

Certificate validation in libpq

This patch was applied today. The idea is to be able to control how the certificate validation is done in libpq. Previously, libpq would verify the server certificate if a root certificate file was found, and otherwise never do it. This made the system very fragile. And it would never attempt to verify that the certificate actually matched the server.

With this patch, there is now a new connection parameter sslverify, that controls this behavior. It's all controlled by this parameter, and never by just checking the existance of a file. It can have the following values:
cn : Default. Verify that the certificate chains to a trusted root, and that the server name matches.
cert : Verify that the certificate chains to a trusted root, but ignore the name.
none : Disable certificate verification completely.

The version that is committed does not support subject alternate names or wildcard certificates. It's something I am hoping to have the time to add before the release. Feel free to send me a patch ;-)

Requiring a client certificate

This patch is currently pending review in this commitfest. The idea here is to move from having the requesting of client certificate to be controlled by if the root certificate file exists or not, to it being an explicit configuration variable. This makes it much more secure against "admin mistakes" - explicit configuration is always better when it comes to security.

This patch builds on the changes to the pg_hba.conf file, and therefor just adds a connection option to the hostssl rows (obviously you can only require client certificates on SSL connections). Set it to 1 to require client certificates. Of course, it also needs the root certificate file to be present.

Having this in pg_hba.conf also makes it possible to configure this value differently depending on which addresses your client are connecting from, if required.

Client certificate authentication

This patch is pending some final cleanups before I post it. The idea here is, obviously, to be able to use your SSL client certificate to perform the actual authentication, thus doing away with the need to have a password as well. Given that our client certificate code already supports for example smartcards (through OpenSSL), this can be a high security option for remote logins. I'm sure there are other usecases as well - it's a feature that have been asked for more than once.

I plan to make this code just use the cn attribute of the certificate to authenticate. This can then be passed through a pg_ident.conf map to map to "real" username, in case the syntax is not identical. In a lot of cases it can probably be very useful to combine this with regexp entries in the ident maps which is another patch that's in the queue for this commitfest.

One thing I'm unsure about here is - will it be enough to be able to use the cn attribute for authentication, or will it be required to use other attributes as well? How do the enterprise PKI solutions that you'd use this together with work?

Conferences

I speak at and organize conferences around Open Source in general and PostgreSQL in particular.

Upcoming

Past

PGConf.DEV 2024
May 28-31, 2024
Vancouver, Canada
PGDay Chicago 2024
Apr 26, 2024
Chicago, USA
SCaLE 2024
Mar 14-17, 2024
Pasadena, USA
Nordic PGDay 2024
Mar 12, 2024
Oslo, Norway
FOSDEM PGDay 2024
Feb 2-4, 2024
Brussels, Belgium
More past conferences