Can't beleive nobody blogged this yet...

The conference has pretty much kicked off, with the EnterpriseDB dinner cruise. Loads of interesting people to talk to... So far, this is shaping up to be a very good conference. Now let's see how it is tomorrow when the actual talks start - but my hopes are high, and I fully expect them to be fulfilled...

Building the PostgreSQL docs on Windows

PostgreSQL documentation is in DocBook format. In my experience, this is far from trivial to get working on any platform where it isn't installed by the default packages - and even worse if it's installed by default but in the wrong version.

Turns out on Windows it's both better and worse - it's definitly harder than a pre-installed system, but a lot easier than one with the wrong version installed. Here are some step-by-step instructions for how to do it in a way that uses only pre-made binary packages of the required tools. It's a manual build not integrated with the pg Makefile, but it's a start... First of all, decide on a root for the docbook installation. In my case, I chose f:\docbook. Download and uncompress the following files:

  • ActiveState Perl, installed anywhere and present in the PATH
  • OpenJade 1.3.1-2 binaries, uncompress to f:\docbook\openjade-1.3.1.
  • DocBook DTD 4.2, uncompress to f:\docbook\docbook.
  • DocBook DSSSL 1.79, uncompress to f:\docbook\docbook-dsssl-1.79.
  • ISO character entities, uncompress to f:\docbook\docbook.

With this done, some files in the docbook directory need renaming. Create a .bat file named r.bat (or whatever, really) in the docbook\docbook directory that looks like this:

@echo off
for %25%25f in (ISO*) do (
set foo=%25%25f
ren !foo! !foo:~0,3!-!foo:~3!.gml
)

And execute this file using:

cmd /v /c r.bat

That done, step into your PostgreSQL installation tree in doc\src\sgml. Run the following commands (beware of linebreaks!):

set SGML_CATALOG_FILES=f:\docbook\openjade-1.3.1\dsssl\catalog;f:\docbook\docbook\docbook.cat

perl f:\docbook\docbook-dsssl-1.79\bin\collateindex.pl -f -g -o bookindex.sgml -N

perl mk_feature_tables.pl YES ..\..\..\src\backend\catalog\sql_feature_packages.txt ..\..\..\src\backend\catalog\sql_features.txt > features-supported.sgml

perl mk_feature_tables.pl NO ..\..\..\src\backend\catalog\sql_feature_packages.txt ..\..\..\src\backend\catalog\sql_features.txt > features-unsupported.sgml

f:\docbook\openjade-1.3.1\bin\openjade -V draft-mode -wall -wno-unused-param -wno-empty -D . -c f:\docbook\docbook-dsssl-1.79\catalog -d stylesheet.dsl -i output-html -t sgml postgres.sgml

perl f:\docbook\docbook-dsssl-1.79\bin\collateindex.pl -f -g -i 'bookindex' -o bookindex.sgml HTML.index

f:\docbook\openjade-1.3.1\bin\openjade -V draft-mode -wall -wno-unused-param -wno-empty -D . -c f:\docbook\docbook-dsssl-1.79\catalog -d stylesheet.dsl -i output-html -t sgml postgres.sgml

Yes, it runs the build twice. That's to generate the indexes. If you have changed your docs, you usually don't need to re-run that part - just run the final openjade call over and over again.

This is all run outside of msys, with no requirements at all on either mingw or cygwin.

Not so active

Not been doing too much PostgreSQL related stuff lately (thus no blog posts). Couple of reasons - lots to do at work before summer, and since it's now sailing season no (or at least a lot less) PostgreSQL in the weekends.

Quick status report on my current pg work: The win32-build-with-visual-c++ initial code patch has been applied. Still to do is a couple of minor things (there's a tiny patch for pl/perl for example), and then the actual build scripts. The build scripts work, but they are not cleaned up all the way yet. I also need to finish porting over things like gen_fmgroids that will be needed to be able to build from CVS.

Finally, I need to do something around the regression tests. Haven't decided yet if it might be easiest to just run the regression tests under msys against a VC++ built backend, or if I need to get them running outside it yet. And it does have to integrate with the buildfarm some way or other... (Currently running parts from msys and parts outside, and the backend does pass all the regression tests, but it's not trivial to run them)

But hopefully it'll all be worth it - we'd finally get things like proper debugger support on Windows! (And faster buildtimes, and judging by the results of other projects slightly better actual performance as well)

Right now I'm not particularly disappointed that I didn't get a slot at the conference, consdering that'd give me an even tighter schedule right now. Now I get to just listen and pick on others ;-)

And finally, happy midsummer to all those who celebrate it - and of course to those that don't as well.

Commandline editing in psql on win32

Unix has readline. Or libedit. At least most of the time. Win32 has a much less capable commandline editor, but at least it has something. Unfortunatly, it's not something most people know about. There are a couple of nice features there that will help you out if you like using the commandline to work with your database. It does have the basic commandline editing and history functions, and also some (reasonably simple) macro handling.

First one of the main limitations - settings and data aren't persisted across sessions. Actually, they do survive a psql restart as long as it's restarted under the same cmd.exe, but if cmd is restarted, all is lost. This includes both history data, and macros. Very unfortunate, but you can learn to live with it...

Working with the commandline history is quite simple. The basic keys to use it are the usual: * Up/Down arrow - move to previous/next command * PgUp/PgDn - move to first/last command in current session * left/right/home/end - move around on the current line * ctrl-left/ctrl-right - move word left and right

There are also a couple of special keys to copy/paste parts of rows (same ones as on the commandprompt - which really goes for all these). Another very handy key is F7, which will bring up a list of all the commands in the history buffer in a popup window, where you can pick a command to paste in at the current line.

psql history screenshot

Finally, there is the macro functionality. Unfortunatly, your macros will be lost at restart of cmd, so if you just use the Start Menu icon for psql, you can't really use them. You can, however, create your own BAT file that will first register a bunch of macros, and then start psql. For the most basic form of macros, just run:

doskey /exename=psql.exe pgd=SELECT datname FROM pg_database;

To use this macro, simply type in the macro and hit enter. The macro has to be at the beginning of a line.

postgres=# pgd
datname
-----------
postgres
template1
template0
(3 rows)

You can also pass parameters to the macros, for example:

doskey /exename=psql.exe dbinfo=SELECT datname,pg_encoding_to_char(encoding) FROM pg_database WHERE datname='$1';

And then just pass a parameter on the commandline:

postgres=# dbinfo postgres
datname | pg_encoding_to_char
----------+---------------------
postgres | SQL_ASCII
(1 row)

Those of you who do a lot of hackery with the Windows commandprompt will recognise this as standard doskey macros and keys. Those who aren't might be interested in reference from Microsoft.

GIN hits cvs

Trying to be the first to get this up on the planet ;-) Teodor has just committed a first version of GIN (Generalized Inverted iNdexes) to CVS HEAD. This promises even better performance and functionality for many things that previously used GiST, specifically fulltext-search! Yay! Great job, guys!

For much more information, see discussion on -hackers.

Replicating from MS SQL Server to PostgreSQL

Note: I'm sorry, all screenshots were lost in the planet crash. But I managed to recover the text, hopefully it helps someone.

There can be many reasons for wanting to replicate your data from a MS SQL Server installation to your PostgreSQL installation. For example, as a step in migration or to be able to use PostgreSQL features for data analysis while not having to touch existing clients working with MSSQL. For me in this case, I wanted to use tsearch2 to search some fulltext data, because the fulltext indexer in SQL Server really isn't very good.

It turns out that SQL Server ships with replication functionality that can solve this problem with relatively little pain (depending on your schema of course), providing full transactional replication. It's master/slave only, and SQL Server will be the master, but it's still quite useful. Here are the steps to do this for a simple example database - should work for more complex database as well of course. It expects you to set up a user named sqlrepl in the PostgreSQL database, that the replication system will use to connect with. Make sure that this user has permissions to connect from the SQL Server machine in pg_hba.conf.

  • Make sure you have the PostgreSQL ODBC drivers installed on the SQL Server machine (I'm using version 8.01.02).

  • Create the example databases:

  • In SQL Server:
            CREATE DATABASE origin
            go
            USE origin
            go
            CREATE TABLE tab1(
            id int identity not null primary key,
            t varchar(128) not null)
            INSERT INTO tab1 (t) VALUES ('Test 1')
            INSERT INTO tab1 (t) VALUES ('Test 2')
  • Then, in PostgreSQL:
            CREATE DATABASE slave OWNER sqlrepl;
            \connect slave
            CREATE TABLE tab1(id int not null primary key, t varchar(128) not null);
            ALTER TABLE tab1 OWNER TO sqlrepl;
  • Create a ODBC System Datasource on the SQL Server. This is done using the odbcad32 command. Make sure that you create a system datasource, and make sure you use the PostgreSQL ANSI driver (there are some problems with the UNICODE driver in the way SQL Server uses it)

  • Start SQL Server Enterprise Manager. Create a new linked server. This is done by right-clicking the Linked Servers node under Security and picking New Linked Server. Enter the name of the linked server (in all uppercase, in our case PGSLAVE), and pick the driver Microsoft OLE DB Provider for ODBC Drivers. Note that you should not pick the PostgreSQL ODBC driver here. Finally, enter the name of the ODBC datasource just created. Make sure the link works by clicking the Tables node and verify that you can see the tables of you database.

  • Configure the subscriber:

  • Right-click on the Replication node and select Configure publishing, subscribers, and distribution.
  • Select the tab Subscribers
  • Click New
  • Select OLE DB data source
  • Pick the linked server you created (PGSLAVE). Re-enter the login information.
  • Click OK and close all dialogs

  • Create the publication:

  • Right-click Publication under Replication and select New publication
  • Select your database, click Next
  • Select Transactional publication, click Next
  • Uncheck SQL Server 2000 and check Heterogeneous data sources, click Next
  • Click Article Defaults
  • Open the Snapshot tab
  • Change name conflicts to Keep existing table unchanged. In some cases it will work with drop and recreate, but I prefer creating the tables manually to make sure there is no mixup with datatypes and such (considering MSSQL doesn't really know about PostgreSQL datatypes)
  • Click OK
  • Put a checkbox on the tables to replicate (tab1 in this example). Click Next
  • You will get a warning about IDENTITY properties not being replicated to subscribers. This will happen if you have any IDENTITY columns in your table. In most cases, you can just ignore it. Click Next
  • Possibly modify description if you want to, click Next
  • Click next through the rest of the Wizard, and click Finish
  • Create the subscription:
  • Open properties for the publication
  • Open the Subscriptions tab
  • Click Push new
  • Select the created subscriber (PGSLAVE)
  • Click Next through the rest of the wizard, make sure to check the box for Start snapshot agent
  • Now sit back and watch your MSSQL data first being bulk-loaded into PostgreSQL, and then transactionally replicated (you will see a couple of seconds delay after a commit, same as when you replicate between two SQL Servers).

If you have more advanced needs (which you often do), you can use most other features of SQL Server replication, such as row and column filtering.

Npgsql and async notifications?

If this and this is what I think it is, that's the best news all week. I've really been waiting for this. (And no, I'm not talking about Franciscos surgery, if that's what you thought). I've still got a single perl program running in one project waiting for this one.

Need to get this blog on planetpostgresql.org so I get the info faster...

Linuxforum summary

Better late than never, almost a week since it was over, a quick summary of this years Linuxforum Denmark.

I have to say I'm very impressed. With an organisation of volounteers and at a very low cost compared to other conferences (such as OSCON which is definitly a lot higher budget), I think they managed to arrange a fantastic conference.

As a visitor, there were plenty of interesting talk, and this year the english language ones were nicely spread out so there was almost always one available. I don't really do Danish. Reading is fine, but listening to a technical talk - well, it's a bit too much of a challenge. I think the split of a business-oriented day (friday) and a tech day (saturday) is a smart thing to do. So I'll definitly recommend people who are around go go to this conference next year.

As a speaker the arrangement was very nice - not just "here's your room, be done", but all the things around it. I certainly felt very welcome, and I don't think there was a single point not being taken care of.

As for my own talk, I think it got to be a little bit too much in too little time - I should've cut a few more points out of it and been a bit more detailed about the ones left in. But hopefully it wasn't too bad.

So, big thanks to the organisers. And to all the interesting people I got to chat with - hosts and other speakers. A very interesting couple of days.

Oh, and yes, I got stopped in airport security. Not for the powerdrill they gave us (speakers) at the conference (very handy, btw!), but for the speakers badge. And yes, they were in the same bag...

At last - the password bug is found

There has been a very elusive bug around in the PostgreSQL client since 8.0 (at least) on win32. In some cases (until now unidentified) password authentication in psql stopped working, and just errored out with a message indicating the wrong password was used. And nobody has found the cause...

... until now, that is. Seems Robert Kinberg finally nailed it. At least, I firmly beleive that the bug he did identify is the one responsible for most of these reports.

Turns out that if you have a directory called \dev on the same drive you run psql from (normally c:, but that can differ), psql will attempt to use a file called /dev/tty. On Unix, this is common behaviour - we want to be sure we read the password from a tty. On windows, it just breaks - the first time, we will write the prompt into the file, and on later runs we will read back the prompt and use it as a password.

The fix is simple, just two lines changed and then two added #ifdefs. The fix is scheduled to go into 8.1.4 and 8.0.8. It's not actually approved and committed yet, but I see no reason why it won't be :-)

If you're having this problem today, check for dev directories in your root, and remove/rename/removepermissions that directory, and things should be back to normal.

So, big thanks to Robert for identifying this and suggesting the fix!

Monitoring PostgreSQL on Win32

A very common question from Win32 users is how to monitor Postgres the same way you can do with ps on Unix. There's a simple trick to it - don't use Task Manager, use Process Explorer from Sysinternals. So why is this. For comparision, here is a snapshot of my laptop with PostgreSQL running, and just a single client connection:

Task Manager screenshot

The same processes, as seen from Process Explorer:

Process Explorer screenshot

The most obvious difference - you can now see the process tree. You will see pg_ctl that is the service manager, that it started the postmaster, and that which subprocesses the postmaster started. You will also see the only second level postgres.exe, which is part of the stats system.

You will also notice in the Handle list, that there is an Event named pgident: .... This event contains the information you would find in the ps output on Unix and identifies the individual PostgreSQL process.

So - simply using a different tool gives you a lot more information. For 8.2 there will be some additional win32 management stuff, but more aobut that somewhere else.

Conferences

I speak at and organize conferences around Open Source in general and PostgreSQL in particular.

Upcoming

PGDay Chicago 2024
Apr 26, 2024
Chicago, USA
PGConf.DEV 2024
May 28-31, 2024
Vancouver, Canada

Past

SCaLE 2024
Mar 14-17, 2024
Pasadena, USA
Nordic PGDay 2024
Mar 12, 2024
Oslo, Norway
FOSDEM PGDay 2024
Feb 2-4, 2024
Brussels, Belgium
PGConf.EU 2023
Dec 12-15, 2023
Prague, Czechia
PGConf.NYC 2023
Oct 3-5, 2023
New York, USA
More past conferences