PostgreSQL vs 64-bit Windows

Several completely independent people have asked me this in email already, so I figured it's about time for a blog post on the topic of PostgreSQL on 64-bit versions of Windows.

First for the simple answer: No, there is no 64-bit version of PostgreSQL for Windows. PostgreSQL has supported 64-bit environments on Unix for many years (long before we had x64 to make it available for wintel machines), but there is no Win64 port. Yet. And given the way that PostgreSQL is developed, there is no firm date for when this will be available.

Until then, though, the 32-bit version of PostgreSQL runs perfectly fine on 64-bit Windows. And doing so may actually be a better idea in some cases than a native 64-bit version. Most importantly, the architecture of PostgreSQL makes the need for a native 64-bit version much smaller than for many competing databases. In particular:

  • PostgreSQL relies on the operating system to do much of the caching of data. The main limitation of a 32-bit process is the total amount of addressable memory, so a database engine that does all the caching itself cannot use all the memory on a system with, say, 16Gb of RAM. PostgreSQL leaves much of that caching to the operating system, which can address all of it as long as the OS is 64-bit (see the configuration sketch after this list).

  • PostgreSQL uses a multi-process architecture, not a multi-threaded one. I'm not going to get into the discussion about which is better; I'll just summarize by saying there are good and bad points to both. In this case it's a good thing. In a multi-threaded database server, all client connections share the memory space, and are again limited by the total addressable memory region. With PostgreSQL, you could easily have more than 1Gb per backend (if necessary) without running out of memory, which pushes back the point where you need to go 64-bit considerably.
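To make this concrete, here is a minimal postgresql.conf sketch of what such a setup might look like - the values are purely illustrative, not tuning advice:

# PostgreSQL's own cache stays small, well within the 32-bit address space
shared_buffers = 512MB
# A planner hint saying how much data the 64-bit OS can be expected to cache
effective_cache_size = 12GB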

So why would it ever be better to run a 32-bit version than a 64-bit one? Simply because a 64-bit version is more memory-hungry. Every pointer used in the system takes twice as much space, and many data structures grow with them. And if you're not actually going to be able to cash in on the benefits (by addressing loads of memory, for example), it's a very real cost.

I'm well aware that this is a fairly simplistic view on the 32- vs 64-bit situation in general, but it should be enough to explain why 32-bit PostgreSQL on 64-bit Windows isn't such a bad idea after all. And of course, this only applies if you are using an x64 version of Windows - with Itanium it's a whole different story.

Eight-point-three

Despite a couple of people pre-announcing it, PostgreSQL 8.3 was actually fully announced and released about 40 minutes ago. Announce email should be going out, and the press release is available. Instead of just reiterating the new features, I'll point you to the list of cool new stuff. Or check the release notes.

If you haven't tested your apps with it during beta or RC time, now is the perfect time to do it!

There, I just had to get this one in before Devrim did. Ha! I win this time!

Workaround for ident security issue

Some people, like Josh Berkus for example, claim that both trust and ident can cause security issues with dblink. Which is true - they can. But contrary to the blanket statements that they are insecure, there are ways around it.

Trust will always be insecure when dblink or similar functions are present. But there's a simple way to make ident not be insecure - just make sure that the postgres superuser does not have the same username as the OS user that runs the processes. In fact, make sure that there is no PostgreSQL user with the same username as the OS user.

initdb will make those userids the same by default, but you can change that using the -U parameter to initdb. In theory you can rename the superuser after the fact as well, but I haven't tried that so I don't know how safe it is...
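For illustration, a sketch of both variants - the superuser name dbadmin is made up, and as said, the rename variant is untested here:

-- at cluster creation time: initdb -U dbadmin -D <datadir>
-- or, in theory, after the fact (you can't rename the role you are
-- currently connected as, so connect as another superuser):
ALTER USER postgres RENAME TO dbadmin;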

Faking the dbo role

Long-time users of SQL Server will be familiar with the dbo user. This is a special user that's present in all databases, and represents the database owner - regardless of who that is. Another special thing about it is that all users of the server role System Administrators will be mapped to this user as well. A third special thing is that unlike all other objects, those owned by dbo don't need to be qualified with the owner (think schema) name before they are accessed.

Now, there are a lot of legacy applications out there that depend on this behavior. To make those work, you often have to deal with these three issues. There are obviously a lot of other things to deal with as well, but they are outside the topic of this post.

The first part is easy. Just GRANT the dbo role to the users as needed.
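For example, assuming the role is simply called dbo (the user names are made up):

CREATE ROLE dbo NOLOGIN;  -- the group role representing the database owner
GRANT dbo TO joe;
GRANT dbo TO anna;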

The third part is also easy - use the public schema instead. Make sure that the dbo role has permissions to write to this schema. This will work as long as the application doesn't schema-qualify the names. If it does, you'll need to create a schema called dbo and grant permissions on it to the dbo role.
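A sketch of both variants:

-- application doesn't schema-qualify: let dbo write to public
GRANT CREATE ON SCHEMA public TO dbo;
-- application qualifies with dbo: give it a real dbo schema
CREATE SCHEMA dbo AUTHORIZATION dbo;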

I've previously had problems with the second part, until I tried something that's actually fairly obvious once you think about it. But I never did before, so here goes. When user joe logs in, even if you grant him CREATE permissions on the public schema, anything he creates will end up owned by joe and not dbo. To get around this, just run the following command for all users:

ALTER USER joe SET ROLE dbo;

That way, whenever joe logs in, the current role will be automatically switched to dbo, and all objects created will be properly owned by dbo.

New security updates are out

I've just approved the website updates for new security updates across all branches. Depending on how you use your database, these updates can be anything from critical to not needed at all. You should go look at the announcement right away to see if you need to upgrade! (if the link is a 404, you need to wait for the mirrors to sync up, which should take less than an hour)

The same bugs are present in the 8.3 betas, and fixes will be included in 8.3RC1, due to be announced shortly.

Also, note that this is the EOL release for 7.3 - if you're still on 7.3, you need to upgrade now.

Database or schema

A couple of weeks back I helped a guy out with several things on a fairly large PostgreSQL installation. One of them prompted me to write this post. A fairly common question is "should I use databases or schemas to isolate my users". A fairly common answer is "schemas are the same as databases in other products", which isn't actually true (at least not for all of them). Schemas in PostgreSQL can be very flexible though - much more so than in say SQL Server, where a schema and a user are pretty much the same thing. But there are reasons for using databases, and reasons for using schemas. This was a textbook example of why to use schemas instead of databases.

The system they run has the database hosted on their server, and a bunch of clients of different kinds (Windows fat clients connecting directly, and mobile devices connecting through an appserver). The system was set up with one database for each client, all connecting using the postgres superuser account (oops, but that's a different story).

An interesting point in this was that apart from the dedicated information for each client, there was a shared address table with about 300Mb (IIRC) of data in it. This was 99.9% the same data between all clients, but each client had the ability to add extra rows to the table.

One of the things I ended up doing with them was to change the model to a single database shared by all clients, with one schema for each client. By giving each client their own account instead of using the superuser account, almost no changes were necessary in the client application - they would automatically see the tables in their own schema (since it's named the same as the account), but have no access to the data from other clients. There were several advantages gained by this (a sketch of the setup follows the list):

  • Less overhead, since there were far fewer databases.
  • The ability to do cross-schema queries when needed, to perform cross-client statistical calculations (not by the clients themselves, but from the provider)
  • Fewer databases to remember to back up ;)
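Here's a minimal sketch of the per-client setup (client1 is a made-up name). The key point is that the default search_path of "$user", public makes each account see its own schema first:

CREATE USER client1 PASSWORD 'secret';
CREATE SCHEMA client1 AUTHORIZATION client1;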

With that step done, the next step was very logical. We moved the address table to the public schema, and replaced it with a view in each of the client schemas. This view was a UNION ALL of the address table in the public schema and a table called address_local that lived in each client's schema. This table was populated with all the entries the clients had added to their local copy. A couple of RULEs on the view would redirect modifications of the view to hit the local table (since the clients only had SELECT permissions on the shared table, and they really shouldn't modify it). Later on we changed the application to write directly to the address_local table, purely for performance reasons. This brought in several more advantages (a sketch of the view setup follows the list):

  • There is now only one copy of this large, shared table in the database. That greatly increases the probability of it staying in the cache, as well as reduces the risk that a scan on this table will push other tables out of the cache.
  • The address data is updated occasionally from an external source. Now it only has to be updated once, which will affect all clients.
  • Backups are a lot smaller since the address data is only backed up once.
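For the curious, a rough sketch of what the view setup could look like - the column lists are simplified, and UPDATE and DELETE need similar rules:

CREATE VIEW client1.address AS
  SELECT * FROM public.address
  UNION ALL
  SELECT * FROM client1.address_local;

CREATE RULE address_ins AS ON INSERT TO client1.address
  DO INSTEAD INSERT INTO client1.address_local VALUES (NEW.*);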

These are all very good arguments for using schemas instead of databases. But what are the downsides?

  • You can't use pg_hba.conf to restrict database access. In this case that wasn't an issue, since they were all using the same account before. But it's a feature that will no longer work.
  • LISTEN/NOTIFY is per database, so the NOTIFY names must change to include the name of the client.
  • Restoring a single client can be somewhat more complex

I'm sure there are other arguments on both the positive and the negative side, but those are the ones we came across.

How's that for saving memory?

PostgreSQL on Windows has suffered from running out of "Desktop Heap" since day one. Things got worse with 8.3beta1, but Dave got that fixed for beta2. However, someone on the list actually managed to run into a different memory scalability problem - running out of address space in the postmaster.

So how could this happen? Because there were too many threads running. Hold on, you think, PostgreSQL isn't threaded?! Which is correct - PostgreSQL on Windows uses the same process-based model as it does on Unix. The reason for this is to make sure that the database works the same way on all platforms, which cuts down on platform-specific bugs and makes maintenance easier. However, a couple of win32-specific functions are implemented as separate threads. In the postmaster, for example, we start one thread for each child process spawned. The only purpose of this thread is to wait until the child process dies, and then fire off a SIGCHLD signal - since Windows doesn't provide this functionality built in like Unix does.

So with a lot of backends, this leaves us with a whole lot of threads running in the postmaster that don't really do anything. This method was originally chosen for 8.0 on the grounds that the code is trivial, and that the overhead of starting a backend process is so big that starting a secondary thread won't be noticed at all. Which is correct in all aspects except one - the thread will require 4Mb of address space for its stack. Most of this is never put into real memory (only the required initial 64Kb is), but it counts against the address space and the commit charge. This gives two possible resources that can run out - room in the pagefile (for the commit charge) and the address space in the process (a 32-bit Windows process can address at most 2Gb of memory by default). And at 4Mb per thread, you get up there fairly quickly - something nobody thought about at the time.

So for 8.3beta2, I have rewritten the handling of SIGCHLD from scratch. Instead of creating a thread for each backend, we now use an operating system thread pool to do the waiting, and then queue up the child process exits on an I/O completion port. Switching to an I/O completion port actually simplified and reduced the amount of code necessary pretty substantially, which is always nice.

So did this help? Some pgbench figures from Dave show:

Without patch:
VM: 1,322,792K
Idle threads: 6
Peak threads: 306

With patch:
VM: 98,088K
Idle threads: 3
Peak threads: 7

So yes, it makes a huge difference - and the numbers add up, since 306 threads at 4Mb of stack address space each account for roughly 1.2Gb of that virtual memory. Yay! Apart from fixing the address space issue, this should also decrease the pagefile requirements for any environment with a lot of backends. You're still going to see significantly worse performance than on Unix with lots of backends - shared memory access is still slow, etc. But it does represent a fairly big step forward.

Not even the fact that mingw is once again missing the required functions, thus causing breakage on the buildfarm, makes that difference smaller. And a patch is in the works to fix that...

Thanks to Trevor and Dave for helping me track this down and test the resulting patch.

Integrated Security in PostgreSQL 8.3

An often requested feature for users migrating from Microsoft SQL Server to PostgreSQL is being able to use what Microsoft calls "Integrated Security". In a Microsoft environment, this means automatically using your Windows account to log in to the database. It does not mean using the same password - the login has to be completely transparent. (I believe at least one of the other major databases can do this already, but most requests come from MSSQL users)

Prior to PostgreSQL 8.3, this was possible using Kerberos 5 authentication, but only if your PostgreSQL server was running on Unix (clients could be Unix or Windows). PostgreSQL 8.3 brings this to the native Windows version, and also makes things a lot easier to configure and use in a pure Windows environment. The core of this functionality is the addition of GSSAPI and SSPI authentication. GSSAPI is an open standard for authentication that is typically used with Kerberos. SSPI is a Microsoft implementation of it, which is wire-compatible with GSSAPI (it's not API-compatible, unfortunately). Native Kerberos 5 will still be supported in 8.3 for backwards compatibility, but it's deprecated in favor of GSSAPI for all installations that don't need to be compatible with 8.2 or earlier.

PostgreSQL 8.3 adds support for both these authentication methods. GSSAPI is supported on both Unix and Windows, and SSPI is supported only on Windows. The Windows build will always build with SSPI support. GSSAPI support requires the MIT Kerberos for Windows package to build (the same library used for the krb5 authentication used in pre-8.3 versions). The binary distribution will be shipped with GSSAPI enabled, but if you do your own build from source it's off by default.

There are a couple of ways to use this:

To achieve the same functionality as in pre-8.3, but using GSSAPI instead, configure the Unix server exactly the same way as for Kerberos authentication. In pg_hba.conf, specify the authentication method gss instead of krb5. Unix clients with GSSAPI support and Windows clients with GSSAPI or SSPI support will then be able to authenticate. Note that a Windows SSPI client will be able to authenticate to the Unix server without needing the GSSAPI library.
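As a sketch, the pg_hba.conf entry could look something like this (the address range is just an example):

# TYPE  DATABASE  USER  CIDR-ADDRESS  METHOD
host    all       all   10.0.0.0/16   gss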

To run the server on Windows and be able to authenticate to it from both Windows and Unix (or just Unix), configure things the same way as above. This setup requires the GSSAPI library to be linked in on the server.

To run the server on Windows with all clients on Windows, it's trivial to set up. In pg_hba.conf on the server, just specify sspi as the authentication method, and you're done. All users connecting from the local machine, your domain, or a trusted domain will be automatically authenticated using the SSPI-configured authentication (you can enable/disable things like NTLMv2 or LM using Group Policy - it's a Windows configuration, not a PostgreSQL one). You still need to create the login role in PostgreSQL, but that's it. Note that the domain is not verified at all, only the username. So the user Administrator in your primary domain and in a trusted domain will be considered the same user if they try to connect to PostgreSQL. Also note that this method is not compatible with Unix clients.
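Again as a sketch, with a made-up address range:

# TYPE  DATABASE  USER  CIDR-ADDRESS  METHOD
host    all       all   10.0.0.0/16   sspi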

In almost no case do you actually need to configure and use the MIT Kerberos library on the client. The only case when this is needed is if you specifically don't want to use the Active Directory login, but instead use a standalone Kerberos login based on MIT Kerberos. If you need this, specify the connection parameter gsslib=gssapi (or use the environment variable PGGSSLIB). Otherwise, the client will default to using the SSPI library.

Speeding up NOT IN

Sometimes, old tricks still work. I had a case the other day where I needed to insert into a table all rows "missing" compared to a different table - just the id value, letting all other columns get their default values. The query to do this the simple way is:

INSERT INTO table2 (itemid)
  SELECT id FROM table1
  WHERE id NOT IN (
    SELECT itemid FROM table2
  )

This would be fine, except there are a lot of rows in the tables. Specifically, table2 contains about 10 million and table1 contains 15 million. So the insert should be about 5 million rows. It's not huge, but it's enough to matter.

Left this query running for hours. And hours. And hours. The data fit fine in RAM, so it spent all its time at 100% CPU with no I/O. After 12+ hours it still hadn't finished, so it was time to try something else. I rewrote the query to:

INSERT INTO table2 (itemid)
  SELECT id FROM table1
  WHERE NOT EXISTS (
    SELECT * FROM table2 WHERE table2.itemid=table1.id
  )

But no luck. Terminated this one after 7 hours. The query plan for both of these was the same:

Seq Scan on table1  (cost=0.00..3902265020963.13 rows=7539834 width=4)
  Filter: (NOT (subplan))
  SubPlan
    ->  Seq Scan on table2 s2  (cost=0.00..258776.46 rows=1 width=38)
          Filter: (itemid = $0)

That plan runs the inner sequential scan over table2 once for every row in table1, which is why it never finishes. So it's time to bring out the dirty tricks - specifically, something that I recall at least old versions of Access used to do, back when they simply didn't do NOT IN subqueries (yes, I'm slightly embarrassed, but I did hack around in Access a bit back then when forced to). The new code is (it uses a temporary table, but that's not directly related to the speed change):

INSERT INTO loader (id)
  SELECT id FROM table1
  LEFT JOIN table2 ON table1.id=table2.itemid
  WHERE table2.itemid IS NULL

And wham. Query completes in less than 30 minutes. The INSERT into the actual table is very quick - single digit number of minutes, didn't get an actual measure for it. The new query plan is:

Hash Left Join  (cost=435127.62..2902887.99 rows=1 width=4)
  Hash Cond: (table1.id = table2.itemid)
  Filter: (table2.itemid IS NULL)
  ->  Seq Scan on table1  (cost=0.00..1880248.68 rows=15079668 width=4)
  ->  Hash  (cost=229150.50..229150.50 rows=11849450 width=4)
        ->  Seq Scan on table2  (cost=0.00..229150.50 rows=11849450 width=4)

For some reason the row estimates look different - but it's the same tables, and nothing happened in between. And either way - it's slightly hackish, but it worked. (One caveat: NOT IN and a LEFT JOIN anti-join treat NULLs differently, so the rewrite is only a drop-in replacement if the join column contains no NULLs.)

pgDay is over

After two days of great talks, pgday is over. The event also contained a lot of community networking, which to me was more valuable than the talks themselves. But that's mostly because I knew most of the people giving talks in English and what they had to say, and I didn't understand the ones in Italian. The translation wasn't good enough for me in most of the cases really, but I could tell the talks were interesting for those who fully understood them.

Today was sightseeing day in Florence with Dave, Simon, Greg, Josh, David, Andreas, Stefan and Susanne (and of course our hosts Gabriele and Andrea - and anybody else I forgot about). After a couple of days in there, we also ran into Ishikawa. A bit too hot for my taste really, but an interesting day. Florence is a nice city, and it mixes well with PostgreSQL conversations.

After dinner, we almost missed the train home. Actually, the train was replaced with a bus, which we almost missed. But it all worked out in the end, and now we're all (well, that's me, David, Josh, Stefan and Andreas) in David Fetter's room cheering him and Ads on while they are writing up the PostgreSQL Weekly News.

Pictures from this day and the previous ones are available on other people's blogs - Andreas's, for example.

On a closing note, Stefan just informed me that the benchmarks previously referred to are now posted.

And finally, a great big thanks to the people organizing this great event, and all the volunteers helping out with it. I'm impressed with the result! When I leave tomorrow, I'm certainly very happy having been here!
