Repartitioning with logical replication in PostgreSQL 13

Posted on Jun 10, 2020 at 19:43. Tags: postgresql, replication, partitioning.

So, you have a partitioned table. And you want to change your mind. Re-partitioning is "easy" if you can take downtime -- just create a new table with a new name and copy all the data over. But what if we want to try to do it without downtime? Logical replication enhancements in PostgreSQL 13 brings us some new options for this!

But first a disclaimer -- this is definitely not pretty! And does not take into consideration things like foreign keys and similar. But sometimes a quick hack can be the best hack.

So let's go!

For demo purposes, first we create a silly table, partitioned by date, with one partition for each day, in our postgres database, and fill it with some not-so-random data:

CREATE TABLE silly (
  t timestamptz NOT NULL,
  too_lazy_for_schema jsonb NOT NULL DEFAULT '{}'
)
PARTITION BY RANGE(t);

CREATE TABLE silly_20200610 PARTITION OF silly
  FOR VALUES FROM ('2020-06-10') TO ('2020-06-11');
CREATE TABLE silly_20200611 PARTITION OF silly
  FOR VALUES FROM ('2020-06-11') TO ('2020-06-12');
CREATE TABLE silly_20200612 PARTITION OF silly
  FOR VALUES FROM ('2020-06-12') TO ('2020-06-13');
CREATE TABLE silly_20200613 PARTITION OF silly
 FOR VALUES FROM ('2020-06-13') TO ('2020-06-14');
CREATE TABLE silly_20200614 PARTITION OF silly
 FOR VALUES FROM ('2020-06-14') TO ('2020-06-15');
CREATE TABLE silly_20200615 PARTITION OF silly
 FOR VALUES FROM ('2020-06-15') TO ('2020-06-16');

INSERT INTO silly (t)
 SELECT * FROM generate_series('2020-06-10 00:00'::timestamptz, '2020-06-15 23:00', '1 hour');

So far, but at some point we realize that it's silly to partition this by day, and instead we want to partition by week. So we create a new table to get the data over in:

CREATE TABLE newsilly (
  t timestamptz NOT NULL,
  too_lazy_for_schema jsonb NOT NULL DEFAULT '{}'
)
PARTITION BY RANGE(t);

CREATE TABLE newsilly_20200608 PARTITION OF newsilly
  FOR VALUES FROM ('2020-06-08') TO ('2020-06-15');
CREATE TABLE newsilly_20200615 PARTITION OF newsilly
  FOR VALUES FROM ('2020-06-15') TO ('2020-06-22');

At this point, the easiest way to get the data over to the new table is to just

INSERT INTO newsilly SELECT * FROM silly;

but that of course falls over if the tables are actually large.

So, logical replication to the rescue, right? Let's replicate this from the old to the new table!

postgres=# CREATE PUBLICATION silly FOR TABLE silly;
ERROR:  "silly" is a partitioned table
DETAIL:  Adding partitioned tables to publications is not supported.
HINT:  You can add the table partitions individually.

Oh, that's a bummer. Well, it turns out that PostgreSQL 13 changes this! So if I just sneakily swap out my PostgreSQL 12 for PostgreSQL 13, that command now succeeds:

postgres=# CREATE PUBLICATION silly FOR TABLE silly
postgres-#  WITH (publish_via_partition_root=true);
CREATE PUBLICATION

Notice the specific parameter publish_via_partition_root. If this parameter is set to true, as in this example, all updates will be published as if they came from silly. If it is set to false (the default), then the changes will look like they're coming from the individual partitions like silly_20200610 which will ruin our attempts at using this to re-partition.

Logical replication unfortunately requires us to replicate into a table with exactly the same name (fully qualified, so this includes the schema), so we cannot just replicate this into newsilly. Instead, we create a new database to replicate it into (this can be in a separate cluster, or it can be in the same cluster. If it's in the same cluster, care needs to be taken to manually create replication slots that are used, as the automatic creation will not work). In this new database, we create the target table that we want:

postgres=# CREATE DATABASE silly;
CREATE DATABASE
postgres=# \c silly
You are now connected to database "silly" as user "mha".
silly=# CREATE TABLE newsilly (
silly(#   t timestamptz NOT NULL,
silly(#   too_lazy_for_schema jsonb NOT NULL DEFAULT '{}'
silly(# )
silly-# PARTITION BY RANGE(t);
CREATE TABLE
silly=# CREATE TABLE newsilly_20200608 PARTITION OF newsilly FOR VALUES FROM ('2020-06-08') TO ('2020-06-15');
CREATE TABLE
silly=# CREATE TABLE newsilly_20200615 PARTITION OF newsilly FOR VALUES FROM ('2020-06-15') TO ('2020-06-22');
CREATE TABLE

And let's try to replicate into it:

silly=# CREATE SUBSCRIPTION silly CONNECTION 'port=5500 dbname=postgres'
silly-#   PUBLICATION silly;
ERROR:  relation "public.silly" does not exist

Oh, oops. There was that thing about names. OK, so let's just rename the table on the receiving side to match, and try again:

silly=# ALTER TABLE newsilly RENAME TO silly;
ALTER TABLE
silly=# CREATE SUBSCRIPTION silly CONNECTION 'port=5500 dbname=postgres'
silly-#   PUBLICATION silly;
CREATE SUBSCRIPTION
silly=# select count(*) from silly;
   144

Cool, that worked. But of course, now we are in the wrong database, and in a table with the wrong name.

The interesting thing to note here is that while we renamed the table, the individual partitions retain their previous name. Thus we can now set up a replication in the other direction, and this time use the default mode of replicating the individual partitions:

silly=# CREATE PUBLICATION newsilly
silly-#   FOR TABLE newsilly_20200608, newsilly_20200615;
CREATE PUBLICATION

And then over on our original database, we can subscribe to this publication, which will now subscribe the individual partitions:

postgres=# CREATE SUBSCRIPTION newsilly CONNECTION 'port=5500 dbname=silly'
postgres-#   PUBLICATION newsilly;
CREATE SUBSCRIPTION

And as if by magic:

postgres=# select count(*) from newsilly;
   144

And if we now try to add a new row:

postgres=# SELECT * FROM newsilly WHERE t='2020-06-10 19:32:00';

postgres=# INSERT INTO silly (t) VALUES ('2020-06-10 19:32:00');
INSERT 0 1
postgres=# SELECT * FROM newsilly WHERE t='2020-06-10 19:32:00';
 2020-06-10 19:32:00+02 | {}

So what happened to this row was:

Inserted to silly in database postgres
Partitioning code routed this into the partition silly_20200610
Logical replication publication published this as an insert in silly
Logical replication subscription reinserted this row into silly in database silly
Partitioning code in silly routed this into the partition newsilly_20200608
Logical replication publication published this as an insert in newsilly_20200608
Logical replication subscription reinserted this row into newsilly_20200608 in database postgres
Querying newsilly in postgres found this row by looking into the partition newsilly_20200608.

How's that for abusing a feature to solve a different problem!

Comments

Nice, and an excellent way to demonstrate the refinement in pg13

Posted on Jun 11, 2020 at 20:40 by glyn.

why not use triggers?

Posted on Jun 23, 2020 at 21:13 by Eshk.

Triggers are definitely also an option when you're doing it locally.

Posted on Jun 26, 2020 at 17:07 by Magnus.

Add comment

New comments can no longer be posted on this entry.