Viewing entries tagged with indexing. Return to full view.

JSON field constraints

After giving my presentation at ConFoo this year, I had some discussions with a few people about the ability to put constraints on JSON data, and whether any of the advanced PostgreSQL constraints work for that. Or in short, can we get the benefits from both SQL and NoSQL at the same time?

My general response to questions like this when it comes to PostgreSQL is "if you think there's a chance it works, it probably does", and it turns out that applies in this case as well.

For things like UNIQUE keys and CHECK constraints it's fairly trivial, but there are also things like EXCLUSION constraints where there are some special constructs that need to be considered.

Other than the technical side of things, it's of course also a question of "should we do this". The more constraints that are added to the JSON data, the less "schemaless" it is. On the other hand, other databases that have schemaless/dynamic schema as their main selling points, but still require per-key indexes and constraints (unlike PostgreSQL where JSONB is actually schemaless even when indexed).

Anyway, back on topic. Keys and constraints on JSON data.

In PostgreSQL, keys and constraints can be defined on both regular columns and directly on any expression, as long as that expression is immutable (meaning that the output is only ever dependent on the input, and not on any outside state). And this functionality works very well with JSONB as well.

So let's start with a standard JSONB table:

postgres=# CREATE TABLE jsontable (j jsonb NOT NULL);
CREATE TABLE

postgres=# CREATE INDEX j_idx ON jsontable USING gin(j jsonb_path_ops);
CREATE INDEX

Of course, declaring a table like this is very seldom a good idea in reality - a single table with just a JSONB field. You probably know more about your data than that, so there will be other fields in the table than just the JSONB field. But this table will suffice for our example.

A standard gin index using jsonb_path_ops is how we get fully schemaless indexing in jsonb with maximum performance. We're not actually going to use this index in the examples below this time, but in real deployments it's likely one of the main reasons to use JSONB in the first place.

To illustrate the constraints, let's add some data representing some sort of bookings. Yes, this would be much better represented as relational, but for the sake of example we'll use JSON with a semi-fixed schema. We'll also use a uuid in the JSON data as some sort of key, as this is fairly common in these scenarios.

postgres=# INSERT INTO jsontable (j) VALUES ($${
  "uuid": "4e9cf085-09a5-4b4f-bc99-bde2d2d51f41",
  "start": "2015-03-08 10:00",
  "end": "2015-03-08 11:00",
  "title": "test"
}$$);
INSERT 0 1

Continue reading

Conferences

I speak at and organize conferences around Open Source in general and PostgreSQL in particular.

Upcoming

PGDay Chicago 2024
Apr 26, 2024
Chicago, USA
PGConf.DEV 2024
May 28-31, 2024
Vancouver, Canada

Past

SCaLE 2024
Mar 14-17, 2024
Pasadena, USA
Nordic PGDay 2024
Mar 12, 2024
Oslo, Norway
FOSDEM PGDay 2024
Feb 2-4, 2024
Brussels, Belgium
PGConf.EU 2023
Dec 12-15, 2023
Prague, Czechia
PGConf.NYC 2023
Oct 3-5, 2023
New York, USA
More past conferences