After giving my presentation at ConFoo this year, I had some discussions with a few people about the ability to put constraints on JSON data, and whether any of the advanced PostgreSQL constraints work for that. Or in short, can we get the benefits from both SQL and NoSQL at the same time?
My general response to questions like this when it comes to PostgreSQL is "if you think there's a chance it works, it probably does", and it turns out that applies in this case as well.
For things like UNIQUE keys and CHECK constraints it's fairly trivial, but there are also things like EXCLUSION constraints where there are some special constructs that need to be considered.
Other than the technical side of things, it's of course also a question of "should we do this". The more constraints that are added to the JSON data, the less "schemaless" it is. On the other hand, other databases that have schemaless/dynamic schema as their main selling points, but still require per-key indexes and constraints (unlike PostgreSQL where JSONB is actually schemaless even when indexed).
Anyway, back on topic. Keys and constraints on JSON data.
In PostgreSQL, keys and constraints can be defined on both regular columns and directly on any expression, as long as that expression is immutable (meaning that the output is only ever dependent on the input, and not on any outside state). And this functionality works very well with JSONB as well.
So let's start with a standard JSONB table:
postgres=# CREATE TABLE jsontable (j jsonb NOT NULL);
CREATE TABLE
postgres=# CREATE INDEX j_idx ON jsontable USING gin(j jsonb_path_ops);
CREATE INDEX
Of course, declaring a table like this is very seldom a good idea in reality - a single table with just a JSONB field. You probably know more about your data than that, so there will be other fields in the table than just the JSONB field. But this table will suffice for our example.
A standard gin index using jsonb_path_ops is how we get fully schemaless indexing in jsonb with maximum performance. We're not actually going to use this index in the examples below this time, but in real deployments it's likely one of the main reasons to use JSONB in the first place.
To illustrate the constraints, let's add some data representing some sort of bookings. Yes, this would be much better represented as relational, but for the sake of example we'll use JSON with a semi-fixed schema. We'll also use a uuid in the JSON data as some sort of key, as this is fairly common in these scenarios.
postgres=# INSERT INTO jsontable (j) VALUES ($${
"uuid": "4e9cf085-09a5-4b4f-bc99-bde2d2d51f41",
"start": "2015-03-08 10:00",
"end": "2015-03-08 11:00",
"title": "test"
}$$);
INSERT 0 1