Til Postgre Json Jsonb
sql postgre jsonSeveral days ago, I heard a joke:
You don’t need Mongo, PostgreSQL is enough.
So… let’s try it!
Scopes
Before we move forward, let’s set the scope of this test. In my experience, the use cases where I used Mongo are:
-
Store raw information coming from request/response, e.g. webhook, API call, etc.
-
Store personalized config with JSON structure.
So we will try to use these cases using json and jsonb column in PostgreSQL.
JSON and JSONB
There are two types for storing JSON format in PostgreSQL: json and jsonb.
The difference based on my understanding:
-
JSON: Store the JSON as is, no normalization, no restructuring. Just store it like plain text.
-
JSONB: Store the JSON after normalize process — like remove whitespaces, restructure keys, and store it as binary format.
So i think:
-
JSON is a good case for audit log or storing raw data.
-
JSONB is a perfect match for personal config.
Playing Around with JSON and JSONB
First we will playing aroung at json column, i have several dummy records like this
user_id | logs
---------+----------------------------------------------------------------
102 | { +
| "after": {"email": "[email protected]", "name": "Alice B."},+
| "before": {"name": "Alice", "email": "[email protected]"} +
| }
15 | { +
| "before": {"address": "No. 123"}, +
| "after": {"address": "No. 123B"} +
| }
We will just playing around to handle searching with value of the json column..
example=# select user_id, logs -> 'before' ->> 'email' as email_before, logs -> 'after' ->> 'email' as after_email from user_logs;
user_id | email_before | after_email
---------+------------------+-------------------
1 | [email protected] | [email protected]
2 | [email protected] | [email protected]
3 | |
4 | |
5 | [email protected] | [email protected]
6 | |
7 | [email protected] | [email protected]
8 | |
9 | [email protected] | [email protected]
10 | [email protected] | [email protected]
11 | [email protected] | [email protected]
12 | [email protected] | [email protected]
13 | |
14 | [email protected] | [email protected]
15 | |
102 | [email protected] | [email protected]
(16 rows)
For this kind of example i used a lot operator ‘->’ and ‘-»’. ‘->’ This operator read value as json object ‘-»’ This operator convert the value as string so we can filter as normal field. Let try with filter
select user_id, logs -> 'before' ->> 'email' as email_before, logs -> 'after' ->> 'email' as after_email from user_logs
where logs -> 'after' ->> 'email' = '[email protected]';
user_id | email_before | after_email
---------+--------------+--------------
1 | [email protected] | [email protected]
(1 row)
JSONB
Like I said before, in JSONB, before the value is stored in the column, it goes through some kind of normalization.
-- insert
INSERT INTO user_config
(user_id,
config)
VALUES (5,
'{ "theme": "dark", "language": "id", "status": "free_user" }');
-- result
user_id | config
---------+-----------------------------------------------------------------
1 | {"theme": "dark", "status": "premium_user", "language": "jp"}
2 | {"theme": "dark", "status": "premium_user", "language": "en"}
3 | {"theme": "dark", "status": "admin", "language": "id"}
4 | {"theme": "dark", "status": "premium_user", "language": "id"}
5 | {"theme": "solarized", "status": "free_user", "language": "id"}
See? Data on table little bit different with the query right?
Let’s play around with filters. Because in JSONB the data is stored in binary format, we can use the contains operator @>:
select * from user_config where config @> '{"theme":"dark"}' limit 5;
user_id | config
---------+---------------------------------------------------------------
1 | {"theme": "dark", "status": "premium_user", "language": "jp"}
2 | {"theme": "dark", "status": "premium_user", "language": "en"}
3 | {"theme": "dark", "status": "admin", "language": "id"}
4 | {"theme": "dark", "status": "premium_user", "language": "id"}
7 | {"theme": "dark", "status": "free_user", "language": "id"}
Or the exist operator ?:
example=# select * from user_config where config ? 'notification' limit 5;
user_id | config
---------+--------
(0 rows)
example=# select * from user_config where config ? 'language' limit 5;
user_id | config
---------+-----------------------------------------------------------------
1 | {"theme": "dark", "status": "premium_user", "language": "jp"}
2 | {"theme": "dark", "status": "premium_user", "language": "en"}
3 | {"theme": "dark", "status": "admin", "language": "id"}
4 | {"theme": "dark", "status": "premium_user", "language": "id"}
5 | {"theme": "solarized", "status": "free_user", "language": "id"}
(5 rows)
But that doesn’t mean we can’t use the previous operators (->, -»).
And the good part about JSONB: the column can be indexed. I have 100k records. Before I put the index:
explain analyze select user_id, config from user_config where config @> '{"theme":"dark"}';
Seq Scan on user_config
(cost=0.00..2574.50 rows=35199 width=68)
(actual time=0.027..46.364 rows=34805 loops=1)
Filter: (config @> '{"theme": "dark"}'::jsonb)
Rows Removed by Filter: 69395
Execution Time: 48.419 ms
Then I added the index:
CREATE INDEX idx_user_config_config_jsonb ON user_config USING gin (config);
The result is good:
explain analyze select user_id, config from user_config where config @> '{"theme":"dark"}';
Bitmap Heap Scan on user_config
(cost=263.71..1975.70 rows=35199 width=68)
(actual time=12.631..32.547 rows=34805 loops=1)
Recheck Cond: (config @> '{"theme": "dark"}'::jsonb)
Heap Blocks: exact=1272
-> Bitmap Index Scan on idx_user_config_config_jsonb
(cost=0.00..254.91 rows=35199 width=0)
(actual time=12.393..12.394 rows=34805 loops=1)
Execution Time: 34.529 ms
There are many things I still need to learn about this column, but for now, it’s enough for me to say: I think for my use case, I don’t need MongoDB.