JSONB columns in postgres are cool. You can add arbitrary, structured data to rows without needing to update your schema.
-- Example of the sort of thing you can do with JSONB columns
=> CREATE TABLE players ( id SERIAL PRIMARY KEY , data JSONB );
=> INSERT INTO players ( data ) VALUES ( '{"name": "PlayerA"}' );
=> UPDATE players SET data = jsonb_set ( data , '{health}' , '0.5' ) WHERE data ->> 'name' = 'PlayerA' ;
=> INSERT INTO players ( data ) VALUES ( '{"name": "PlayerB", "health": 0.4}' );
=> SELECT * FROM players ORDER BY ( data ->> 'health' ):: float ;
id | data
----+------------------------------------
2 | { "name" : "PlayerB" , "health" : 0 . 4 }
1 | { "name" : "PlayerA" , "health" : 0 . 5 }
( 2 rows )
Obviously, you’re going to want to index some of these JSON keys eventually. When you do, look out for unnecessary casts which can kill your query performance:
-- (BAD) Each column in the index is cast to a known type
=> CREATE INDEX rankings_query ON rankings ( (( data ->> 'competition' )), (( data ->> 'caught_cheating' ):: boolean ), (( data ->> 'ranking' ):: float ) );
=> EXPLAIN ANALYZE SELECT * FROM rankings WHERE data ->> 'competition' = 'doom' AND ( data ->> 'caught_cheating' ):: boolean = false ORDER BY ( data ->> 'ranking' ):: float ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Index Scan using rankings_query on rankings ( cost = 0 . 14 .. 8 . 17 rows = 1 width = 44 ) ( actual time = 0 . 018 .. 0 . 023 rows = 9 loops = 1 )
Index Cond : ((( data ->> 'competition' :: text ) = 'doom' :: text ) AND ((( data ->> 'caught_cheating' :: text )):: boolean = false ))
Filter : ( NOT (( data ->> 'caught_cheating' :: text )):: boolean ) -- <-- HERE
Planning time : 0 . 147 ms
Execution time : 0 . 039 ms
( 5 rows )
-- (GOOD) Each column in the index is left as type JSONB
=> CREATE INDEX rankings_query ON rankings ( (( data -> 'competition' )), (( data -> 'caught_cheating' )), (( data -> 'ranking' )) );
=> EXPLAIN ANALYZE SELECT * FROM rankings WHERE data -> 'competition' = '"doom"' AND data -> 'caught_cheating' = 'false' ORDER BY data -> 'ranking' ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Index Scan using rankings_query on rankings ( cost = 0 . 14 .. 8 . 16 rows = 1 width = 68 ) ( actual time = 0 . 019 .. 0 . 022 rows = 9 loops = 1 )
Index Cond : ((( data -> 'competition' :: text ) = '"doom"' :: jsonb ) AND (( data -> 'caught_cheating' :: text ) = 'false' :: jsonb ))
Planning time : 0 . 137 ms
Execution time : 0 . 038 ms
( 4 rows )
Here are two analyses on similar queries in a production database with a lot more data (note the “Execution time”):
-- Indexed by casting to a known type
=> explain analyze select count ( * ) from my_table where v ->> 'my_field1' = 'foo' and ( v ->> 'my_field2' ):: boolean = true ;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 108247 . 06 .. 108247 . 07 rows = 1 width = 8 ) ( actual time = 33289 . 910 .. 33289 . 911 rows = 1 loops = 1 )
-> Bitmap Heap Scan on my_table ( cost = 40025 . 16 .. 104347 . 37 rows = 1559876 width = 0 ) ( actual time = 1685 . 160 .. 32754 . 429 rows = 1559876 loops = 1 )
Recheck Cond : (( v ->> 'my_field1' :: text ) = 'foo' :: text )
Filter : (( v ->> 'my_field2' :: text )):: boolean
Heap Blocks : exact = 29176
-> Bitmap Index Scan on my_index ( cost = 0 . 00 .. 39635 . 19 rows = 1559876 width = 0 ) ( actual time = 1679 . 954 .. 1679 . 954 rows = 1559876 loops = 1 )
Index Cond : ((( v ->> 'my_field1' :: text ) = 'foo' :: text ) AND ((( v ->> 'my_field2' :: text )):: boolean = true ))
Planning time : 0 . 142 ms
Execution time : 33289 . 971 ms
( 9 rows )
-- Indexed by leaving as JSONB
=> explain analyze select count ( * ) from my_table where v -> 'my_field1' = '"foo"' and v -> 'my_field2' = 'true' ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 157 . 43 .. 157 . 44 rows = 1 width = 8 ) ( actual time = 1329 . 478 .. 1329 . 479 rows = 1 loops = 1 )
-> Bitmap Heap Scan on my_table ( cost = 4 . 83 .. 157 . 33 rows = 39 width = 0 ) ( actual time = 518 . 394 .. 1002 . 934 rows = 1559876 loops = 1 )
Recheck Cond : ((( v -> 'my_field1' :: text ) = '"foo"' :: jsonb ) AND (( v -> 'my_field2' :: text ) = 'true' :: jsonb ))
Heap Blocks : exact = 29176
-> Bitmap Index Scan on my_index ( cost = 0 . 00 .. 4 . 82 rows = 39 width = 0 ) ( actual time = 513 . 360 .. 513 . 360 rows = 1559876 loops = 1 )
Index Cond : ((( v -> 'my_field1' :: text ) = '"foo"' :: jsonb ) AND (( v -> 'my_field2' :: text ) = 'true' :: jsonb ))
Planning time : 0 . 218 ms
Execution time : 1329 . 513 ms
( 8 rows )