JSON decoding plugin

From: Euler Taveira <euler(at)timbira(dot)com(dot)br>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: JSON decoding plugin
Date: 2013-12-09 13:03:42
Message-ID: 52A5BFAE.1040209@timbira.com.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

A few months ago, it was proposed [1] that would be interested to have a
json output plugin for logical decoding. Here it is.

Each transaction is a JSON object that can contain xid (optional),
timestamp (optional), and change array. Each change's element is a
command that was decoded and it can contains: kind (I/U/D), schema
(optional), table, columnnames, columntypes (optional), columnvalues,
and oldkeys (only for U/D). columnnames, columntypes and columnvalues
are arrays. oldkeys is an object that contains the following arrays:
keynames, keytypes (optional), and keyvalues.

The JSON objects are serialized if you are decoding a serie of
transactions. Here is an output example:

{
"xid": 702,
"change": [
{
"kind": "insert",
"schema": "public",
"table": "foo",
"columnnames": ["a", "b", "c"],
"columntypes": ["int4", "int4", "text"],
"columnvalues": [1, 2, "test"]
}
,{
"kind": "update",
"schema": "public",
"table": "foo",
"columnnames": ["a", "b", "c"],
"columntypes": ["int4", "int4", "text"],
"columnvalues": [1, 2, "test2"],
"oldkeys": {
"keynames": ["a", "b"],
"keytypes": ["int4", "int4"],
"keyvalues": [1, 2]
}
}
]
}
{
"xid": 703,
"change": [
{
"kind": "update",
"schema": "public",
"table": "foo",
"columnnames": ["a", "b", "c"],
"columntypes": ["int4", "int4", "text"],
"columnvalues": [1, 3, "test2"],
"oldkeys": {
"keynames": ["a", "b"],
"keytypes": ["int4", "int4"],
"keyvalues": [1, 2]
}
}
]
}
{
"xid": 704,
"change": [
{
"kind": "delete",
"schema": "public",
"table": "foo",
"oldkeys": {
"keynames": ["a", "b"],
"keytypes": ["int4", "int4"],
"keyvalues": [1, 3]
}
}
]
}

Some data types was adapted to conform with JSON spec. NAN and Infinity
are not valid JSON symbols so their representation is NULL (as some JSON
implementations). Due to JSON datatype simplicity, I represent the vast
majority of Postgres datatypes as string (However, I admit that we could
mimic the json datatype conversion rules).

The oldkeys treatment follows what was defined by the commit [2]. It uses:

(i) primary key (default behavior);
(ii) unique index (if REPLICA IDENTITY USING INDEX is defined for table);
(iii) full tuple (if REPLICA IDENTITY FULL is defined for table);
(iv) nothing means an error (if REPLICA IDENTITY NOTHING is defined for
table).

The TOAST columns have a special treatment for UPDATEs. If a tuple that
contains a TOAST field is updated, the TOAST field is included iif it is
changed too. It means that unchanged TOAST field are omitted from
columns* arrays. This means less overhead while transmitting,
processing and applying changes.

By design, (i) output plugin doesn't know about aborted transactions and
(ii) subtransactions are reordered into a toplevel transaction and only
the committed pieces are passed to the plugin.

You can test it firing the regression tests (e.g. 'make test') or using
the following steps?

postgresql.conf:
wal_level = logical
max_wal_senders = 2
max_logical_slots = 2

start collecting WAL records:

$ pg_recvlogical --slot=foo -d euler -f /dev/stdout
--plugin=json_decoding_plugin --init

[execute some transactions]

start printing decoded transactions:

$ pg_recvlogical --slot=foo -d euler -f /dev/stdout --start

stop collecting WAL records:

$ pg_recvlogical --slot=foo -d euler -f /dev/stdout --stop

Comments?

[1]
http://www.postgresql.org/message-id/20131015144848.GC8001@awork2.anarazel.de
[2]
http://www.postgresql.org/message-id/E1VepyF-0005gU-Tv@gemulon.postgresql.org

--
Euler Taveira Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Attachment Content-Type Size
json-plugin-0.6.tgz application/x-compressed-tar 45.1 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message MauMau 2013-12-09 13:05:38 Re: Recovery to backup point
Previous Message MauMau 2013-12-09 12:22:52 Re: [bug fix] pg_ctl always uses the same event source