Re: BUG #13484: Performance problem with logical decoding

From: olivier(dot)gosseaume(at)free(dot)fr
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #13484: Performance problem with logical decoding
Date: 2015-07-06 17:29:30
Message-ID: 1218457515.16345590.1436203770840.JavaMail.root@zimbra72-e12.priv.proxad.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Just after having submitted the bug report, i saw that when a transaction have 4096 or more operations (inserts for example), I see postgres generating files in pg_repslot/my_slot directory. Is that what you call "spilling to disk" ? BTW my hard disk is a SSD

To be more concise :
- ONE transaction with 4095 operations -> consumed in 80mS
- TEN transactions with 4095 operations each (so 40950 operations) -> 380mS (which as you said is very good -less than linear growth-)
- ONE transaction with 4096 operations -> consumed in 4204mS (ouch ...). I confirm there are only 4096 changes in one transaction
- TEN transactions with 4095 operations each (so 40950 operations) -> 34998mS, ouch again

I know the SQL interface is not the best thing but it matches my use case. Let me explain.

I'm currently evaluating postgres for my company (a quite big one) as a mixed JSON/SQL storage + notification system based on data change feeding a HTTP infrastructure. We use .NET client to communicate with database.

I've evaluated postgres against Oracle 12c (we have been using Oracle for 18 years), SQL Server, MongoDB (not ACID which is a real pain), OrientDB (promising but unreliable) and RethinkDB. The JSON implementation of postgres is in my opinion very very good, cleverly made and have good performance (much better than Oracle in many cases). I'm very impressed overall by everything i have tried in postgres.

I need some sort of change data capture mechanism (CDC) to detect changes on data (async is ok for me as long as i get very low latency). I could use triggers but performance would suffer. Reading the WAL log seems like it could be a good approach.

The current "simple" interface to consume a slot is thru this SQL interface, that's why i used it while prototyping things.

I also tried pg_recvlogical feeding a file, but got the exact same results so i'm stuck. What is observe is that the spilling occurs, and when the .snap file is created then pg_recvlogical will consume data but it does take a long time exactly the same time as pg_logical_slot_get_changes in fact.

BTW there is also a bug in pg_recvlogical with option -f - (output to stdout), pg_recvlogical tries to flush with fsync which does not work on windows and display an error message.

Best regards,
Olivier

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2015-07-06 20:41:38 Re: BUG #13484: Performance problem with logical decoding
Previous Message Heikki Linnakangas 2015-07-06 17:13:46 Re: PQexec() hangs on OOM