Re: Re: logical changeset generation v4 - Heikki's thoughts about the patch state

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Phil Sorber <phil(at)omniti(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: logical changeset generation v4 - Heikki's thoughts about the patch state
Date: 2013-01-28 11:31:17
Message-ID: 20130128113117.GA22401@awork2.anarazel.de

On 2013-01-28 11:59:52 +0200, Heikki Linnakangas wrote:
> On 24.01.2013 00:30, Andres Freund wrote:
> >Also, while the apply side surely isn't benchmarkable without any being
> >submitted, the changeset generation can very well be benchmarked.
> >
> >A very, very adhoc benchmark:
> > -c max_wal_senders=10
> > -c max_logical_slots=10 --disabled for anything but logical
> > -c wal_level=logical --hot_standby for anything but logical
> > -c checkpoint_segments=100
> > -c log_checkpoints=on
> > -c shared_buffers=512MB
> > -c autovacuum=on
> > -c log_min_messages=notice
> > -c log_line_prefix='[%p %t] '
> > -c wal_keep_segments=100
> > -c fsync=off
> > -c synchronous_commit=off
> >
> >pgbench -p 5440 -h /tmp -n -M prepared -c 16 -j 16 -T 30
> >
> >pgbench upstream:
> >tps: 22275.941409
> >space overhead: 0%
> >pgbench logical-submitted
> >tps: 16274.603046
> >space overhead: 2.1%
> >pgbench logical-HEAD (will submit updated version tomorrow or so):
> >tps: 20853.341551
> >space overhead: 2.3%
> >pgbench single plpgsql trigger (INSERT INTO log(data) VALUES(NEW::text))
> >tps: 14101.349535
> >space overhead: 369%
> >
> >Note that in the single trigger case nobody consumed the queue while the
> >logical version streamed the changes out and stored them to disk.
>
> That makes the space overhead comparison completely worthless, no? I would
> expect the trigger-based approach to generate roughly 100% more WAL, not
> close to 400%. As long as the queue is drained constantly, there should be
> no big difference in the disk space used, except for the WAL.

IMO it's a valid comparison, as all such queues can only be drained in a
rather imperfect manner. I think these days all solutions use multiple
(two) queue tables, switch between them, and truncate the non-active
one, as vacuuming them works far too unreliably.
And those tables have to be plain logged ones, so they matter in
checkpoints et al.
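
To make the two-table scheme concrete, here is a toy in-memory model of it.
This is only a sketch: the class and method names are hypothetical, Python
lists stand in for the queue tables, and it does not mirror the code of any
particular replication solution. It shows why queued rows keep occupying
space until the next switch-and-truncate cycle.

```python
class RotatingQueue:
    """Toy model of a two-table change queue (all names are hypothetical)."""

    def __init__(self):
        self.tables = ([], [])  # stand-ins for the two queue tables
        self.active = 0         # index of the table receiving new rows

    def enqueue(self, row):
        """A trigger would insert the captured change into the active table."""
        self.tables[self.active].append(row)

    def switch(self):
        """Send new writes to the other table, which must already be empty."""
        other = 1 - self.active
        if self.tables[other]:
            raise RuntimeError("inactive table has not been truncated yet")
        self.active = other

    def drain_inactive(self):
        """Consume the inactive table, then empty it in one step."""
        inactive = self.tables[1 - self.active]
        rows = list(inactive)
        inactive.clear()        # models TRUNCATE instead of DELETE + VACUUM
        return rows


q = RotatingQueue()
q.enqueue("change 1")
q.enqueue("change 2")
q.switch()                      # writers move on to the second table
q.enqueue("change 3")
first_batch = q.drain_inactive()
q.switch()
second_batch = q.drain_inactive()
```

In this model a row is only released when its whole table is emptied, so the
space used at any moment depends on how often the switch happens, not just on
how fast the consumer reads.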

> >Adding a default NOW() or similar to the tables immediately makes
> >logical decoding faster by a factor of about 3 in comparison to the
> >above trivial trigger.
>
> Hmm, is that because of the conversion to text? I believe slony also
> converts all the values to text in the trigger, because that's simple and
> flexible, but if we're trying to compare the performance of logical
> changeset generation vs. trigger-based replication in general, we should
> choose the most efficient trigger-based scheme to compare with. That means,
> don't convert to text. And write the trigger in C.

IMO it's basically impossible for the current queue-based solutions not
to convert to text, because they would otherwise need to queue all the
conversion information as well. And the test_decoding plugin also
converts everything to text, so that's a fair comparison from that
POV. In fact, the test_decoding plugin does noticeably more, as it also
outputs table, column and type names.

I agree on the C argument. I really doubt it's going to make that much
of a difference, but we should try it.
In my experience a plpgsql trigger that just does a straight conversion
via cast is still noticeably faster than any of the "real" replication
triggers out there though, so I wouldn't expect much there.
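
For reference, the throughput cost of each variant relative to upstream can
be worked out from the tps figures in the quoted benchmark. The snippet below
is just that arithmetic; the dictionary keys and the helper name are labels
chosen here for illustration.

```python
# TPS figures copied from the ad hoc benchmark quoted above.
UPSTREAM_TPS = 22275.941409
RUNS = {
    "logical-submitted": 16274.603046,
    "logical-HEAD": 20853.341551,
    "plpgsql trigger": 14101.349535,
}


def slowdown_percent(tps, baseline=UPSTREAM_TPS):
    """Throughput lost relative to the baseline, in percent."""
    return (1.0 - tps / baseline) * 100.0


slowdowns = {name: slowdown_percent(tps) for name, tps in RUNS.items()}
for name, pct in slowdowns.items():
    print(f"{name}: {pct:.1f}% below upstream")
```

That works out to roughly 27% for the submitted version, 6% for HEAD and 37%
for the single plpgsql trigger, all from one 30 second run each.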

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
