Re: Compressing the AFTER TRIGGER queue

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Compressing the AFTER TRIGGER queue
Date: 2011-08-02 12:09:11
Message-ID: CA+U5nMKH_btFYHiBM9EP1MY-cJZx5z14Q5zJ3S5Yt2Ry5EcV0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 2, 2011 at 12:28 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> On 1 August 2011 21:02, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> I would prefer an approach where we store the events in a buffer,
>> which gets added to the main event queue when it fills/block number
>> changes/etc. That way we can apply intelligence to the actual
>> compression format used, yet retain all required info in the queued
>> event. We might also be able to ensure the event queue is ordered more
>> naturally by block number, to ensure we have reasonable readahead
>> while re-traversing the event queue - its not just the size of the
>> event queue that is the problem here.
>>
>
> I tried a similar approach before
> (http://archives.postgresql.org/pgsql-hackers/2009-10/msg00464.php)
> but one of the problems was that, for user-defined triggers at least,
> it was thought that the order of firing for the triggers needed to
> match the row update order, which is difficult to achieve with a
> bitmap.

Luckily, I think what I said 2 years ago was correct.
http://archives.postgresql.org/pgsql-hackers/2009-10/msg01528.php

We just need a way to say that trigger execution order is not
important, which we know to be true for FK checks - or any STABLE
function.

I think that label is "STABLE" so we don't even need to invent new
function labels.

>> The cases we have problems with are the bulk cases, so we should be
>> specifically optimising for that. For example, if we are running a
>> COPY/INSERT SELECT or a seqscan DELETE then we will queue a trigger
>> for every tuple in a block.
>
> Agreed. That's what I tried to target with my code too - so it
> achieves its best compression with sequential data.

Maybe, but you're still storing 1 byte per row at least, often more.
Whereas 1 bit would be much better compression.

>  The best compression and flexibility in
>> that case is to store a bitmap since that will average out at about 1
>> bit per row, with variable length bitmaps. Which is about 8 times
>> better compression ratio than originally suggested, without any loss
>> of metadata.
>
> Yeah that's probably possible in specific cases, but I'm still not
> sure how to make it meet the full requirements of the after trigger
> queue.

I think you'd better explain what use case you are trying to optimise
for. It seems unlikely that you will come up with a compression scheme
that will fit all cases.

The only cases that seem a problem to me are
* bulk RI checks
* large writes on tables using trigger based replication
maybe you have others?

If you are trying to optimise trigger based replication, I would be
inclined to look at immediate execution triggers. If we insert into a
log table then that will only take effect if the transaction commits.
That way we would just bypass the after trigger queue entirely. I see
nothing in the SQL Standard that prevents that.

Bitmaps work better than the current suggestion for optimising bulk RI
checks. If you used immediate execution triggers you could optimise
bulk RI checks against small tables even further by just storing a
local hash table of already acquired row locks, again avoiding the
need to build up a huge after trigger event queue in the first place.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2011-08-02 12:18:57 Re: Hot standby and GiST page splits (was Re: WIP: Fast GiST index build)
Previous Message Heikki Linnakangas 2011-08-02 11:43:31 Re: Hot standby and GiST page splits (was Re: WIP: Fast GiST index build)