From: Filip Rembiałkowski <filip(dot)rembialkowski(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: bad COPY performance with NOTIFY in a trigger
Date: 2016-02-06 12:03:53
Message-ID: CAP_rwwk7revDKQVyP--e4ph9fhGkzvJ8ZSBoN+hctzFkL7VL6Q@mail.gmail.com
Lists: pgsql-performance
Thanks for the feedback.
This patch is my first and obvious approach.
@Merlin, I'm not sure if I get your idea:
- keep the previous behaviour as obligatory? (that is, automatic
de-duplication of incoming messages by channel+payload)
- instead of the trivial search (a sequential scan), use some kind of
faster lookup?
I'm not sure if this statement in async.c is carved in stone:
* Duplicate notifications from the same transaction are sent out as one
* notification only. This is done to save work when for example a trigger
* on a 2 million row table fires a notification for each row that has been
* changed. If the application needs to receive every single notification
* that has been sent, it can easily add some unique string into the extra
* payload parameter.
1) "work-saving" is disputable in some cases
2) an idea to "add some unique string" is OK logical-wise but it's not
OK performance-wise.
The current duplicate check is a sequential search over the transaction's
pending notifications, done once per NOTIFY, so with N notifications the
total cost grows quadratically:
https://github.com/filiprem/postgres/blob/master/src/backend/commands/async.c#L2139
I'm not smart enough to devise an algorithm for faster lookups -
probably you guys can give some advice.
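
To illustrate what I mean by "faster lookups", here is a minimal,
self-contained C sketch - not a patch, and every name in it is
hypothetical: keep a hash set of channel+payload keys seen in the
current transaction, so the duplicate test is a constant-time
membership check instead of a walk over the whole pending list. In the
backend the same idea could presumably be built on the existing
dynahash machinery rather than this toy fixed-size table.

/*
 * Sketch only - not async.c code; all names here are hypothetical.
 * A tiny open-addressing hash set keyed on channel+payload, so the
 * "have we already queued this notification in this transaction?"
 * test is O(1) instead of a scan over the whole pending list.
 * A real implementation would grow the table instead of fixing its size.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SEEN_TABLE_SIZE 4096            /* power of two, fixed for the sketch */

static char *seen[SEEN_TABLE_SIZE];

/* FNV-1a string hash */
static uint32_t
fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++)
        h = (h ^ (uint8_t) *s) * 16777619u;
    return h;
}

/*
 * Return true if (channel, payload) was already queued in this transaction,
 * otherwise remember it and return false.  Linear probing, no deletions,
 * which matches the append-only usage within a transaction.
 */
static bool
seen_before(const char *channel, const char *payload)
{
    size_t      keylen = strlen(channel) + 1 + strlen(payload) + 1;
    char       *key = malloc(keylen);
    uint32_t    idx;

    snprintf(key, keylen, "%s\n%s", channel, payload);
    idx = fnv1a(key) & (SEEN_TABLE_SIZE - 1);

    while (seen[idx] != NULL)
    {
        if (strcmp(seen[idx], key) == 0)
        {
            free(key);
            return true;                /* duplicate - do not queue it again */
        }
        idx = (idx + 1) & (SEEN_TABLE_SIZE - 1);
    }
    seen[idx] = key;                    /* first occurrence - remember it */
    return false;
}

int
main(void)
{
    printf("%d\n", seen_before("test", "42"));      /* 0: first time */
    printf("%d\n", seen_before("test", "42"));      /* 1: duplicate */
    printf("%d\n", seen_before("test", "43"));      /* 0: different payload */
    return 0;
}

With something like this the de-duplicating path would stay roughly
linear in the number of notifications, instead of quadratic.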
Again, my rationale is this: the feature can burn a lot of CPU for
nothing. I was hoping to use NOTIFY/LISTEN as a superfast notification
mechanism - superfast regardless of whether you insert 100, 10k or 1M
rows.
On Fri, Feb 5, 2016 at 8:52 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Fri, Feb 5, 2016 at 9:33 AM, Filip Rembiałkowski
> <filip(dot)rembialkowski(at)gmail(dot)com> wrote:
>> patch submitted on -hackers list.
>> http://www.postgresql.org/message-id/CAP_rwwn2z0gPOn8GuQ3qDVS5+HgEcG2EzEOyiJtcA=vpDEhoCg@mail.gmail.com
>>
>> results after the patch:
>>
>> trigger= BEGIN RETURN NULL; END
>> rows=40000
>> 228ms COPY test.tab FROM '/tmp/test.dat'
>> 205ms COPY test.tab FROM '/tmp/test.dat'
>> rows=80000
>> 494ms COPY test.tab FROM '/tmp/test.dat'
>> 395ms COPY test.tab FROM '/tmp/test.dat'
>> rows=120000
>> 678ms COPY test.tab FROM '/tmp/test.dat'
>> 652ms COPY test.tab FROM '/tmp/test.dat'
>> rows=160000
>> 956ms COPY test.tab FROM '/tmp/test.dat'
>> 822ms COPY test.tab FROM '/tmp/test.dat'
>> rows=200000
>> 1184ms COPY test.tab FROM '/tmp/test.dat'
>> 1072ms COPY test.tab FROM '/tmp/test.dat'
>> trigger= BEGIN PERFORM pg_notify('test',NEW.id::text); RETURN NULL; END
>> rows=40000
>> 440ms COPY test.tab FROM '/tmp/test.dat'
>> 406ms COPY test.tab FROM '/tmp/test.dat'
>> rows=80000
>> 887ms COPY test.tab FROM '/tmp/test.dat'
>> 769ms COPY test.tab FROM '/tmp/test.dat'
>> rows=120000
>> 1346ms COPY test.tab FROM '/tmp/test.dat'
>> 1171ms COPY test.tab FROM '/tmp/test.dat'
>> rows=160000
>> 1710ms COPY test.tab FROM '/tmp/test.dat'
>> 1709ms COPY test.tab FROM '/tmp/test.dat'
>> rows=200000
>> 2189ms COPY test.tab FROM '/tmp/test.dat'
>> 2206ms COPY test.tab FROM '/tmp/test.dat'
>
> I'm not so sure that this is a great idea. Generally, we tend to
> discourage GUCs that control behavior at the SQL level. Are you 100%
> certain that there is no path to optimizing this case without changing
> behavior?
>
> merlin