Re: [HACKERS] [POC] Faster processing at Gather node

From: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [POC] Faster processing at Gather node
Date: 2017-11-14 12:31:47
Message-ID: CAOGQiiP2T4O3Z7ivJtq-cTNbKwQdugNmqOt7YE6jF=NC4_rC=g@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 10, 2017 at 8:39 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Nov 10, 2017 at 5:44 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> I am seeing the assertion failure as below on executing the above
>> mentioned Create statement:
>>
>> TRAP: FailedAssertion("!(!(tup->t_data->t_infomask & 0x0008))", File:
>> "heapam.c", Line: 2634)
>> server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>
> OK, I see it now. Not sure why I couldn't reproduce this before.
>
> I think the problem is not actually with the code that I just wrote.
> What I'm seeing is that the slot descriptor's tdhasoid value is false
> for both the funnel slot and the result slot; therefore, we conclude
> that no projection is needed to remove the OIDs. That seems to make
> sense: if the funnel slot doesn't have OIDs and the result slot
> doesn't have OIDs either, then we don't need to remove them.
> Unfortunately, even though the funnel slot descriptor is marked
> tdhasoid = false, the tuples being stored there actually do have
> OIDs. And that is because they are coming from the underlying
> sequential scan, which *also* has OIDs despite the fact that tdhasoid
> for its slot is false.
>
> This had me really confused until I realized that there are two
> processes involved. The problem is that we don't pass eflags down to
> the child process -- so in the user backend, everybody agrees that
> there shouldn't be OIDs anywhere, because EXEC_FLAG_WITHOUT_OIDS is
> set. In the parallel worker, however, it's not set, so the worker
> feels free to do whatever comes naturally, and in this test case that
> happens to be returning tuples with OIDs. Patch for this attached.
>
> I also noticed that the code that initializes the funnel slot is using
> its own PlanState rather than the outer plan's PlanState to call
> ExecContextForcesOids. I think that's formally incorrect, because the
> goal is to end up with a slot that is the same as the outer plan's
> slot. It doesn't matter because ExecContextForcesOids doesn't care
> which PlanState it gets passed, but the comments in
> ExecContextForcesOids imply that someday it might, so perhaps it's
> best to clean that up. Patch for this attached, too.
>
> And here are the other patches again, too.
>
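As a rough illustration of the two-process problem described above, here is a toy model. It is not taken from the attached patches and does not use PostgreSQL's executor API: WITHOUT_OIDS, ToyTuple, scan, slot_expects_oids and gather are all hypothetical names. It only shows that when the leader's flags are not forwarded, the worker can return tuples that disagree with the leader's slot descriptor.

```python
# Toy model only: every name here is hypothetical, not PostgreSQL code.
from dataclasses import dataclass

WITHOUT_OIDS = 0x1  # stand-in for an "omit OIDs" executor flag


@dataclass
class ToyTuple:
    has_oid: bool


def scan(eflags: int) -> ToyTuple:
    """Return a tuple; it carries an OID unless the flag asks to omit it."""
    return ToyTuple(has_oid=not (eflags & WITHOUT_OIDS))


def slot_expects_oids(eflags: int) -> bool:
    """What the leader's slot descriptor claims about incoming tuples."""
    return not (eflags & WITHOUT_OIDS)


def gather(leader_eflags: int, worker_eflags: int) -> bool:
    """Return True when the worker's tuple matches the leader's descriptor."""
    tup = scan(worker_eflags)
    return tup.has_oid == slot_expects_oids(leader_eflags)


leader_eflags = WITHOUT_OIDS

# Flags not forwarded: the worker starts with 0 and returns a tuple with an OID.
mismatch_ok = gather(leader_eflags, worker_eflags=0)

# Flags forwarded: the worker makes the same decision as the leader.
forwarded_ok = gather(leader_eflags, worker_eflags=leader_eflags)

print(mismatch_ok, forwarded_ok)
```

In this sketch, forwarding the leader's flags to the worker is what removes the disagreement, which is the idea behind the fix described in the quoted message.
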
I tested these patches on the TPC-H benchmark queries; here are the details.
Setup:
commit: 42de8a0255c2509bf179205e94b9d65f9d6f3cf9
TPC-H scale factor = 20
work_mem = 1GB
max_parallel_workers_per_gather = 4
random_page_cost = seq_page_cost = 0.1

Results:
Case 1: patches applied = skip-project-gather_v1 +
shm-mq-reduce-receiver-latch-set-v1 + shm-mq-less-spinlocks-v2 +
remove-memory-leak-protection-v1
No change in execution time for any of the 22 queries.

Case 2: patches applied as in case 1, plus PARALLEL_TUPLE_QUEUE_SIZE
increased to:
a) 655360: no significant change in performance for any query
b) 65536 * 50: Q12 improved from 20s to 11s
c) 6553600: Q12 improved from 20s to 7s
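
To put the three Case 2 settings on a common scale, the following sketch expresses each queue size as a multiple of the unpatched value and converts the Q12 timings into speedup factors. It assumes the unpatched PARALLEL_TUPLE_QUEUE_SIZE is 65536 bytes (suggested by the "65536 * 50" setting, but not stated explicitly here); the helper name summarize is made up for this example.

```python
# Assumption: the unpatched PARALLEL_TUPLE_QUEUE_SIZE is 65536 bytes.
BASE_QUEUE_SIZE = 65536
BASELINE_Q12_SECONDS = 20  # Q12 time before enlarging the queue

# case -> (queue size in bytes, reported Q12 time in seconds);
# None means no significant change was reported.
cases = {
    "a": (655360, None),
    "b": (65536 * 50, 11),
    "c": (6553600, 7),
}


def summarize(size, seconds):
    """Return (queue size as a multiple of the base, Q12 speedup or None)."""
    multiple = size // BASE_QUEUE_SIZE
    speedup = None if seconds is None else round(BASELINE_Q12_SECONDS / seconds, 2)
    return multiple, speedup


summary = {name: summarize(*values) for name, values in cases.items()}

for name, (multiple, speedup) in summary.items():
    effect = "no significant change" if speedup is None else f"{speedup}x faster"
    print(f"{name}) {multiple}x queue size: {effect}")
```

Under that assumption the three settings are 10x, 50x and 100x the base queue size, and the Q12 speedups for b) and c) follow directly from the timings above.
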

Case 3: patch applied = faster_gather_v3 as posted at [1]
Q12 improved from 20s to 8s

The attached file contains the EXPLAIN ANALYZE output for all of the
cases above.
Next, I will analyse the effect of these patches on Gather
performance in other cases.

[1] https://www.postgresql.org/message-id/CAOGQiiMOWJwfaegpERkvv3t6tY2CBdnhWHWi1iCfuMsCC98a4g%40mail.gmail.com
--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

Attachment               Content-Type     Size
gather_speedup_test.zip  application/zip  4.9 KB
