Re: [HACKERS] [POC] Faster processing at Gather node

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [POC] Faster processing at Gather node
Date: 2017-11-26 08:15:32
Message-ID: CAA4eK1Jox0+2_9=4N6i4dSkgFvUBg-W-nMmOpGnUe01WENfHXw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Nov 25, 2017 at 9:13 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Nov 22, 2017 at 8:36 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> remove-memory-leak-protection-v1.patch removes the memory leak
>>> protection that Tom installed upon discovering that the original
>>> version of tqueue.c leaked memory like crazy. I think that it
>>> shouldn't do that any more, courtesy of
>>> 6b65a7fe62e129d5c2b85cd74d6a91d8f7564608. Assuming that's correct, we
>>> can avoid a whole lot of tuple copying in Gather Merge and a much more
>>> modest amount of overhead in Gather. Since my test case exercised
>>> Gather Merge, this bought ~400 ms or so.
>>
>> I think Tom didn't installed memory protection in nodeGatherMerge.c
>> related to an additional copy of tuple. I could see it is present in
>> the original commit of Gather Merge
>> (355d3993c53ed62c5b53d020648e4fbcfbf5f155). Tom just avoided applying
>> heap_copytuple to a null tuple in his commit
>> 04e9678614ec64ad9043174ac99a25b1dc45233a. I am not sure whether you
>> are just referring to the protection Tom added in nodeGather.c, If
>> so, it is not clear from your mail.
>
> Yes, that's what I mean. What got done for Gather Merge was motivated
> by what Tom did for Gather. Sorry for not expressing the idea more
> precisely.
>
>> I think we don't need the additional copy of tuple in
>> nodeGatherMerge.c and your patch seem to be doing the right thing.
>> However, after your changes, it looks slightly odd that
>> gather_merge_clear_tuples() is explicitly calling heap_freetuple for
>> the tuples allocated by tqueue.c, maybe we can add some comment. It
>> was much clear before this patch as nodeGatherMerge.c was explicitly
>> copying the tuples that it is freeing.
>
> OK, I can add a comment about that.
>

Sure, I think apart from that the patch looks good to me. I think a
good test of this patch could be to try to pass many tuples via gather
merge and see if there is any leak in memory. Do you have any other
ideas?

>> Isn't it better to explicitly call gather_merge_clear_tuples in
>> ExecEndGatherMerge so that we can free the memory for tuples allocated
>> in this node rather than relying on reset/free of MemoryContext in
>> which those tuples are allocated?
>
> Generally relying on reset/free of a MemoryContext is cheaper.
> Typically you only want to free manually if the freeing would
> otherwise not happen soon enough.
>

Yeah and I think something like that can happen after your patch
because now the memory for tuples returned via TupleQueueReaderNext
will be allocated in ExecutorState and that can last for long. I
think it is better to free memory, but we can leave it as well if you
don't feel it important. In any case, I have written a patch, see if
you think it makes sense.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
release_memory_at_gather_merge_shutdown_v1.patch application/octet-stream 964 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-11-26 10:12:09 Re: [HACKERS] More stats about skipped vacuums
Previous Message Michael Paquier 2017-11-26 07:40:16 Re: Code cleanup patch submission for extended_stats.c