From: Paul Guo <pguo(at)pivotal(dot)io>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org, Taylor Vesely <tvesely(at)pivotal(dot)io>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: Batch insert in CTAS/MatView code
Date: 2019-03-10 13:55:35
Message-ID: CAEET0ZEnwWpx98qfnf5w12_AxnRdCnWH09rFprx8bc-qc9-UAA@mail.gmail.com
Lists: pgsql-hackers
Sorry for the late reply.

To Michael: thank you. I know this commitfest is ongoing and I'm not
targeting it with this patch.
On Thu, Mar 7, 2019 at 4:54 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> On 06/03/2019 22:06, Paul Guo wrote:
> > The patch also modifies heap_multi_insert() a bit to do a bit further
> > code-level optimization by using static memory, instead of using memory
> > context and dynamic allocation.
>
> If toasting is required, heap_prepare_insert() creates a palloc'd tuple.
> That is still leaked to the current memory context.
>
>
Thanks. I checked the code for that, but apparently I missed that one. I'll
see what the proper memory context for CTAS would be. For the COPY code,
maybe I'll just revert my change.
> Leaking into the current memory context is not a bad thing, because
> resetting a memory context is faster than doing a lot of pfree() calls.
> The callers just need to be prepared for that, and use a short-lived
> memory context.
>
> > By the way, while looking at the code, I noticed that there are 9 local
> > arrays with large length in toast_insert_or_update() which seems to be a
> > risk of stack overflow. Maybe we should put it as static or global.
>
> Hmm. We currently reserve 512 kB between the kernel's limit, and the
> limit we check in check_stack_depth(). See STACK_DEPTH_SLOP. Those
> arrays add up to 52800 bytes on a 64-bit machine, if I did my math
> right. So there's still a lot of headroom. I agree that it nevertheless
> seems a bit excessive, though.
>
I was worried about recursive calls into it, but there is probably no such
concern for toast_insert_or_update().
> > With the patch,
> >
> > Time: 4728.142 ms (00:04.728)
> > Time: 14203.983 ms (00:14.204)
> > Time: 1008.669 ms (00:01.009)
> >
> > Baseline,
> > Time: 11096.146 ms (00:11.096)
> > Time: 13106.741 ms (00:13.107)
> > Time: 1100.174 ms (00:01.100)
>
> Nice speedup!
>
> > While for toast and large column sizes there is a < 10% improvement,
> > for small column sizes the improvement is very good. Actually, if I
> > hardcode the batch count as 4, all test cases are better, but the
> > improvement for small column sizes is smaller than with the current
> > patch. The number 4 is quite case-specific, so I can not hardcode it
> > in the patch. Of course we could tune it further, but the current
> > value seems to be a good trade-off?
>
> Have you done any profiling, on why the multi-insert is slower with
> large tuples? In principle, I don't see why it should be slower.
>
Thanks for the suggestion. I'll explore this a bit more.
>
> - Heikki
>