From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Tels <nospam-pg-abuse(at)bloodgate(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel Aggregates for string_agg and array_agg |
Date: | 2018-04-05 20:46:26 |
Message-ID: | fdbf52dc-2e80-f5bc-5d43-b66a2deba021@2ndquadrant.com |
Lists: | pgsql-hackers |
On 04/05/2018 09:10 PM, Tels wrote:
> Moin,
>
> On Wed, April 4, 2018 11:41 pm, David Rowley wrote:
>> Hi Tomas,
>>
>> Thanks for taking another look.
>>
>> On 5 April 2018 at 07:12, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
>> wrote:
>>> Other than that, the patch seems fine to me, and it's already marked as
>>> RFC so I'll leave it at that.
>>
>> Thanks.
>
> I have one more comment - sorry for not writing sooner, the flu got to me ...
>
> Somewhere in the code there is a new allocation of memory when the string
> grows beyond the current size - and that doubles the size. This can lead
> to a lot of wasted space (think: constructing a string that is a bit over
> 1 GByte, which would presumably allocate 2 GByte).
>
I don't think we support memory chunks above 1GB, so that's likely going
to fail anyway. See

    #define MaxAllocSize ((Size) 0x3fffffff) /* 1 gigabyte - 1 */
    #define AllocSizeIsValid(size) ((Size) (size) <= MaxAllocSize)

I get your point, though - we may be wasting space here. But that's
hardly something this patch should mess with - that's a more generic
allocation question.
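
To make the interaction between doubling and the 1GB limit concrete, here is a minimal standalone sketch. It is not PostgreSQL code: next_capacity() is a hypothetical helper, size_t stands in for PostgreSQL's Size typedef, and clamping the result to MaxAllocSize is an assumption about how a caller might want to behave.

```c
#include <assert.h>
#include <stddef.h>

/* Same values as the definitions quoted above, with size_t replacing Size. */
#define MaxAllocSize ((size_t) 0x3fffffff) /* 1 gigabyte - 1 */
#define AllocSizeIsValid(size) ((size_t) (size) <= MaxAllocSize)

/*
 * Hypothetical helper: choose the next buffer capacity by doubling until
 * "needed" bytes fit, clamped to MaxAllocSize.  Returns 0 when the request
 * itself exceeds the limit, i.e. the allocation would fail regardless of
 * the growth factor.
 */
static size_t
next_capacity(size_t current, size_t needed)
{
    size_t newcap = current;

    assert(current > 0);

    if (!AllocSizeIsValid(needed))
        return 0;               /* over the limit, cannot be satisfied */

    while (newcap < needed)
        newcap *= 2;            /* stays below 2GB, so no overflow here */

    if (newcap > MaxAllocSize)
        newcap = MaxAllocSize;  /* do not ask for more than is allowed */

    return newcap;
}
```

With these assumptions, a 512MB buffer that needs one more byte grows to MaxAllocSize rather than to a full 1GB, and a request of 1GB or more is rejected outright.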
> The same issue happens when each worker allocates 512 MByte for a 256
> MByte + 1 result.
>
> IMHO a factor of like 1.4 or 1.2 would work better here - not sure what
> the current standard in situations like this in PG is.
>
With a 2x scale factor, we only waste 25% of the allocated space on
average. Consider that you're growing because you've reached the current
size, and you double the size - say, from 1MB to 2MB. The 1MB of wasted
space is the worst case - in reality we'll use something between 1MB and
2MB, so 1.5MB on average. At that point we've wasted just 0.5MB of the
2MB allocated, i.e. 25%.

That sounds perfectly reasonable to me. A lower factor would be more
expensive in terms of repalloc calls, for example.
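
The trade-off can be checked with a small simulation. This is only a sketch under stated assumptions: grow_to() and average_waste_fraction() are hypothetical helpers, final sizes are assumed to be spread evenly over the sampled range, and each capacity increase is counted as one reallocation.

```c
#include <assert.h>
#include <stddef.h>

/* Result of growing a buffer: final capacity and number of reallocations. */
typedef struct GrowthStats
{
    size_t capacity;
    int    reallocs;
} GrowthStats;

/* Grow from "initial" by "factor" until "needed" bytes fit. */
static GrowthStats
grow_to(size_t initial, double factor, size_t needed)
{
    GrowthStats st = { initial, 0 };

    assert(initial > 0 && factor > 1.0);

    while (st.capacity < needed)
    {
        size_t next = (size_t) ((double) st.capacity * factor);

        if (next <= st.capacity)
            next = st.capacity + 1;     /* guarantee progress for tiny sizes */
        st.capacity = next;
        st.reallocs++;
    }
    return st;
}

/*
 * Average fraction of the allocation left unused, over final sizes
 * lo, lo + step, ... up to hi (assumed equally likely).
 */
static double
average_waste_fraction(size_t initial, double factor,
                       size_t lo, size_t hi, size_t step)
{
    double total = 0.0;
    size_t n = 0;

    assert(step > 0);

    for (size_t need = lo; need <= hi; need += step)
    {
        GrowthStats st = grow_to(initial, factor, need);

        total += (double) (st.capacity - need) / (double) st.capacity;
        n++;
    }
    return n > 0 ? total / (double) n : 0.0;
}
```

Starting from 1kB with a factor of 2.0, reaching a size just over 1MB takes 11 reallocations, and sampling final sizes between 1MB and 2MB gives an average waste of about a quarter of the allocation. Running grow_to() with a factor of 1.2 for the same target needs considerably more reallocations, which is the extra repalloc cost mentioned above.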
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services