From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: POC: GROUP BY optimization
Date: 2018-06-29 21:19:13
Message-ID: 0b3ee377-2f67-f58d-6177-bc2ed4626499@archidevsys.co.nz
Lists: pgsql-hackers
On 30/06/18 03:03, Tomas Vondra wrote:
> On 06/29/2018 04:51 PM, Teodor Sigaev wrote:
>>
>>>> I tried to attack the cost_sort() issues and hope on that basis we
>>>> can solve problems with 0002 patch and improve incremental sort patch.
>>>>
>>>
>>> OK, will do. Thanks for working on this!
>>
>> I hope we now have a better cost_sort(). The obvious way is to try
>> all combinations of pathkeys in get_cheapest_group_keys_order() and
>> choose the cheapest one by cost_sort().
>
>> But that requires N! operations, could potentially be very expensive
>> for a large number of pathkeys, and doesn't solve the issue with
>> user-knows-what-he-does pathkeys.
>
> Not sure. There are N! combinations, but this seems like a good
> candidate for backtracking [1]. You don't have to enumerate and
> evaluate all N! combinations; just construct one, then abandon whole
> classes of combinations as soon as they get more expensive than the
> current best one. That works thanks to the additive nature of the
> comparison costing: appending a column to the sort key can only make
> it more expensive. My guess is that this will make it a non-issue.
>
> [1] https://en.wikipedia.org/wiki/Backtracking
>
>>
>> We could suggest an order of pathkeys as the patch does now, and if
>> the cost estimated by cost_sort() is less than 80% (arbitrarily
>> chosen) of the cost of the user-suggested pathkeys, use ours;
>> otherwise use the user's pathkeys.
>>
>
> I really despise such arbitrary thresholds. I'd much rather use a
> more reliable heuristic by default, even if it gets things wrong in
> some cases (which it will, but that's natural).
>
> regards
>
Additionally, how about putting an upper limit on the number of
combinations to check, fairly large by default?

If that limit is exceeded, we could consider checking a few more
orderings selected at random from those not yet checked, to avoid any
bias caused by stopping a systematic search early. This might prove
important when N! is fairly large.
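Roughly what I have in mind, combined with Tomas' backtracking
suggestion, could be sketched as follows (Python for brevity rather
than PostgreSQL C; cheapest_order and col_cost are hypothetical names,
with col_cost standing in for the incremental cost_sort() contribution
of appending one column, assumed non-negative so pruning is safe):

```python
def cheapest_order(columns, col_cost, max_evals=1000):
    """Backtracking search over column orderings with cost pruning.

    col_cost(prefix, col): incremental cost of appending `col` after
    `prefix`. Assumed non-negative, so a partial ordering's cost can
    only grow as columns are appended -- this is what makes pruning
    whole classes of orderings safe. max_evals caps how many complete
    orderings are evaluated before the search gives up.
    """
    best_cost, best_order = float("inf"), None
    evals = 0

    def recurse(prefix, remaining, cost):
        nonlocal best_cost, best_order, evals
        # Abandon this whole class of orderings: any completion of this
        # prefix costs at least `cost`, which is already no better than
        # the current best. Also stop once the evaluation cap is hit.
        if cost >= best_cost or evals >= max_evals:
            return
        if not remaining:
            evals += 1
            best_cost, best_order = cost, tuple(prefix)
            return
        for col in list(remaining):
            inc = col_cost(prefix, col)
            remaining.discard(col)
            prefix.append(col)
            recurse(prefix, remaining, cost + inc)
            prefix.pop()
            remaining.add(col)

    recurse([], set(columns), 0.0)
    return best_cost, best_order


# Toy example: hypothetical per-column base costs, weighted by position,
# so putting expensive columns first is cheapest overall.
base = {"a": 3, "b": 1, "c": 2}
cost, order = cheapest_order(["a", "b", "c"],
                             lambda prefix, col: base[col] * (len(prefix) + 1))
# order == ("a", "c", "b"), cost == 10.0
```

With a purely additive cost model the pruning alone should cut the
search well below N!; the cap, and random sampling of the unchecked
orderings, would only come into play when N is genuinely large.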
Cheers,
Gavin