Re: planner chooses incremental but not the best one

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Nicolas Lutic <n(dot)lutic(at)loxodata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: planner chooses incremental but not the best one
Date: 2023-12-18 10:40:06
Message-ID: CAMbWs49jrLzMMoR_WcE+rs1Cw23CsFUUzuUZhf2JAK0dn2S=SQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 18, 2023 at 7:31 AM Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
wrote:

> Oh! Now I see what you meant by using the new formula in 84f9a35e3
> depending on how we sum tuples. I agree that seems like the right thing.
>
> I'm not sure it'll actually help with the issue, though - if I apply the
> patch, the plan does not actually change (and the cost changes just a
> little bit).
>
> I looked at this only very briefly, but I believe it's due to the
> assumption of independence I mentioned earlier - we end up using the new
> formula introduced in 84f9a35e3, but it assumes it assumes the
> selectivity and number of groups are independent. But that'd not the
> case here, because the groups are very clearly correlated (with the
> condition on "b").

You're right. The patch allows us to adjust the estimate of distinct
values for appendrels using the new formula introduced in 84f9a35e3.
But if the restrictions are correlated with the grouping expressions,
the new formula does not behave well. That's why the patch does not
help in case [1], where 'b' and 'c' are correlated.

OTOH, if the restrictions are not correlated with the grouping
expressions, the new formula would perform quite well. And in this case
the patch would help a lot, as shown in [2] where estimate_num_groups()
gives a much more accurate estimate with the help of this patch.

So this patch could be useful in certain situations. I'm wondering if
we should at least have this patch (if it is right).

> If that's the case, I'm not sure how to fix this :-(

The commit message of 84f9a35e3 says

This could possibly be improved upon in the future by identifying
correlated restrictions and using a hybrid of the old and new
formulae.

Maybe this is something we can consider trying. But anyhow this is not
an easy task I suppose.

[1]
https://www.postgresql.org/message-id/CAMbWs4-FtsJ0dQUv6g%3D%3DXR_gsq%3DFj9oiydW6gbqwQ_wrbU0osw%40mail.gmail.com
[2]
https://www.postgresql.org/message-id/CAMbWs4-ocromEKMtVDH3RBMuAJQaQDK0qi4k6zOuvpOnMWZauw%40mail.gmail.com

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2023-12-18 10:49:33 Re: Remove MSVC scripts from the tree
Previous Message Drouvot, Bertrand 2023-12-18 10:36:25 Re: Move walreceiver state assignment (to WALRCV_STREAMING) in WalReceiverMain()