Re: Should the function get_variable_numdistinct consider the case when stanullfrac is 1.0?

From: Zhenghua Lyu <zlyu(at)vmware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Should the function get_variable_numdistinct consider the case when stanullfrac is 1.0?
Date: 2020-10-26 15:01:41
Message-ID: SN6PR05MB4559AB0F8E1F0902F70C62EFB5190@SN6PR05MB4559.namprd05.prod.outlook.com
Lists: pgsql-hackers

Hi,
When grouping by multiple columns, the planner multiplies the columns' distinct-value estimates together.
A column that is entirely NULL still contributes 200 to that product, and if the product exceeds the
relation size, it is clamped to the relation size.

So the estimated size of the agg rel is not correct, which skews the cost estimates of the upper paths
and does not give a good plan.
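
The arithmetic described above can be sketched in a few lines. This is a simplified model, not PostgreSQL's actual code: the helper names `numdistinct` and `estimate_groups` are hypothetical, the table size and column statistics are made up for illustration, and the real `estimate_num_groups` applies further heuristics. It assumes that an all-NULL column is stored with `stadistinct = 0` (unknown), so the estimator falls back to the default of 200.

```python
DEFAULT_NUM_DISTINCT = 200  # fallback used when the distinct count is unknown


def numdistinct(stadistinct, ntuples):
    """Simplified per-column distinct-value estimate (hypothetical helper)."""
    if stadistinct > 0:
        return stadistinct              # absolute number of distinct values
    if stadistinct < 0:
        return -stadistinct * ntuples   # negative means a fraction of the rows
    # stadistinct == 0: unknown, which is assumed here for an all-NULL column
    return min(DEFAULT_NUM_DISTINCT, ntuples)


def estimate_groups(stadistincts, ntuples):
    """Multiply the per-column estimates, then clamp to the relation size."""
    product = 1.0
    for d in stadistincts:
        product *= numdistinct(d, ntuples)
    return min(product, ntuples)


ntuples = 1_000_000  # illustrative table size

# GROUP BY a: a has 50 distinct values
one_col = estimate_groups([50], ntuples)

# GROUP BY a, b: b is entirely NULL, so it really adds no new groups,
# but the model multiplies the estimate by 200
with_null_col = estimate_groups([50, 0], ntuples)

# GROUP BY c, b: c has 5000 distinct values; the product hits the clamp
clamped = estimate_groups([5000, 0], ntuples)

print(one_col, with_null_col, clamped)
```

Under these assumptions, adding the all-NULL column inflates the group estimate by a factor of 200, although the true number of groups is unchanged.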

I found this issue while debugging some other queries. I am not sure whether it is the root cause of my
problem, so I am just opening a thread here for discussion.
________________________________
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sent: Monday, October 26, 2020 10:37 PM
To: Zhenghua Lyu <zlyu(at)vmware(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Should the function get_variable_numdistinct consider the case when stanullfrac is 1.0?

Zhenghua Lyu <zlyu(at)vmware(dot)com> writes:
> It seems the function `get_variable_numdistinct` ignores the case when stanullfrac is 1.0:

I don't like this patch at all. What's the argument for having a special
case for this value? When would we ever get exactly 1.0 in practice?

regards, tom lane
