| From: | wenhui qiu <qiuwenhuifx(at)gmail(dot)com> |
|---|---|
| To: | Richard Guo <guofenglinux(at)gmail(dot)com> |
| Cc: | Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Propagate stadistinct through GROUP BY/DISTINCT in subqueries and CTEs |
| Date: | 2026-04-14 07:58:51 |
| Message-ID: | CAGjGUALqgfapfAy+i7mYSq9+dOhg_4sf6C4bUhbM1+-uQqyoPw@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Richard
> I don't think your proposed change would work. vardata->rel is the
> CTE/subquery scan rel in the outer query, and its tuples count is the
> CTE's output row count, not the base table's. Using it would be
> equivalent to not converting at all, since get_variable_numdistinct()
> already computes -stadistinct * vardata->rel->tuples. What we need
> here is the base table's rel in the subroot, which gives us the
> correct rowcount for interpreting the negative fraction.
Thank you for your explanation. The path LGTM
Thanks
On Tue, Apr 14, 2026 at 2:11 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> On Mon, Apr 13, 2026 at 12:27 PM wenhui qiu <qiuwenhuifx(at)gmail(dot)com> wrote:
>
> > Thanks so much for working on this! While looking at the negative
> stadistinct conversion, I was wondering if we might run into a potential
> edge case with multi-level nested subqueries. What do you think?
> >
> > /* Convert negative stadistinct to absolute count */
> >
> > if (stats->stadistinct < 0)
> > {
> > - RelOptInfo *baserel = find_base_rel(subroot, var->varno);
> > + RelOptInfo *baserel = vardata->rel;
> >
> > - if (baserel->tuples > 0)
> > + if (baserel && baserel->tuples > 0)
> > {
> > stats->stadistinct = (float4)
> > clamp_row_est(-stats->stadistinct * baserel->tuples);
> > }
> > }
>
> I don't think your proposed change would work. vardata->rel is the
> CTE/subquery scan rel in the outer query, and its tuples count is the
> CTE's output row count, not the base table's. Using it would be
> equivalent to not converting at all, since get_variable_numdistinct()
> already computes -stadistinct * vardata->rel->tuples. What we need
> here is the base table's rel in the subroot, which gives us the
> correct rowcount for interpreting the negative fraction.
>
> - Richard
>
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Peter Eisentraut | 2026-04-14 08:08:22 | meson html:alias vs. html:custom |
| Previous Message | Amit Kapila | 2026-04-14 07:51:55 | Re: Support EXCEPT for TABLES IN SCHEMA publications |