Re: Improve UNION's output rowcount estimate

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improve UNION's output rowcount estimate
Date: 2026-06-22 00:39:10
Message-ID: CAApHDvpFFne1ze-h=7fHsg9Tyo4ER9TTX_XU7_Ps-Ti68D7EsA@mail.gmail.com
Lists: pgsql-hackers

On Sat, 20 Jun 2026 at 14:21, Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> I noticed that UNION's output rowcount estimate can be very wrong, as
> the planner ignores the duplicate removal and just uses the total
> input size.

I believe this should make the following code redundant, so shouldn't
the patch remove it too?

    /*
     * Estimate the number of UNION output rows. In the case when only a
     * single UNION child remains, we can use estimate_num_groups() on
     * that child. We must be careful not to do this when that child is
     * the result of some other set operation as the targetlist will
     * contain Vars with varno==0, which estimate_num_groups() wouldn't
     * like.
     */
    if (list_length(cheapest.subpaths) == 1 &&
        first_path->parent->reloptkind != RELOPT_UPPER_REL)
    {
        dNumGroups = estimate_num_groups(root,
                                         first_path->pathtarget->exprs,
                                         first_path->rows,
                                         NULL,
                                         NULL);
    }

Then you may as well pass dNumChildGroups directly to the path
creation functions and get rid of your new "With multiple children,"
comment.
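
To illustrate why the single-child special case would become redundant, here is a minimal standalone sketch. It is not the patch and not PostgreSQL code: the helper name union_rows_estimate and its array parameters are hypothetical, and it assumes the estimate is formed by summing per-child distinct-row estimates, each clamped to that child's row count. Under that assumption, a single child simply yields that child's own group estimate, so no separate branch is needed.

```c
#include <stddef.h>

/*
 * Hypothetical helper (an assumption for illustration, not PostgreSQL code):
 * estimate UNION output rows as the sum of per-child distinct-row estimates.
 * Each child's estimate is clamped so it never exceeds that child's rows.
 */
static double
union_rows_estimate(const double *child_rows,
                    const double *child_groups,
                    size_t nchildren)
{
    double      total = 0.0;

    for (size_t i = 0; i < nchildren; i++)
    {
        double      groups = child_groups[i];

        /* A child cannot contribute more distinct rows than it has rows. */
        if (groups > child_rows[i])
            groups = child_rows[i];

        total += groups;
    }

    return total;
}
```

With one child the loop runs once and returns that child's clamped estimate, which is the same value the special-cased branch would have produced. The real patch may combine or clamp the per-child numbers differently.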

Aside from that, I don't see any issues.

David