Re: Incorrect cost for MergeAppend

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Alexander Kuzmenkov <akuzmenkov(at)timescale(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Incorrect cost for MergeAppend
Date: 2024-01-30 07:20:29
Message-ID: CAExHW5uv2GtZNSRzwdRrg0UEvB0pYet7YWASEE7TOP6rP1TqmQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 29, 2024 at 6:11 PM Alexander Kuzmenkov
<akuzmenkov(at)timescale(dot)com> wrote:
>
> Hello hackers,
>
> While investigating some query plans, I noticed some code that seems
> to be wrong: when create_merge_append_path() estimates the cost of
> sorting an input, it calls cost_sort() passing subpath->parent->tuples
> as the number of tuples. Shouldn't it use subpath->parent->rows or
> even subpath->rows instead? The `tuples` variable doesn't account for
> the filters on the relation, so this leads to incorrect cost estimates
> when there are highly selective filters, and Sort + Append is chosen
> instead of MergeAppend.

All other callers of cost_sort() except plan_cluster_use_sort() pass
rows rather than tuples. Even plan_cluster_use_sort() sets
rel->rows = rel->tuples, so it is effectively passing rows as well.
So I agree with your suggestion. However, a test would be good, since
this code is quite old.
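
To give a feel for the size of the error, here is a rough sketch in
Python. It is a toy model, not the actual cost_sort() code: it assumes
an in-memory sort costing about N * log2(N) comparisons plus a
per-tuple charge, and the relation size and filter selectivity are
made-up numbers. Only the cpu_operator_cost default of 0.0025 is taken
from PostgreSQL.

```python
import math

CPU_OPERATOR_COST = 0.0025  # PostgreSQL's default cpu_operator_cost


def toy_sort_cost(ntuples: float) -> float:
    """Toy in-memory sort cost: N * log2(N) comparisons plus a per-tuple charge.

    This is a simplified stand-in for illustration, not cost_sort() itself.
    """
    n = max(ntuples, 2.0)  # keep log2(n) well-defined and positive
    comparison_cost = 2.0 * CPU_OPERATOR_COST
    startup = comparison_cost * n * math.log2(n)
    run = CPU_OPERATOR_COST * n
    return startup + run


# Hypothetical child relation: one million tuples, filter keeps 0.1% of them.
rel_tuples = 1_000_000
selectivity = 0.001
rel_rows = rel_tuples * selectivity  # rows that survive the filter

cost_with_tuples = toy_sort_cost(rel_tuples)  # ignores the filter
cost_with_rows = toy_sort_cost(rel_rows)      # accounts for the filter

print(f"sort cost using tuples: {cost_with_tuples:.2f}")
print(f"sort cost using rows:   {cost_with_rows:.2f}")
```

Under these assumptions, the estimate based on tuples comes out several
orders of magnitude higher than the one based on rows, which would be
enough to push the planner away from MergeAppend.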

--
Best Wishes,
Ashutosh Bapat
