Incorrect cost for MergeAppend

From: Alexander Kuzmenkov <akuzmenkov(at)timescale(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Incorrect cost for MergeAppend
Date: 2024-01-29 12:40:41
Message-ID: CALzhyqyhoXQDR-Usd_0HeWk=uqNLzoVeT8KhRoo=pV_KzgO3QQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello hackers,

While investigating some query plans, I noticed some code that seems
to be wrong: when create_merge_append_path() estimates the cost of
sorting an input, it calls cost_sort() passing subpath->parent->tuples
as the number of tuples. Shouldn't it use subpath->parent->rows or
even subpath->rows instead? The `tuples` variable doesn't account for
the filters on the relation, so this leads to incorrect cost estimates
when there are highly selective filters, and Sort + Append is chosen
instead of MergeAppend.

I don't have a good repro yet because the plans I've been studying
involve a third-party extension, but the code looks incorrect so I
thought I should ask.

Here is a link to this code on GitHub:
https://github.com/postgres/postgres/blob/6a1ea02c491d16474a6214603dce40b5b122d4d1/src/backend/optimizer/util/pathnode.c#L1469-L1477

---
Alexander Kuzmenkov
Timescale

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2024-01-29 12:42:32 Re: Functions to return random numbers in a given range
Previous Message Dean Rasheed 2024-01-29 12:38:11 Re: Functions to return random numbers in a given range